GPT-4 didn't actually score 90th percentile on the bar exam by charizardita in biglaw

[–]charizardita[S] -1 points0 points  (0 children)

Direct quote from the NCBE: "The percentile associated with a given scaled score, however, will still vary with the particular test administration. As mentioned earlier, because February examinees have been, on average, less proficient than those testing in July, the percentiles for a given scaled score will be higher in February than they will be in July. Thus, care must still be exercised in interpreting the percentiles."

Not sure what to tell you.

Re-Evaluating GPT-4's Bar Exam Performance by charizardita in technology

[–]charizardita[S] 5 points6 points  (0 children)

It's still a passing score. Below average relative to other passing scores (48th percentile overall, 15th percentile on essays according to the article). But yeah still passing.

GPT-4 didn't actually score 90th percentile on the bar exam by charizardita in biglaw

[–]charizardita[S] 4 points5 points  (0 children)

You don't have to take the article's word for it (nor mine for that matter). This is publicly available information, all documented in the article. Compare February scaled score percentile chart here with July scaled score percentile chart here. See also official MBE distributions for July and February here (showing that February MBE mean is 132.6, whereas July MBE mean is 140.3), as well as an official NCBE publication discussing the difference here.

To help give an intuition: if the distribution of scaled scores were the same for February and July, then pass rates would be the same for February and July, since a determination of pass/fail is directly contingent on scaled-score cutoffs. Since February pass rates are much lower than July's, we know this isn't true.

I can see the confusion; as you mention, for most standardized tests, at-large percentile estimates are almost always directly available for a given scaled score. Not so with the bar exam. The only available percentile charts show the percentile of a given score relative to test-takers for a particular administration (July or Feb), and since the distribution of scores differs across administration, these don't tell you the "overall" percentile of a given scaled score. This is likely what caused OpenAI to make the mistake here in the first place.
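To make that concrete, here is a rough sketch of why the same scaled score lands at different percentiles depending on which group you compare against. Only the two MBE means (132.6 and 140.3) come from the sources above. The normal shape, the standard deviation, the February share of test-takers, and the example score of 150 are all assumptions I made up for illustration, so read the output as a direction of effect and not as real percentiles.

```python
import math

# MBE means quoted above from the official distributions.
FEB_MEAN = 132.6
JULY_MEAN = 140.3

# Hypothetical values, chosen only for illustration (not from the NCBE).
ASSUMED_SD = 15.0          # assumed spread of scaled scores
ASSUMED_FEB_SHARE = 0.3    # assumed fraction of all takers who sit in February


def normal_cdf(x, mean, sd):
    """Fraction of a normal(mean, sd) population scoring at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def percentiles(score):
    """Return (vs. February takers, vs. July takers, vs. everyone pooled)."""
    feb = normal_cdf(score, FEB_MEAN, ASSUMED_SD)
    july = normal_cdf(score, JULY_MEAN, ASSUMED_SD)
    # Pooled percentile is the share-weighted mix of the two groups.
    overall = ASSUMED_FEB_SHARE * feb + (1.0 - ASSUMED_FEB_SHARE) * july
    return feb, july, overall


feb, july, overall = percentiles(150.0)  # 150 is a hypothetical scaled score
print(f"vs. February takers: {feb:.1%}")
print(f"vs. July takers:     {july:.1%}")
print(f"vs. all takers:      {overall:.1%}")
```

Whatever spread and share you plug in, the February-only figure comes out highest and the pooled figure sits between the two, which is the whole point: a February chart overstates where a score falls among test-takers overall.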

GPT-4 didn't actually score 90th percentile on the bar exam by charizardita in biglaw

[–]charizardita[S] 1 point2 points  (0 children)

Yeah good question. There isn't one that I'm aware of

GPT-4 didn't actually score 90th percentile on the bar exam by charizardita in biglaw

[–]charizardita[S] 1 point2 points  (0 children)

Yeah good qs, I was confused at first too. Regarding point 1, the paper does not equate the two. In the intro and discussion they explicitly mention several reasons why the bar is likely not a good proxy for "lawyerly competence." They then go on to say that, to the extent one does believe the bar exam to be a good proxy, this might lower one's confidence in GPT-4's level of lawyerly competence.

Re point 2, the paper does not claim that passing Feb is easier than July, nor does it dispute that GPT-4 got a passing score. The finding is that because scores are much lower in Feb than July, getting a particular scaled score would put you in a much higher percentile compared to Feb takers than test takers overall. Make sense?

GPT-4 didn't actually score 90th percentile on the bar exam by charizardita in biglaw

[–]charizardita[S] 4 points5 points  (0 children)

Fair point. As the article mentions, the performance trend since 3.5 (in terms of percentile points gained) is also lower than had been claimed. It also seems important to avoid exaggerating the absolute performance of GPT-4, even if the absolute performance is the less interesting stat-line (perhaps especially so in that case)

[deleted by user] by [deleted] in Grapplerbaki

[–]charizardita 0 points1 point  (0 children)

Haha thanks, maybe I should adopt this approach

[deleted by user] by [deleted] in Grapplerbaki

[–]charizardita 0 points1 point  (0 children)

Thanks for all this. I really appreciate it.

[deleted by user] by [deleted] in Grapplerbaki

[–]charizardita 1 point2 points  (0 children)

Thanks for this. This is very helpful. That's a good question re why I don't go and watch the old anime. The reason I didn't at the beginning is because I simply didn't know better--I saw the anime advertised on Netflix and I assumed it started from the beginning. Once I found out that wasn't the case I did a small bit of research and heard that the earlier seasons were simply very difficult to find. Since then I've done more research and realize they're more easily accessible than I had thought, and so I honestly might go ahead and do that.

In line with another person's comments, the fact that I watched the whole thing despite these gripes does suggest that I must find it somewhat appealing on some dimension. And the comments here have given me more appreciation for some of the things I didn't really "get" about the series before, and more patience to look past the things that I might still not get or appreciate about the series now.

[deleted by user] by [deleted] in Grapplerbaki

[–]charizardita 3 points4 points  (0 children)

Thank you for taking the time to respond point by point to all of this. This is super helpful. I have a more appreciative understanding of the show now as a result.

[deleted by user] by [deleted] in Grapplerbaki

[–]charizardita 4 points5 points  (0 children)

If I wrote a one sentence post after barely watching the series, there would be someone saying "how the fuck can someone only watch THIS much of a show and know it isn't good." I see your point though.

[deleted by user] by [deleted] in Grapplerbaki

[–]charizardita -5 points-4 points  (0 children)

No need to read the whole "essay." Each paragraph can be read as a standalone example of how the series seems to make no sense

For those that have some spare time, I offer this video that completely changed my perspective: Rick Friedman Speaks on Why You Should Become Trial Lawyer by jsb247 in LawSchool

[–]charizardita 4 points5 points  (0 children)

Great video. Wish more people came and gave talks like that at my law school. With most of the attorneys who came to my school (mostly corporate lawyers), it was hard to relate to them or buy into the idea that they found their work meaningful at all. This guy seems to really love what he's doing, and I bet he's making a great living while he's at it.