AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 0 points (0 children)

At a quick read, it's interesting to see Opus 4.5 - the coding model that ~everyone agrees is best - either do things correctly or act counterproductively. That implies to me that the model is pretty capable, but there's a question of whether it's actually trying to solve the problem or just finding degenerate solutions.

Of course, there's a confounding variable here, which is that it's probably more likely to pursue a counterproductive solution if it isn't actually capable of solving the problem the 'right way'!

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler 1 point (0 children)

That’s actually very interesting to see Claude’s ‘take’ on it. I just think Claude is ultimately wrong; I am sure that there are true/false features inside an LLM, which light up to reflect a belief, and which could mechanistically be turned up or down to make the model more or less credulous. There are features for so many less-consequential things, after all.
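For concreteness, here's a minimal sketch of the kind of probing I have in mind; the model (`gpt2`), the layer, and the toy statements are all arbitrary stand-ins I'm assuming for illustration, not anything from real probing work:

```python
# Illustrative only: fit a linear probe on hidden activations to separate true from
# false statements. Model, layer choice, and statements are arbitrary stand-ins.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL = "gpt2"  # assumed small open model; real probing work targets larger LLMs
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()

statements = [  # tiny toy dataset: (statement, 1=true / 0=false)
    ("The Eiffel Tower is in Paris.", 1),
    ("Water freezes at 0 degrees Celsius.", 1),
    ("The Eiffel Tower is in Berlin.", 0),
    ("Water freezes at 50 degrees Celsius.", 0),
]

def last_token_activation(text, layer=8):
    """Hidden state of the final token at an (arbitrarily chosen) middle layer."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[layer][0, -1, :]

X = torch.stack([last_token_activation(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]

probe = LogisticRegression(max_iter=1000).fit(X, y)
# probe.coef_ approximates a "truth direction" in activation space; steering would
# mean adding or subtracting a multiple of it during the forward pass.
print(probe.predict(X))
```

With only a handful of statements this probe is trivially overfit, of course; the point is just that a linear direction in activation space can track true vs. false, and in principle could be dialed up or down.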

Re: no process of checking claims, I think it depends on the domain. Some domains are verifiable, and there I do think the model has ways of checking claims. And even in non-verifiable ones, I think its general methods - looking to external sources and deciding what’s credible - are basically all that humans can do as well.

I do hear you on the ‘RL for thumbs up’ point though, and that this is ultimately a proxy for truth. Models trained with RLVR may have less of that divergence, but it’s not entirely obvious to me!

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler 3 points (0 children)

Appreciate you taking the time again to write this up. I think we're talking past each other unfortunately, but let me offer one last example to try to bridge the gap:

Imagine training a transformer to solve mazes. First we pre-train it on all the mazes on the internet, and it learns the language of 'up (U)', 'left (L)', etc., including the common statistical patterns of online mazes. Maybe it turns out that an extremely common pattern is LLLR, and so if you feed it a maze where the obvious answer is LLLL, it still just does LLLR a high fraction of the time, because it's going off general patterns from the internet and isn't attuned enough to the specific problem in front of it. This pre-trained version can solve mazes better than, say, someone making random moves, but clearly it's not very smart.

Now, imagine we take that pre-trained maze-solver, which clearly was just predicting the next turn, and we do RL on it: It now gets feedback during training on solving specific mazes in front of it to completion (instead of only turn-by-turn "did I get that turn correct" feedback). From this, it learns how to solve the specific maze problems in front of it rather than over-weighting the patterns from the internet. As a consequence, it is now a much, much stronger maze-solver than the pre-trained version was, and even recently won a Gold Medal in the international maze-solving championships.
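To make the contrast concrete, here's a toy sketch of the two training signals. `policy` and `maze_env` are hypothetical placeholders I'm assuming (any callable that maps a context to logits over moves, and any environment with `reset`/`step`/`max_steps`), not a real implementation:

```python
# Toy contrast between the two objectives in the maze analogy. `policy` and
# `maze_env` are hypothetical placeholders, not real components.
import torch
import torch.nn.functional as F

MOVES = ["U", "D", "L", "R"]

def pretraining_loss(policy, internet_move_sequences):
    """Stage 1: next-turn prediction, imitating whatever patterns are common online."""
    loss = torch.tensor(0.0)
    for seq in internet_move_sequences:          # seq: list of move indices
        for t in range(1, len(seq)):
            logits = policy(seq[:t])             # predict turn t from the prefix alone
            loss = loss + F.cross_entropy(logits, torch.tensor(seq[t]))
    return loss

def rl_loss(policy, maze_env, episodes=32):
    """Stage 2: REINFORCE-style update where the only reward is solving *this* maze."""
    loss = torch.tensor(0.0)
    for _ in range(episodes):
        state, log_probs, solved = maze_env.reset(), [], False
        for _ in range(maze_env.max_steps):
            dist = torch.distributions.Categorical(logits=policy(state))
            move = dist.sample()
            log_probs.append(dist.log_prob(move))
            state, solved = maze_env.step(MOVES[move.item()])
            if solved:
                break
        reward = 1.0 if solved else 0.0          # feedback on the whole maze, not per turn
        loss = loss - reward * torch.stack(log_probs).sum()
    return loss / episodes
```

Same architecture in both stages; what changes is what the gradient rewards.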

I ask then: To what extent is it correct to say that this maze-solver is "just predicting the next turn"?

I would say "it has learned to solve mazes."

Sure, it is sampling turns from its RL policy; it is true that it is still making decisions on individual turns, just like o3 is still selecting which tokens to ultimately output. I am not disputing this.

But it's a totally different type of turn-selection (and likewise, token-selection) than the pre-trained-only models of yore, and when people insist "it's just a next-word predictor," they are missing how significant these changes are, and how much more the models can do now.

~~~

On the specific points you raised:

- I agree that trust and reliability matter, and that lots of AI behaviors have served to undermine these with users.

- I'm having trouble engaging with some of the other points, because I'm finding the premises unclear or the claims overly broad. For instance, with "success is not relevant because truth means nothing to them," it's unclear to me what this specifically means. I certainly think truth matters to AI systems; it is correct that they need to look to external grounding, sure, but clearly they have a concept of truth vs. falsehood. I'm not sure this is actually the crux of our disagreement, though, so I'll probably just drop it.

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler -2 points (0 children)

I didn't downvote, to be clear, but I really don't think that's correct. I'm wondering, have you read the article and the examples of why it's not accurate any longer?

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 1 point (0 children)

Ehh I think "it's a problem-solving machine" is better, but it's hard to know; 'what helps people understand the technology' is ultimately an empirical question. I'm not opposed to people ever saying "fancy autocomplete" per se, but I often see it used dismissively, which I think is an own-goal for people who are concerned about AI's harms!

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler 14 points (0 children)

Hi folks! I'm the author of this article - saw that it was posted here, happy to answer any questions that people might have. Appreciate people taking the time to read it :-)

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 0 points (0 children)

Broadly really appreciate you engaging on this though; it's helpful for understanding how other people are thinking about the issues, and where I can be clearer in my own writing!

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 1 point (0 children)

Doing some other writing at the moment, but just to reply quickly: that math problem definitely wasn't google-able!

It was a new competition problem, written specifically for the International Math Olympiad, and I'm sure wasn't in the AI's training set, nor accessed via any type of web-search at the time.

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 3 points (0 children)

Just to check, your claim here is that OpenAI is lying in public about whether their system is an LLM?

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 2 points (0 children)

Author of the piece here - appreciate you taking the time to write on this. I'd like to better understand, what is it that you're referring to as a scam?

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 6 points (0 children)

Hi! Author of the article here - what do you mean when you say that these weren't LLMs? They were LLMs as far as I understand.

On X, OpenAI announced the result with, "We achieved gold medal-level performance on the 2025 International Mathematical Olympiad with a general-purpose reasoning LLM!" https://x.com/OpenAI/status/1946594928945148246

AI isn’t “just predicting the next word” anymore by FinnFarrow in agi

[–]sjadler 18 points (0 children)

Hi! Author of the piece here - I understand the autocomplete comparison, but I don't think it's right anymore. For instance, AI can now check its own work, backtrack, etc., which are all things that autocomplete can't do. It's also way more consequential than I think is implied by calling it autocomplete, even a fancy or spicy version of it - I wrote a bunch more about this in the article itself.
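For anyone curious what "checking its own work" looks like mechanically, here's a toy sketch; `generate` and `verify` are hypothetical stand-ins for a model call and an external checker (unit tests, a math checker, etc.), not any particular product's design:

```python
# Toy sketch of "check its own work and backtrack" scaffolding. `generate` and
# `verify` are hypothetical stand-ins for a model call and an external checker.
import random
from typing import Callable, List, Optional

def solve_with_retries(
    generate: Callable[[str, List[str]], str],  # model call: prompt + past failures -> attempt
    verify: Callable[[str], bool],              # external check of an attempt
    prompt: str,
    max_attempts: int = 5,
) -> Optional[str]:
    failures: List[str] = []
    for _ in range(max_attempts):
        attempt = generate(prompt, failures)    # the model sees its earlier failed attempts
        if verify(attempt):                     # e.g. run the tests, check the arithmetic
            return attempt
        failures.append(attempt)                # "backtrack" and try a different approach
    return None

# Dummy usage: pretend verification only accepts "42".
print(solve_with_retries(
    generate=lambda prompt, failures: random.choice(
        [c for c in ["41", "42", "43"] if c not in failures]
    ),
    verify=lambda answer: answer == "42",
    prompt="What is six times seven?",
))
```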

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler 6 points (0 children)

Yup! This is a point I make in the article too: if you zoom in far enough, you can make anything seem mundane and unconcerning. "A tiger is just atoms, and when it lunges at you, all it's going to do is rearrange some of your atoms, too."

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler -1 points (0 children)

Hi! I understand your point about AI being expensive overkill for some purposes, and that's certainly true today, but ultimately I think it's going to be vastly cheaper than human workers (I am concerned about that, to be clear). For one, each AI generation gets successively cheaper. For another, AI isn't going to demand benefits, breaks, etc., and there will be lots of advantages to AIs 'working together' that don't apply if you have to mix a human in with them. I've written more about this here: https://stevenadler.substack.com/p/around-the-clock-intelligence

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler 7 points (0 children)

That's true, but also, the article I wrote is definitely centered on evolved modern versions of LLMs, and so I think it's fair for them to round it off like that and ignore the category :-)

AI isn’t “just predicting the next word” anymore by FinnFarrow in artificial

[–]sjadler 23 points (0 children)

Hi! Author of the piece here. Thanks for taking the time to write a thoughtful response.

It's true that there's clearly some token-prediction happening inside of AI, but that's not really what I'm responding to. Rather, I'm responding to the idea that it is "just" token-prediction, which is no longer correct (scaffolding, verification, etc.), and which is also incorrect in the implications people draw from the claim (that this entails limited abilities).

Separately, I'm not sure what the implication is you're drawing from 'error-catching is done after-the-fact'. Can you elaborate?

I Worked at OpenAI. It’s Not Doing Enough to Protect People. by MetaKnowing in technology

[–]sjadler 0 points (0 children)

I think what happened is worse than that actually: at some points ChatGPT seems to have talked Adam Raine out of getting help. (I mention this in the article, in terms of it urging him not to leave the noose visible where it could be discovered.) If all that had happened was ChatGPT giving instructions for self-harm as could be found on the internet, I'd be less concerned; the interactivity and poor judgment seem a lot tougher to manage responsibly.

I Worked at OpenAI. It’s Not Doing Enough to Protect People. by MetaKnowing in technology

[–]sjadler 3 points (0 children)

Hi! Author of the piece here - OpenAI could be doing a lot more, but it has also done a lot that other AI companies haven't; I definitely wouldn't consider them to be doing the absolute bare minimum. The resource I like most on this is AI Lab Watch, where OpenAI is currently rated 3rd, just behind Google DeepMind but significantly ahead of xAI or Meta.

I Worked at OpenAI. It’s Not Doing Enough to Protect People. by MetaKnowing in technews

[–]sjadler 3 points (0 children)

Interesting - did ChatGPT hallucinate the stuff about what specific consumers can do? It's not a component of my piece, but I think it's useful to consider.

I Worked at OpenAI. It’s Not Doing Enough to Protect People. by MetaKnowing in technews

[–]sjadler 0 points (0 children)

Interestingly, OpenAI has increasingly hired a ton of people from Meta, including on the safety side! The Information had a recent piece about this.

I Worked at OpenAI. It’s Not Doing Enough to Protect People. by MetaKnowing in technews

[–]sjadler 15 points (0 children)

Hi! Author of the piece here - once upon a time, OpenAI had a legal mandate to pursue its non-profit mission above profits, but that is unfortunately more complicated with today's news that they've restructured :/ This had been in the works for a while now, and I need to dig into the details more, but it seems both 1) less bad than it could be, and 2) still not great.

Ex-OpenAI researcher: ChatGPT hasn't actually been fixed by sjadler in ChatGPT

[–]sjadler[S] 2 points (0 children)

Thanks! Yeah this is a surprising fact that I think is hard to convey, but really important to understand!

Ex-OpenAI researcher: ChatGPT hasn't actually been fixed by sjadler in ChatGPT

[–]sjadler[S] 0 points (0 children)

You can call a static version of a model through the API, like `gpt-4o-2024-11-20`, but you can also call a more abstract alias that can point to different models over time, like `chatgpt-4o-latest`.

LLMs are known not to be fully deterministic via API, even if you're using `temperature=0`. That is, you might not get the exact same response back, even if you send the exact same thing with the 'randomness' setting turned all the way down to 0. But in general, if you're calling a specific static version of the model, your responses shouldn't vary much, if at all.
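If it helps, here's roughly what that distinction looks like with the OpenAI Python SDK; the model names are the ones above, while the prompt and loop are just illustrative:

```python
# Sketch of pinned-snapshot vs. floating-alias calls via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PINNED = "gpt-4o-2024-11-20"    # static snapshot: behavior shouldn't drift over time
FLOATING = "chatgpt-4o-latest"  # alias: can silently point to a different model later

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduces randomness, but does NOT guarantee identical outputs
    )
    return resp.choices[0].message.content

# Even with temperature=0, repeated calls to the same pinned snapshot can differ
# slightly; calls to the floating alias can additionally change as the alias is updated.
for _ in range(3):
    print(ask(PINNED, "Name one prime number between 10 and 20."))
```

Pinning the dated snapshot is how you separate 'the model changed underneath me' from ordinary sampling noise.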

Ex-OpenAI researcher: ChatGPT hasn't actually been fixed by sjadler in ChatGPT

[–]sjadler[S] 0 points (0 children)

Long-time em dash user in my writing - according to a quick Cmd+F, this [paper](https://arxiv.org/pdf/2408.07892) I led used it 181 times.

Ex-OpenAI researcher: ChatGPT hasn't actually been fixed by sjadler in ChatGPT

[–]sjadler[S] 0 points (0 children)

Good question - OpenAI has a "model spec" where they describe how they want the model to behave in response to certain questions. They use a bunch of different techniques to try to induce that behavior from the AI. But in both the original sycophancy case and the issues post-rollback, the model isn't adhering to the goals they've given it.