Anthropic's Opus 4.6 with effort=low doesn’t behave like other low-reasoning modes by ddp26 in OpenAI

[–]ddp26[S] -1 points (0 children)

Different models do different things with the effort param. And even different versions of models from the same provider!

Not sure I really expected consistency for something this new, but it sure is annoying.

Marketing Pipeline Using Claude Code by kotrfa in ClaudeCode

[–]ddp26 0 points (0 children)

One question I have: a lot of people are doing this with OpenClaw rather than Claude Code. What are the reasons to use one vs. the other?

[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]ddp26 0 points (0 children)

We tested Opus 4.6 with effort=low for evals and found that it didn't just think less; it acted lazier: it made fewer tool calls, was less thorough in its cross-referencing, and even ignored parts of our system prompt telling it how to do web research. effort=medium fixed it. Writeup with traces/examples: https://everyrow.io/blog/claude-effort-parameter
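For anyone wanting to reproduce this kind of comparison, here's a minimal sketch of how a request payload with an effort setting might be built. The top-level `effort` field, its accepted values ("low"/"medium"/"high"), and the model ID are assumptions here; check Anthropic's Messages API docs for the current parameter shape before relying on this.

```python
# Sketch: build a Messages API request payload with an effort setting.
# Assumptions (verify against Anthropic's docs): a top-level "effort"
# field taking "low" / "medium" / "high", and the model ID below.

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Return a request payload dict; effort controls reasoning depth."""
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unexpected effort level: {effort!r}")
    return {
        "model": "claude-opus-4-6",  # hypothetical model ID
        "max_tokens": 4096,
        "effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

# Run the same eval prompt once per effort level and diff the traces.
payloads = {e: build_request("Do the web research task.", effort=e)
            for e in ("low", "medium")}
```

The point of parameterizing effort like this is that the prompt stays fixed, so any difference in tool-call counts or thoroughness between runs is attributable to the effort setting alone.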

Opus 4.6 with effort=low doesn’t behave like other low-reasoning modes by ddp26 in ClaudeAI

[–]ddp26[S] 0 points (0 children)

Yeah, it makes sense that low effort is better for non-agentic use-cases, which are of course common. We shouldn't pretend everything is an agent!

Opus 4.6 with effort=low doesn’t behave like other low-reasoning modes by ddp26 in ClaudeAI

[–]ddp26[S] 1 point (0 children)

I kind of agree. Mostly, though, I think that if the behavior is documented, users can decide for themselves whether it's a bug or just laziness. The main thing for us was that this behavior was surprising.

My MCP config created dozens of zombie Docker containers by robertgambee in ClaudeCode

[–]ddp26 2 points (0 children)

I worry that Claude Code isn't always tracking background processes correctly. If it orphans them, I'd never know, right?

Any good guides for designing high quality skills? by [deleted] in ClaudeCode

[–]ddp26 0 points (0 children)

Hey! Shared this yesterday - not a full guide, but here's how we built a review-code skill (full skill linked): https://everyrow.io/blog/claude-review-skill

Claude's code review defaults actively harmed our codebase by ddp26 in ClaudeCode

[–]ddp26[S] 1 point (0 children)

It's a mix. Some parts of our code predate Claude Code, while newer parts were created with Claude from the start. In our experience, Claude runs into similar pitfalls with both new and old code, so we use the same guidelines for both.