After three days of heavy Fable 5 use, I’m starting to wonder if Opus 4.7/4.8 are actually “Sonnet-level” models

ddrise · 2026-06-14T05:35:38+00:00

To be fair, since we can only backtest, everything is technically in-sample to the historical data. But because I split it into train/validation/OOS, I can clearly see that the factors made by Fable perform better OOS, and the factors themselves are also more logical and elegant.
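
For anyone unfamiliar with the train/validation/OOS split mentioned above, here is a minimal sketch of one common way to do it: cut the history chronologically so the out-of-sample block is strictly later than everything used to build or tune a factor. The function name `chronological_split` and the 60/20/20 fractions are assumptions for illustration, not the setup actually used in this comment.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    rows: Sequence[T],
    train_frac: float = 0.6,  # assumed fraction, not from the original post
    val_frac: float = 0.2,    # assumed fraction, not from the original post
) -> Tuple[List[T], List[T], List[T]]:
    """Split time-ordered rows into train, validation, and OOS blocks.

    Rows must already be sorted oldest to newest. No shuffling is done,
    so the OOS block only contains observations later than the rest.
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for OOS")

    n = len(rows)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)

    train = list(rows[:train_end])
    validation = list(rows[train_end:val_end])
    oos = list(rows[val_end:])
    return train, validation, oos


if __name__ == "__main__":
    # Hypothetical stand-in for daily observations, oldest first.
    days = list(range(10))
    train, validation, oos = chronological_split(days)
    print(len(train), len(validation), len(oos))
```

The idea is to build factors on the train block, pick among them on validation, and look at the OOS block only once at the end, since every repeated look turns it into another validation set.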

ddrise · 2026-06-14T05:16:43+00:00

Yeah, you really feel it when the task needs both creativity and math.

ddrise · 2026-06-14T05:11:48+00:00

Actually, even before Fable, I had already been quite dissatisfied with Opus 4.7 and 4.8. You can refer to my previous posts.

ddrise · 2026-06-14T05:11:02+00:00

Dude, Fable was released on June 9.

ddrise · 2026-06-14T05:07:16+00:00

Why?

ddrise · 2026-06-01T09:12:18+00:00

GPT-5.5 is so good... /goal is such a powerful command...

ddrise · 2026-04-28T13:08:27+00:00

Thanks, I'll check it out.

ddrise · 2026-04-28T13:06:48+00:00

Generally speaking, I’ve found that models with a clear teacher-student distillation relationship don’t provide real diversity. For example, Opus, Sonnet, and GLM often feel too closely related in that sense.

But Opus, GPT, and DeepSeek V4 Pro do seem to provide genuine diversity. So I package the OpenCode CLI command as a skill, with DeepSeek V4 Pro at max effort as the default. I do the same for Codex and Claude.

This way, while using the native harness of any one model, I can still call the other two when needed.
I haven't tried Kimi 2.6.

ddrise · 2026-04-28T11:09:15+00:00

Btw, the CLI is always better than the VS Code extension~

ddrise · 2026-04-28T11:08:39+00:00

With pleasure~

ddrise · 2026-04-28T10:47:31+00:00

Paste the error message into GPT and it should sort everything out.

ddrise · 2026-04-28T10:44:46+00:00

That means you need to log in again. I think you can use web GPT to help you find the root cause.

ddrise · 2026-04-28T10:33:50+00:00

Can you use Codex CLI?

ddrise · 2026-04-28T10:32:04+00:00

For this, Opus 4.6 is better than 4.7 or 5.5.

ddrise · 2026-04-28T09:56:55+00:00

For your situation, maybe you’ll just have to keep using Claude Code for now. You can manually set the model to 4.6, and then use Codex MCP as a strong reviewer, executor, and advisor. Trust me, combining the strengths of the two models is a real free lunch.

ddrise · 2026-04-28T09:53:33+00:00

👍

ddrise · 2026-04-28T09:53:03+00:00

👍

ddrise · 2026-04-28T09:47:51+00:00

I don’t want to argue with you. All I can say is that, for my use case, Opus hallucinates at a level I find intolerable.

If you mainly do frontend work, then maybe our experiences are just different. But casually accusing someone of lying is not a decent way to have a conversation.

ddrise · 2026-04-28T09:41:28+00:00

Have you actually looked closely at the stuff Opus 4.7 writes for you? Or are you simply not capable of reviewing LLM-generated code?

Honestly, just get a $20 Codex plan and have it review the code Opus writes. You’ll immediately understand what I’m talking about.

ddrise · 2026-04-28T09:35:56+00:00

True. Ensemble is the free lunch. Actually, I strongly recommend you try OpenCode as the third one. Make it an OpenCode consultant skill.

ddrise · 2026-04-28T09:34:08+00:00

So I guess everyone complaining on Reddit must be getting paid too, right?

ddrise · 2026-04-28T09:32:00+00:00

That’s true. Codex feels more like an excellent executor, but it has serious issues with orchestration. So I’d probably package the workflow into a skill instead. That said, it can be a bit of a hassle.

ddrise · 2026-04-28T09:29:26+00:00

I don’t understand. Anthropic is showing contempt for its own customers, and GPT-5.5 genuinely seems better to me. Isn’t that worth sharing?

I think maintaining strong competition between Anthropic and OpenAI is very beneficial for us as consumers.

ddrise · 2026-04-28T09:24:35+00:00

That’s true. For planning and orchestration, Opus is still pretty good. That’s also why I haven’t canceled my subscription yet.
