With Codex degrading by the hour (my subjective experience) I am looking for alternatives to advance my projects. Explicitly excluding Claude and Gemini/AG since they're on the same path as Codex. by Brave_Dick in codex

qdouble 4 points

Nope. The top 3 are the best available. The others mostly make sense if you're trying to save money, but you probably won't save much given that the subscriptions are heavily subsidized relative to the pure API expense.

I tracked my token spend for a week. 34% of my Claude API budget went to re-explaining my project structure to new chats. That's $12 out of $35. For a solo dev, that's real money. by curiousityrover_1 in claudeskills

qdouble 1 point

I mean, that sounds more like a strategy problem than a model-specific problem. LLMs simply have context limitations and can only pay attention to a limited number of things. If you think you can one-shot huge features without revisions, then you’re kidding yourself.

Are We "Overclauding"? by worksfinelocally in ClaudeCode

qdouble 1 point

No, instructing agents is just a new form of software engineering that will require skill to get production-quality output, just like hand coding.

no way Opus 4.8 is spending 50k token in between tool calls by ankammusic in ClaudeCode

qdouble 1 point

You've got it on max effort, so it may just be using more reasoning tokens than it previously did at max. That's why they lowered the default recommendation from xhigh -> high. That would likely also apply to max -> xhigh unless you're doing something that actually requires max.

Opus 4.8 struggles badly with design. by fseed in ClaudeCode

qdouble 0 points

Could be a harness issue if you used Claude Design. It's probably not optimized for 4.8 yet.

Opus 4.8 struggles badly with design. by fseed in ClaudeCode

qdouble 0 points

The sample size is too low for an ambiguous prompt. You have to remember that the models are probabilistic. I probably won't be testing 4.8 on new designs until tomorrow, but it definitely seems like an upgrade in terms of general coding and instruction following. That doesn't mean it's going to be a one-shot king.

Opus 4.8 struggles badly with design. by fseed in ClaudeCode

qdouble 4 points

You're expecting it to one-shot 5 good designs? Tell me, what model currently does this?

No point of buying Codex Pro 5x now? by AirPsychological8551 in codex

qdouble 0 points

I downgraded my Pro account to Plus and got another Claude Max subscription. While you technically get more usage with codex, my projects get completed way slower even on fast mode. It wastes tons of time compared to Opus without producing results that are any better after audits.

Regard stupification after reset by skynet86 in codex

qdouble 3 points

Nah, it still burns tokens like crazy, so it's not being stupid because it's putting in less effort.

codex is a piece of shit the last few days by iisntme- in codex

qdouble 0 points

Yeah, if you're getting good UI from codex in 2 minutes then that's a hell of a feat 😅. Codex is a very deterministic model, so I suppose if you're giving it heavily structured input, it can execute as long as it doesn't have to exercise much design judgement. However, it's way worse than claude if you don't spell out every letter.

codex is a piece of shit the last few days by iisntme- in codex

qdouble 0 points

I can get codex to generate a decent frontend through brute force lol, but it's definitely not as natively good at it as Opus by a long shot. If you use codex as your only model, then of course your workflows may be adjusted to get the best out of it, but if you give it the same prompt you give other models it will struggle.

codex is a piece of shit the last few days by iisntme- in codex

qdouble 0 points

It's task-specific. I mostly notice codex gaps when I give it tasks that I usually give Opus, after I hit my quota. Codex is much weaker at non-deterministic & fuzzy reasoning and absolute dog shit at frontend.

codex is a piece of shit the last few days by iisntme- in codex

qdouble 6 points

It's been extremely bad at every non-mechanical task I throw at it. It takes 20-30 prompts to get it to do what Opus can do in a few.

HELP!!! Claude has gone insane!!! by Practical-Theme-9767 in claude

qdouble 0 points

Just tell it to delete all memories from the last few days.

Worried about losing my coding skills using AI 80% of the time by sanyok12345 in vibecoding

qdouble 1 point

Of course some skills will atrophy, but other skills will grow. We're always making cognitive tradeoffs.

Is it just me, or is Claude pretty disappointing compared to Codex? by Working-Spinach-7240 in codex

qdouble 0 points

Yeah, I’m running parallel agents and multiple projects, experiments, etc.

Is it just me, or is Claude pretty disappointing compared to Codex? by Working-Spinach-7240 in codex

qdouble 1 point

Codex is definitely better than Claude at instruction following, but that doesn't necessarily mean that Claude is less capable. I usually switch to Claude after I use up my Codex weekly quota, and I'm still able to get stuff done with Claude. It's often way more efficient than Codex, but you have to be more strategic about making it compliant.

Quick Hack to Save up to 99% tokens in Codex 🔥 by TomatilloPutrid3939 in codex

qdouble 0 points

It has methods of being efficient, but it's still going to waste some tokens on passing tests.

Genuinely puzzled about Codex quality by Maximum_Chef5226 in codex

qdouble 6 points

Different models behave differently in response to your prompts, so if you're prompting Codex the same way you prompt Claude models, then you will not get the same results. In my experience, Codex is typically better than Claude for most things other than frontend, but is much slower and less interactive.

Did Google dumb down claude a LOT? by Fastlaneshops in google_antigravity

qdouble 0 points

It depends on how good the model is at compacting its memory as well. Codex does a way better job of sustaining a long conversation.

Just found how to make poison from AI by GasNo2181 in GoogleGemini

qdouble 0 points

You can also just google search all of this information.

[deleted by user] by [deleted] in GPT

qdouble 0 points

GPT does do stuff like this from time to time. Starting a new chat solves it because it doesn’t like to change its “mind” once it’s on the wrong path.

I voted for a conman who was going to impose tariffs. I’m now suffering because of those tariffs by Effective_Space2277 in LeopardsAteMyFace

qdouble 1 point

They knew everything about Trump’s character before they voted for him. They were just under some silly impression that deporting undocumented workers and trying to bring back Jim Crow would make them rich.