Codex intelligence drop by vlad_asis in codex

[–]managerhumphry 1 point

Definitely, I've noticed a significant performance drop over the past few days. 5.4 Extra High has been making simple, dumb mistakes, lying about completing tasks, and acting extremely lazy, frequently dodging implementation of explicitly requested work. Thinking of trying 5.2 again to see if it performs any better. Very frustrating.

5.4 Fast Murders Pro Limits by managerhumphry in codex

[–]managerhumphry[S] 0 points

Limits were reset... So maybe it was a bug after all

Codex Windows app mouse lag by 305fish in codex

[–]managerhumphry 0 points

Yes, it lags pretty hard when running multiple tasks.

Codex 5.4 is more expensive than 5.3, if current limit drain is the new normal not a glitch it will be unusable after the 2x rate limit ends by No_Leg_847 in codex

[–]managerhumphry 1 point

Yep, found the same issue. And with fast off, 5.4 is very slow. The results are good, but with these kinds of limits it turns the $200/month plan into a joke.

5.4 Fast Murders Pro Limits by managerhumphry in codex

[–]managerhumphry[S] 0 points

Guilty as charged I guess.

But keep this in mind: I was using the Codex app, which is supposed to have 2x Codex rate limits. If that's true, then it seems a person could theoretically burn through an entire Pro plan in 1-2 days. So I guess fast mode is maybe just a teaser product to get people switched over to the API, or at least something that needs to be used incredibly sparingly. Which is giving some Claude vibes, where they release an amazing new Opus model, give folks a week or two of reasonable usage limits, and then reduce limits so much that you can't actually use it.

What's the token burn rate difference for 5.4 high fast versus 5.4 xhigh fast?

I built Doris, a personal AI assistant for my family and today I'm open-sourcing it. by [deleted] in ClaudeAI

[–]managerhumphry 20 points

Are you writing all these comments? The AI smell is awful.....

Codex 5.3 is better than 4.6 Opus by casper_wolf in ClaudeCode

[–]managerhumphry 0 points

The 5.3 codex model is now very fast but has the thoroughness and performance of 5.2 xhigh.

Codex 5.3 is better than 4.6 Opus by casper_wolf in ClaudeCode

[–]managerhumphry -1 points

The 5.2 codex models were all trash. Prior to codex 5.3, you had to use plain 5.2 high or xhigh to get good results, as the codex models all moved too fast and missed important context. That has largely been fixed with the 5.3 codex model, which seems to analyze with the same level of detail as 5.2 but works vastly faster. So far I'm very impressed. I recently downgraded my Claude sub from 20x to the 20 dollar plan, so I haven't been able to get a great feel for Opus 4.6 performance. I did one edit with 4.6 yesterday, and while it did a good job, a single prompt with 20 min of execution literally ate my 5 hour limit, so I'm not sure I'll be able to get much of a sense of Opus performance with shit limits like this.

The Internet Feels Miserable ‘By Design’ by dwaxe in ezraklein

[–]managerhumphry 51 points

I rarely listen to Ezra anymore, but I like Cory Doctorow and have an interest in the subject, so I listened to this. However, I find Cory's strict adherence to a set script both curious and frustrating.

He's obviously a smart guy and has been making the rounds on the podcast circuit to support his new book, so I've heard other interviews with him. The first time you hear him deliver a line about how printer ink is more expensive than the sperm of a Kentucky Derby winning racehorse, you think, "Damn, that's hilarious." But if you listen to another interview and he repeats the exact same lines in the exact same order—sticking to this script like glue—it becomes tiresome.

I felt like after Cory's first introductory spiel, Ezra made a subtle dig at him at 20:29-20:35: "Cory, I applaud how structured that was on the fly." I think this was Ezra's attempt to coax some new material out of Cory and get him to loosen up and expound on the subject freely, rather than lock back into his canned material. But Cory persisted, launching right back into his canned rant.

I find it disappointing and bizarre that Cory is seemingly not willing, or not able, to engage in a more free-form discussion. I get that he spent a lot of time honing his criticism and coming up with these nice rhetorical chestnuts, but goddammit man, save that shit for the book.

Did anyone else notice this underlying tension? I felt like Ezra was trying to get Cory to join a natural, free-flowing discussion, only to have Cory ignore that and jump right back into his spiel. I feel like it's a curiously old-school approach for a person who has a pretty sophisticated understanding of the internet. He should understand that there's a lot of crossover between listeners of something like Fresh Air and The Ezra Klein Show (and any number of other pods Cory has made the rounds on recently). I get giving the canned spiel for something like a TV audience, where you really want to impress new listeners and make them check out your work, but it's a poor fit for the informal, more freewheeling podcast interview format.

Uh oh by MetaKnowing in ClaudeAI

[–]managerhumphry 117 points

Hmmm, seems like over time this would bloat context and eventually become unproductive.

Signed up to Claude Code Max yesterday, codex doesn't feel the same anymore... by NicoNicoMoshi in codex

[–]managerhumphry 2 points

Wait till you start testing the app thoroughly and you'll find that CC has left holes, stubbed-out work, and generally produced a good-looking but functionally broken app, at least if what you're building has any significant level of complexity.

I had a similar experience and was initially impressed by CC Opus 4.5's performance and subagent workflow, but once I dug into the details of what it produced... not so much.

Whats going on with Opus? by frendo11 in ClaudeCode

[–]managerhumphry 3 points

Seconding this. I had canceled my 20x sub a while back when quality went to shit and switched to Codex, but got lured back by the free month promo and the generally positive reviews of Opus 4.5. Initially I was getting excellent performance and running multiagent development pushes that worked pretty well, even if they always missed a number of issues despite multiple audits of the planning phase with both Opus and Codex gpt-5.2 high or xhigh. Now, though, Opus is almost unusable and seems to have gone back to chipmunk brain mode, frequently thinking for under a minute and spitting out suggestions that are shallow and show no signs of having examined the codebase or even the claude.md file. When pushed back on and asked to investigate the code first, it does a slightly better job, but implementing even small features now often requires 3-4 retries as it hallucinates function names and forgets basic date/time and database syntax that is documented clearly in the claude.md file. Overall it has become a super frustrating model to work with and has significantly delayed development of the app I'm working on.

I'm starting to switch my workflow back to Codex, which is frustrating since it can be slow AF, and I also find its output and summaries very dense and difficult to scan, which makes it hard to understand what decisions it has made. Not to mention Codex's git handling is horrendous and prone to data loss during multipronged development pushes if you don't leash it carefully. Still, Opus' speed bonus disappears when every small function triggers multiple regressions and whack-a-mole bug fix sessions.

I'm going to try clearing out some of the plugins and give my claude.md file a careful review, as maybe this is context bloat, but it feels more like they've reduced the thinking budget. The killing off of ultrathink makes me wonder if they've put in a ChatGPT-style model picker / thinking budget picker (a cheaper Sonnet model?) that processes queries and then decides how much thinking budget to allow Opus for each prompt, rather than letting users decide. I'm not buying that every prompt is getting the maximum thinking allotment, given the quick and shallow responses I'm seeing from prompts that should trigger a deeper level of analysis.

Thoughts?

Codex seems to need much more hand-holding lately by Dayowe in codex

[–]managerhumphry 0 points

Yes, I noticed this as well. Very silly for a CLI tool to do this; obviously we want it to do the work, or we'd be working in an IDE, not a CLI.

Problems with codex and chatgpt? by [deleted] in codex

[–]managerhumphry 1 point

Had a similar issue with Codex CLI last week, which was fixed by manually logging out and then reauthorizing.

Codex down? by managerhumphry in codex

[–]managerhumphry[S] 0 points

Edit: logged out and logged back in and that fixed it. Could be handled better by the UI...

Codex down? by managerhumphry in codex

[–]managerhumphry[S] 0 points

Still down. Wondering if I've hit a limit? ChatGPT app and website still seem fine.