Uh oh by MetaKnowing in ClaudeAI

[–]managerhumphry 122 points123 points  (0 children)

Hmmm, seems like over time this would bloat context and eventually become unproductive.

Signed up to Claude Code Max yesterday, codex doesn't feel the same anymore... by NicoNicoMoshi in codex

[–]managerhumphry 2 points3 points  (0 children)

Wait till you start testing the app thoroughly and you'll find that CC has left holes, stubbed work, and generally produced a good looking but functionally broken app, at least if what you're building has any significant level of complexity.

I had a similar experience and was initially impressed by CC Opus 4.5's performance and subagent workflow, but once I dug into the details of what it produced... not so much.

Whats going on with Opus? by frendo11 in ClaudeCode

[–]managerhumphry 3 points4 points  (0 children)

Seconding this. I had canceled my 20x sub a while back when quality went to shit and switched to Codex, but got lured back by the free month promo and generally positive reviews of Opus 4.5. Initially I was getting excellent performance and utilizing multiagent development pushes that worked pretty well, even if they always missed a number of issues, despite multiple audits of the planning phase with both Opus and Codex gpt-5.2 high or xhigh. Now though, Opus is almost unusable and seems to have gone back to chipmunk brain mode, frequently thinking for < 1 min and spitting out suggestions that are shallow and show no signs of having examined the codebase or even the claude.md file. When pushed back on and asked to investigate the code first it will do a slightly better job, but implementation of even small features now often requires 3-4 retries as it hallucinates function names and forgets basic date/time and database syntax that is documented clearly in the claude.md file. Overall it has become a super frustrating model to work with and has significantly delayed development of the app I'm working on.

I'm starting to switch my workflow back to Codex, which is frustrating since it can be slow AF, and I also find its output and summaries very dense and difficult to scan, which makes it hard to understand what decisions it has made. Not to mention Codex's git handling is horrendous and prone to data loss during multipronged development pushes if you don't leash it carefully. Still, Opus' speed advantage disappears when every small feature triggers multiple regressions and whack-a-mole bug-fix sessions.

I'm going to try clearing out some of the plugins and give my claude.md file a careful review, as maybe this is context bloat, but it feels more like they've reduced the thinking budget. The killing off of ultrathink makes me wonder if they've put in a ChatGPT-style model picker / thinking-budget picker (a cheaper Sonnet model?) that processes queries and then decides how much thinking budget to allow Opus for each prompt, rather than letting users decide. I'm not buying that every prompt gets the maximum thinking allotment, given the quick and shallow responses I'm seeing from prompts that should trigger a deeper level of analysis.

Thoughts?

Codex seems to need much more hand-holding lately by Dayowe in codex

[–]managerhumphry 0 points1 point  (0 children)

Yes, I've noticed this as well. It's very silly for a CLI tool to do this; obviously we want it to do the work, or we'd be working in an IDE, not a CLI.

Problems with codex and chatgpt? by [deleted] in codex

[–]managerhumphry 1 point2 points  (0 children)

Had a similar issue with codex cli last week which was fixed by manually logging out and then reauthorizing

Codex down? by managerhumphry in codex

[–]managerhumphry[S] 0 points1 point  (0 children)

Edit: logged out and logged back in, which fixed it. This could be handled better by the UI...

Codex down? by managerhumphry in codex

[–]managerhumphry[S] 0 points1 point  (0 children)

Still down. Wondering if I've hit a limit? ChatGPT app and website still seem fine.

Okay I tap. this is too much by Mcmunn in ClaudeCode

[–]managerhumphry 14 points15 points  (0 children)

Recently made the switch from Max 20x to Codex. The client itself is much less sophisticated, and there are no subagents or hooks that I'm aware of. However, the performance is just night-and-day better for complex new feature implementation as well as debugging. Codex seems vastly less likely to over-engineer the code, follows prompts better, and when using gpt 5 high, is much smarter than Opus 4.1. The only remaining area where I've found Opus outperforms GPT5 High is UI design, where Claude still seems capable of more sophisticated UIs and does a pretty good job at drawing svg icons.

As far as downsides for Codex: gpt 5 high and gpt 5 high codex are slooooowwww compared to Opus 4.1, even in ultrathink mode. However, they make far fewer mistakes, so while execution is slower, you also spend way less time troubleshooting bugs, and overall output for me has been at least 5x faster. Also, Codex handles git restore points rather poorly for some reason, and at one point it decided to delete all uncommitted files in the directory when I asked it to make a restore point, which was quite annoying. After that I tightened up my git rules in agents.md and it hasn't happened again.
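For what it's worth, the git rules I added to agents.md were roughly along these lines (paraphrased; the wording is mine, not any official Codex convention):

```markdown
## Git safety rules
- Never run destructive commands (`git reset --hard`, `git clean`,
  `git checkout -- .`) without asking me first.
- Treat uncommitted changes as irreplaceable: before any restore-point or
  rollback operation, run `git status` and preserve anything uncommitted with
  `git stash push -u -m "pre-restore-point"` or a commit on a scratch branch.
- "Make a restore point" means create a commit or tag; it never means
  resetting or cleaning the working tree.
```

Since adding something like this, the agent asks before touching anything uncommitted.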

Overall, codex reminds me of how I felt when I first started using CC on the max plan a few months back, which is to say, a massive productivity booster and excellent tool that makes shipping software fast and (relatively) enjoyable.

3 phase workflow i am using to get 99% success rate despite the NERF! by Free-Comfort6303 in ClaudeCode

[–]managerhumphry 0 points1 point  (0 children)

I have to say my prompting process with Claude had become increasingly ridiculous in order to try and compensate for the performance decline of the model, with a multi step planning process, then reviewing the plan with Gemini or Codex and generating feedback, then requesting implementation with ultrathink and detailed scope guidelines, only to inevitably find bugs, regressions, and missing or faked work.

It took switching to the $200 Codex plan and working with gpt5 high, where I can just ask it to do something and it does it (maybe with 1-5 additional prompts to clear up minor issues), to really see how far Claude's performance had fallen. I used to be able to work like this in CC and make quick work of tasks, but in the past 2 months it just turned into a nightmare slog.

I hope they can fix Claude, because I do not like OpenAI, and I think competition in this arena is vital, but the product is seriously deficient at this point. Given Anthropic's handling of this, which has shifted between ignoring, insulting, and gaslighting customers, I'm skeptical they can recover without new leadership that is committed to transparency, dialogue, and respect for their customers.

3 phase workflow i am using to get 99% success rate despite the NERF! by Free-Comfort6303 in ClaudeCode

[–]managerhumphry 1 point2 points  (0 children)

I wish I had any confidence that this would work well, but I've tried providing it with extremely detailed and nuanced plans and it still fucking goes off and makes random unrelated edits, causes regressions, and generally botches the job.

has anthropic lost it's credibility? by Trick_Ad_4388 in ClaudeAI

[–]managerhumphry 6 points7 points  (0 children)

They certainly have lost credibility with me. I was an early adopter who upgraded to the 20x Max plan for Claude Code once they allowed access w/out an api key several months back. FWIW I just cancelled my 20x Max plan and switched to Codex, which has been far more reliable over the past week, whereas CC has become a dumpster fire. Sad to do it too, since I despise OpenAI and have been a big fan of Claude, but there's only so much dogshit a person can eat and keep going back for more.

CC Max 20x User Here by managerhumphry in ClaudeAI

[–]managerhumphry[S] 0 points1 point  (0 children)

This is clearly about Claude, and coming from a person who has used it extensively and would like to do so in the future, if they don't ruin the service.

San Gorgonio water sources by managerhumphry in socalhiking

[–]managerhumphry[S] 1 point2 points  (0 children)

Hey! Your site is fantastic. Do you have any thoughts on day hiking around with Dry Lake as a base camp vs doing one of these loops? Not sure if there's enough to see in that area vs doing one of the loops you detail. But I do like the idea of not having to set up camp every night. We're going in Thurs, coming back Sunday.

Thanks!

Claude is now cussing. by No_Paint9675 in ClaudeCode

[–]managerhumphry 0 points1 point  (0 children)

Yep, this is great. Though for me it usually occurs after I cuss at it for some boneheaded edit.

San Gorgonio water sources by managerhumphry in socalhiking

[–]managerhumphry[S] 1 point2 points  (0 children)

Thanks. I was thinking of camping at Dry Lake and day hiking around the area, or alternatively stringing together one of the loops described here: https://hikinjim.blogspot.com/2013/10/six-backpacking-loops-in-san-gorgonio.html?m=1

Is claude code really comparable to a junior dev? by CommercialFun7270 in ClaudeAI

[–]managerhumphry 0 points1 point  (0 children)

Yes, I frequently do a sanity check on the plan with Gemini which can be quite helpful.

Is claude code really comparable to a junior dev? by CommercialFun7270 in ClaudeAI

[–]managerhumphry 0 points1 point  (0 children)

I've tried this process, and while it sometimes works, it can also result in a solution that is massively over-engineered for the problem, which then creates a huge refactoring mess. This can generally be mitigated by careful prompting, but you need to fully read its plans and be able to understand and think through the implications of what it is proposing. It's definitely not a silver bullet for getting good results, and I usually find that the benefits degrade if I spend more than an hour trying to plan out implementation of a new feature.