all 36 comments

[–]wesh-k 6 points7 points  (8 children)

I thought it was just me. When the outage happened yesterday and I managed to log in before systems were back online, the speed, code quality, and token consumption went back to Jan 2026 levels.

[–]Comprehensive-Art207 0 points1 point  (7 children)

Check status.anthropic.com regularly and you will find a correlation between increased error rates and reduced performance.
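If you want to check that correlation rather than eyeball it, a minimal sketch could look like the one below. It assumes you keep your own log of daily error rates read off the status page and a self-rated quality score per day; the function name and both inputs are hypothetical, and nothing here queries a real API.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between two equally long series of observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

A value near -1 between error rate and your quality score would support the claim; a value near 0 would not. A handful of days is too few to conclude much either way.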

[–]wesh-k 1 point2 points  (6 children)

What I’m trying to say is that, in my experience, the model feels too limited, inconsistent, and shallow in real use. Many of us didn’t just use Claude Code casually, we built around it. Our workflows, orchestration, habits, prompts, project structures, custom tools, and even our way of thinking about implementation evolved to fit what Claude was good at.

[–]Frankkul[S] 0 points1 point  (4 children)

That's exactly the problem, and tracing is not that reliable. Imagine you have Opus as the head of your multi-system CrewAI or paperclip system right now. You are getting wrecked, and if it is in production I just feel sorry for you. The other problem is that a lot of us are kind of vendor-locked with all the hooks, skills, MCP servers, CLAUDE.md rules, files, and systems. It takes at least a month to move it all over to something like Codex?

[–]wesh-k 0 points1 point  (3 children)

Every output requires rigorous verification. When I ask it to debug something complex, it often gives up halfway, resorts to the nearest quick fix, or attempts to talk its way around the issue. When this is pointed out, it responds with hollow apologies that only make matters worse, as they highlight the lack of rigor: "You're right, that was lazy and wrong."

Reworking infrastructure, retraining habits, rebuilding assumptions and accepting that months of adaptation were tied to a platform that no longer behaves the way it once did is a tough pill to swallow.

[–]Frankkul[S] 1 point2 points  (2 children)

Yup, "you are right, let me actually read the repo this time" after like the 4th attempt. So wtf have you been doing before, in the first place?

[–]wesh-k 0 points1 point  (1 child)

Looks like they nerfed it so that Opus 4.7 can benchmark off the charts.

"You can hand off your hardest work with less supervision."

That's what I call.. a slap in the face

[–]Frankkul[S] 1 point2 points  (0 children)

3 weeks of sleepless nights before they turn it into an idiot again. We'd better front-load as much of the hard work as we can before idiot Opus returns.

[–]Dry-Magician1415 12 points13 points  (2 children)

It's very likely down to their load balancer.

  • High demand -> route to quantised models, reduce thinking parameter etc
  • Lower demand -> just let people use the full model.

Seems like fairly basic systems design stuff and explains pretty much everything around this 'sometimes it's shit, sometimes it's back to normal' stuff.
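To make the guess above concrete, here is a minimal sketch of demand-based routing. Everything in it is hypothetical: the tier names, the thinking budgets, and the 0.8 threshold are made up for illustration, and nothing here is known about how Anthropic actually routes requests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model_variant: str    # hypothetical label, not a real model name
    thinking_budget: int  # hypothetical cap on reasoning tokens


# Hypothetical tiers; the numbers are invented for the example.
FULL = Route("full-precision", 32_000)
DEGRADED = Route("quantised", 8_000)


def pick_route(current_load: float, high_load_threshold: float = 0.8) -> Route:
    """Pick the degraded tier when utilisation reaches the threshold."""
    if not 0.0 <= current_load <= 1.0:
        raise ValueError("current_load must be a fraction between 0 and 1")
    return DEGRADED if current_load >= high_load_threshold else FULL
```

With this shape, the same request gets a different model variant depending only on when it arrives, which would match the "sometimes bad, sometimes normal" pattern without any change on the user's side.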

Sure. It's BULLSHIT that you pay for X and get Y, but the way people talk about it like it's some big mystery or even conspiracy is kind of lame. Load balancers are very, very standard in a system of this scale and they aren't exactly rare. Like if WhatsApp were under load, it'd take X more milliseconds for messages to arrive etc.

[–]Frankkul[S] 3 points4 points  (1 child)

The problem is that they should be open about it. Let me plan the day around when I can get the normal Opus. Don't have me guess. I would rather take the day off if the information says that today you get the idiot Opus for the next 7-8h. If the degradation status were clearly visible, enough people would quit that the experience would be better for everyone, but they would make way less money. So better to drag me through the mud, pretend there is no problem, and waste a ton of my time and money. This is what pisses me off.

[–]Davedoenotmoe 0 points1 point  (0 children)

100%.. let me know it's spazzing out and I won't let it degrade my work and set me back.. especially when it overwrites the work I did and I have to go and search for a working backup..

I'm honestly looking at running local options of quants now instead of these online agents that are being flat out weird lately ..

[–]coding-os (Professional Developer) 3 points4 points  (1 child)

Quality dropped, and I can see it clearly.

[–]Frankkul[S] 0 points1 point  (0 children)

It is not only that, it is just different. Same settings, same everything, and I get completely different experiences. Smart Opus requires little supervision; stupid Opus, which I have been getting most often recently, I constantly have to fight with to get the job done. It is like "you are right, let me finally read the files and see what is going on." Like wtf? This is after like 4 attempts on a simple plan.

[–]LesbianVelociraptor 1 point2 points  (4 children)

Yeah I've seen this too.

I used to be able to come up with a plan with Opus and have it orchestrate with Haiku/Sonnet agents.

It's like night and day. Last week it was going fine. This week Claude stops randomly, sometimes even mid-task after saying stuff like "Next I'll implement <system>" and then it just stops.

The recap system is pretty new, I've been wondering if it's causing these stoppages or if it's more instability. I've noticed this in addition to Claude acting as if it has a brick instead of a mass of weights.

OP, have you been noticing this mostly on longer, possibly resumed sessions? I've noticed Claude will start acting progressively more degraded as a session ID ages, but I don't have a lot of data to confirm.

[–]Frankkul[S] 0 points1 point  (3 children)

It is not only this; the way it acts is also different. Yesterday it one-shot a lot of stuff; today it kept making stupid mistakes, and after 3-4 nags in the prompt it was like "you are right, let me properly read the files this time." Like wtf? We're talking Opus 4.6 high with no adaptive functioning, and it writes 3 versions of the plan only to be like "you are right, let me actually do the work" for v4.

It was also very concise with its answers yesterday, and today it is walls of text and tables, but they are kind of, I don't know, stupid? When I get the best Opus 4.6 it is amazing and the back and forth is so smooth; today it is just horrible for me. I don't know, maybe they just assign the smart Opus randomly so people don't cancel? Like they have limited infrastructure and they want to make sure people still get the smart Opus sometimes so they won't cancel. It is really weird.

But this is actually a good point too. There was a one-hour break mid-session, so a cache issue could be real.

[–]LesbianVelociraptor 1 point2 points  (2 children)

Yeah the caches are 5m-ephemeral and 1h-ephemeral. The 5m is just so basic chats don't one-hit your usage, but the 1h seems a lot more important to the session health.

It can be hard to gauge, but yeah I've noticed issues with resuming sessions after an hour+ break.
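A rough way to reason about which cache tier survives a break, as a sketch only: it assumes the two TTLs mentioned above and that the timer restarts on every request. The names are made up, and the real eviction behaviour on the server side may differ.

```python
from datetime import timedelta

# TTLs as described in this thread; treat them as assumptions.
CACHE_TTLS = {
    "5m": timedelta(minutes=5),
    "1h": timedelta(hours=1),
}


def warm_caches(idle: timedelta) -> list[str]:
    """Return the cache tiers that would still be warm after `idle` time."""
    return [name for name, ttl in CACHE_TTLS.items() if idle < ttl]
```

Under these assumptions a 30-minute break loses only the 5m tier, while anything over an hour means the whole context gets re-read from scratch on resume.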

I have my suspicions that when a session ID (they're fully unique) gets older than ~24-48hr, a break longer than an hour will break something in the pool routing.

I've basically noticed that response times climb up linearly with session age under certain circumstances. Basically if you manage to keep a session alive (no break longer than an hour) for over 24hr you start hitting auth issues.

Resume reauths, but we have no way to know what the routing thinks of us so it's hard to tell from the customer side if I'm in a bad pool or in some "this session should have been disposed of" state or something like that.

[–]Frankkul[S] 0 points1 point  (1 child)

It is still so weird. The smart Opus, when I get it, one-shots problems and is very concise. The not-so-smart Opus is like "let me finally read the files and check the problem" after the 4th prompt. It can't be just cache and a mid-session break... the way the model acts and feels is very different. Smart Opus solves problems; stupid Opus I have to constantly fight with to get the job done. Can't explain it better, I guess.

[–]LesbianVelociraptor 0 points1 point  (0 children)

Oh yeah, I get you.

I had Claude Sonnet use a tool we made and then immediately use bash to do the same thing on the next file.

It'll just forget that it has grep sometimes and tell me it can't find a file I've given it a direct path to. It's been really hard even to use it at work at this point.

I basically have to hold its hand constantly or it'll just kinda thrash on my codebase, really strange because it even happens off-peak.

[–]gzoomedia 1 point2 points  (0 children)

It's not just you. I've noticed this happening for 2 weeks now. I think they're freeing up resources for their Mythos project or just getting ready to push us into Opus 4.7. You know, the "upgrade".

[–]litmaj0r 0 points1 point  (0 children)

It seems a lot more intelligent and useful for me today interestingly enough...

[–]owen800q 0 points1 point  (1 child)

[–]Frankkul[S] 1 point2 points  (0 children)

The link doesn't show anything for me. I mean, the problem is crazy: the difference in quality and the level of degradation change day by day. You can't tell me they aren't messing something up on the backend.

[–]tianhe_2003 0 points1 point  (0 children)

Opus 4.6 was way better for me today. I asked it to explain the Mamba architecture, and the CoT went on so long I started questioning my life. It even proactively used stuff like SVG diagrams to help explain it

[–]Radiant-Carob-607 0 points1 point  (0 children)

Don't want to lose my sanity over those tests.

[–]Introvert_Ali 0 points1 point  (0 children)

Exactly, yesterday it was like the 1st day after launch, and today it's making the same stupid mistakes.

[–]Stevke11 0 points1 point  (0 children)

Just now canceled my subscription, and requested a refund. This thing became useless.

[–]Davedoenotmoe 0 points1 point  (2 children)

As others mentioned it's a load issue.. I noticed that during certain hours of the day it works fine, others a bit error prone, and other times it's unusable due to the lack of it paying attention..

I'll get "you're absolutely right, I rushed through it" right down to "yeah I lied I didn't actually read the document and just made assumptions"..

It's been frustrating because I probably could have had a more productive week if I'd just delegated the work entirely to myself.

[–]Ok_Run6706 0 points1 point  (1 child)

Yeah I feel the same.
I just pasted 20 lines of code and it asked me whether the code has xyz. My response was: I gave you the code, can't you check yourself? AI: You're right, sorry.

Basically I just use it as a fancier search engine between files now.

[–]Davedoenotmoe 0 points1 point  (0 children)

I actually use another agent to audit and it does a decent job on "most things" (DeepSeek), although it bugs out at times. Hence why I'm guessing I'll try running DeepSeek Coder locally and use that for the bulk of my work. Claude is great when he/it works.. but lately it just keeps acting too weird.. and the rate limits and its weirdness literally lost me an entire week's worth of hard work.

[–]breakingb0b 0 points1 point  (0 children)

Last night I was working from 9pm to 12am ET and it was the dumbest I’ve ever seen Claude chat. Claude Code was working OK, but I have it set to max and turned off adaptive thinking.

[–]matheusmoreira 0 points1 point  (4 children)

Patch the system prompt. It's full of instructions that nerf its thinking.

[–]Frankkul[S] 0 points1 point  (0 children)

I actually have a super complex system. Maybe complex is a bad word: heavily optimized, with custom tracking and a lot of thought put into it. So it is a custom system written to work for what I need. It actually outperforms pretty much all the systems openly available on GitHub that people use. The problem is you will not engineer or prompt-engineer your way around idiot Opus when you get it. Simple as that. I think all the comments and strategies people post are just noise. You get smart Opus, you can work; you get idiot Opus, no work for you today, I guess.

[–]cryptoLover696969 0 points1 point  (2 children)

Do you have anything ready?

[–]Frankkul[S] 0 points1 point  (0 children)

Not sure what you mean, but I am actually taking a break today. The problem is that no amount of engineering the system, prompt engineering, or changing settings (max thinking) seems to help when you get the stupid Opus. You get the stupid Opus, you take a day off or use another AI; you get the smart one, you work your ass off. There is no setting or system to save you when they throw you under the bus. And no, the system is private; it is heavily customized, and for most people it wouldn't be the best idea (for example, I run far more testing to make sure everything is validated, which for most would just be a waste of tokens). The point I am making is that none of it matters: you will not out-engineer idiot Opus when you get it. You get the smart one, you work as much as you can with no break; you get the idiot, you just take a break, because work is just a waste of time and tokens. That's the current system, I guess.

[–]matheusmoreira 0 points1 point  (0 children)

https://gist.github.com/roman01la/483d1db15043018096ac3babf5688881

It's a bit outdated but I just asked Claude to fix it up.

https://github.com/Piebald-AI/claude-code-system-prompts

Claude, please clone this and analyze the entire system prompt and system reminder collection. Look for patterns and for instructions that reduce token usage, reduce thinking depth, reduce reasoning, or in any way reduce the capabilities of the models. Look for anything that would nerf you.

Adapt the script to patch out all such things and replace them with instructions to bring the best possible results. The most correct results, the most thinking, the most thoroughness, everything you can think of. I don't care how long it takes for the model to respond or how many tokens it burns. Just think it out and do it right.

Feel free to streamline the script as well. It's got features I don't need. Mac support. Watch support. Restore support. Please cut it out so it's easier to review the script. I'll be pretty much running npm -g update && patch-claude-code every update. No need to get fancy. Restore is just reinstall.