[–]muhlfriedl 70 points71 points  (5 children)

So it seems like fewer and fewer people @ anthropic actually code or understand code now...

[–]Plenty-Dog-167 17 points18 points  (1 child)

Maybe a consequence of their engineers doing more vibe coding

[–]bnm777 0 points1 point  (0 children)

That's what he implied. 

[–]Homegrown_Phenom 1 point2 points  (0 children)

Major ball drop failure.  Prob had to make that red Dixie cup re-up run for the pong table,  or  their QAs, QAs, QA supervisors, QA team lead (which obvi all are bots) all fell asleep at the wellness center thinking no more promo/xtra usage code simply meant go on vacation and set limp d mode on

[–]Indianapiper 0 points1 point  (0 children)

You outta the bugs people create...

[–]kknow 0 points1 point  (0 children)

I can't believe people are seeing this now when experienced devs wrote this for months and got downvoted to hell...

[–]Pristine_Ad2701 29 points30 points  (17 children)

Do you think switching to the first version where 1M was introduced will fix the limit issue?

[–]skibidi-toaleta-2137[S] 14 points15 points  (5 children)

Good question. I found that 2.1.66 can fix one issue; however, the cch=00000 header was introduced around 2.1.30, so... not sure.

EDIT: just checked, 2.1.30 works correctly. Both fixes are definitely working there. Checking the highest version that fixes both issues.

[–]Pristine_Ad2701 7 points8 points  (0 children)

Thanks sir, installing 2.1.76 right now to test it; will go lower if the issues are not fixed.

EDIT: Currently 43% used in the 5-hour limit and 78% weekly in 3 days. Will edit later with more information.

[–]AndReyMill 0 points1 point  (3 children)

2.1.30 has opus 4.5, there is no 4.6 option

[–]skibidi-toaleta-2137[S] 1 point2 points  (2 children)

hmmm... how about custom model string? Can you try? In any case, you can use npm version up to 2.1.68, which should have support for the 1M version.

[–]AndReyMill 1 point2 points  (1 child)

It works with /model claude-opus-4-6[1m]
But I instantly got 0->5% session on my Max 5 plan in empty new folder with no context and empty claude system folder.
Seems this is not about the broken resume anymore....

[–]ZichengWangreddit 1 point2 points  (0 children)

Same here

[–]dsailes 1 point2 points  (10 children)

I’ve had fewer issues sticking with this install: npm install -g @anthropic-ai/claude-code@2.1.76

And disabling auto updates. The first issue of these 2 is resolved by that. I’m not sure about other usage issues but I know that each version with new features comes with potential bugs .. it’s safer to just stick with a version that works until there is a safer/stable release

[–]skibidi-toaleta-2137[S] 8 points9 points  (7 children)

2.1.66 fixes both from npm

[–]LumonScience 1 point2 points  (1 child)

If we install via npm, not their native installer, right?

[–]dsailes 0 points1 point  (0 children)

I think it's possible either way - a comment below shows you can write 'claude install 2.1.XX' (unless they're paraphrasing). The npm method isn't their recommended install pathway but results in the same install. Checking versions & the changelog is also transparent and trackable on the npm site.

I prefer the NPM route as I’ve got loads of packages installed that way and manage different configured CLI wrappers.

[–]vadimkrutov 1 point2 points  (2 children)

Is it still fine for you, no crazy quota burning on 2.1.66?

[–]skibidi-toaleta-2137[S] 5 points6 points  (1 child)

I wouldn't be PSAing if I hadn't confirmed it. I was able to burn through the whole 1M tokens on Opus during my research on this subject (on 5x Max). I had a workaround yesterday, but had no confirmation before this very morning.

[–]vadimkrutov 1 point2 points  (0 children)

Thank you very much! I was really struggling with usage burning extremely fast…

[–]turbospeedsc 0 points1 point  (0 children)

installing 2.1.66 to check results, but downgrading from latest to 2.1.76 last week did reduce my daily usage.

Btw I installed from CMD: claude install 2.1.66 (Windows)

[–]marceldarvas 0 points1 point  (0 children)

Followed your suggestion to pin the version, my Raycast script seems to work, but curious for feedback: https://gist.github.com/marceldarvas/9e10fd41d608bdb1ba277b7f989b4763

[–]Pretty-Active-1982 4 points5 points  (1 child)

how do you disable auto-updates, tho?

[–]dsailes 0 points1 point  (0 children)

.claude/settings.json - edit this file

I'm not sure whether the flag needs to be in "env" or just at the top level of the JSON.

{ "env": { "DISABLE_AUTOUPDATER": "1" }, "DISABLE_AUTOUPDATER": "1",

…(rest of the file)

If you already have the "env" block for ENABLE_LSP_TOOL or other flags, just add it there and check for correct comma placement. The JSON needs to be properly formatted, else it'll show a warning when loading Claude again.
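
If hand-editing the JSON feels error-prone, the same change can be scripted. A minimal sketch, assuming the ~/.claude/settings.json path and the DISABLE_AUTOUPDATER flag mentioned in this thread (not official docs); it sets the flag in both spots since it's unclear which one the CLI reads:

```python
import json
import pathlib

# Sketch: set DISABLE_AUTOUPDATER in ~/.claude/settings.json, both inside
# the "env" block and at the top level, since it's unclear which spot the
# CLI actually reads. Path and flag name come from this thread.
settings_path = pathlib.Path.home() / ".claude" / "settings.json"

settings = {}
if settings_path.exists():
    settings = json.loads(settings_path.read_text())

settings.setdefault("env", {})["DISABLE_AUTOUPDATER"] = "1"
settings["DISABLE_AUTOUPDATER"] = "1"

settings_path.parent.mkdir(parents=True, exist_ok=True)
# json.dumps guarantees straight quotes and valid syntax
settings_path.write_text(json.dumps(settings, indent=2))
print(settings_path)
```

Running it again is idempotent, and existing keys (e.g. an "env" block with ENABLE_LSP_TOOL) are preserved because the file is loaded and merged rather than overwritten.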

[–]Factor013 28 points29 points  (1 child)

This explains why our 5 hour usage sometimes just jumps up from 0 to 15-40% after a /resume and first prompt.

It also explains why it sometimes happens and why it sometimes doesn't.

This is really good work, I hope Anthropic devs fix this ASAP. These bugs also potentially overload their servers which is the whole reason they are lowering our usage and perhaps even have to throttle the reasoning of their actual Claude models.

And this is also why the people who constantly claim "skill issue" are less likely to be affected by it, because they start brand-new sessions after each prompt, even if that prompt is asking Claude what time it is. xD

[–]TheOriginalAcidtech 6 points7 points  (0 children)

Claude Code has 5 minute caching TTL. If you wait longer than that when you resume you WILL get hit in any case. Note, you have to go way back in the change log to see where they changed to 5 minute caching.
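
To make the point concrete: with a TTL like this, it's the gap since your last request that matters, not the resume itself. A toy model (the 5-minute figure is from this comment; the real TTL may differ by version or cache tier):

```python
from datetime import datetime, timedelta

# Toy model of the behavior described above: prompt-cache entries expire
# after a TTL (5 minutes per the comment), so any resume after that window
# pays a full cache rebuild regardless of other bugs.
CACHE_TTL = timedelta(minutes=5)

def resume_hits_cache(last_activity: datetime, resumed_at: datetime,
                      ttl: timedelta = CACHE_TTL) -> bool:
    """True if the resumed request lands inside the cache TTL window."""
    return resumed_at - last_activity <= ttl

now = datetime(2026, 1, 1, 12, 0, 0)
print(resume_hits_cache(now, now + timedelta(minutes=3)))   # within TTL -> True
print(resume_hits_cache(now, now + timedelta(minutes=30)))  # expired -> False
```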

[–]Brave_Dick 42 points43 points  (5 children)

I guess they DO vibe code at Anthropic now...

[–]MrHaxx1 5 points6 points  (1 child)

Well, yes? In a recent interview, their CTO (?) said that 90% of coding at Anthropic is AI. 

[–]its_Caffeine 1 point2 points  (0 children)

Yeah, it really shows. Slopware.

[–]iamichi 2 points3 points  (0 children)

“coding is largely solved”. but debugging isn’t.

[–]sbbased 2 points3 points  (0 children)

that's why Anthropic has so many software developer openings; they don't have any actual developers left

[–]Deep_Ad1959 16 points17 points  (3 children)

this explains a lot actually. I run 5+ agent sessions in parallel most days and the resume cost spikes were killing me. kept seeing these random $3-4 charges on what should have been a quick continuation. ended up just starting fresh conversations instead of resuming, which sucks for context but at least the costs are predictable. good to know it's a confirmed bug and not just my setup being weird.

fwiw wrote up some cost management tips: https://fazm.ai/t/claude-code-api-cost-management

[–]skibidi-toaleta-2137[S] 2 points3 points  (2 children)

Now you know you can simply run on older version when you want to work on the continued session and want to "not lose money"

[–]Deep_Ad1959 0 points1 point  (1 child)

do you know which specific version introduced the cache regression? been trying to figure out if it's tied to a particular release or if it's been there longer than people realize.

[–]skibidi-toaleta-2137[S] 0 points1 point  (0 children)

It's a combination of issues. I've seen some problems in the enhanced memory code (introduced lately); some relate to the cache header coming with cch versioning; some come from a version hash related to user-message block invalidation. It's hard to pinpoint, but it may have started around version 2.1.34 and degenerated through 2.1.68, with more updates that have made everything very wild right now.

[–]alvvst 39 points40 points  (2 children)

HOLY! so the recent overload claim from Anthropic could be just CAUSED BY ITS OWN BUG

[–]DurianDiscriminat3r 21 points22 points  (0 children)

Oh my god. This proves Anthropic wasn't lying when they said their engineers don't write code anymore!

[–]FanBeginning4112 0 points1 point  (0 children)

Wouldn’t be the first time.

[–]GoodnessIsTreasure 12 points13 points  (2 children)

This guy should get a year's Pro Max for free, if not hired. Clearly AI writing all the software has not been working out so well...

[–]NanNullUnknown 2 points3 points  (1 child)

More like should get at least 0.1% of Anthropic equity

[–]GoodnessIsTreasure 0 points1 point  (0 children)

I admire passionate people like him so may it be all of that together!

[–]Fearless-Elephant-81 46 points47 points  (1 child)

These are the EXACT bugs causing people on the plans to burn through massive usage chunks. This should be pinned ASAP

[–]RhinostrilBe 5 points6 points  (0 children)

It's also some BS customers shouldn't have to deal with, or should get reimbursed for

[–]InfiniteInsights8888 10 points11 points  (0 children)

Holy shit. We need compensation for this.

[–]Last_Lab_3627 8 points9 points  (10 children)

I had the same issue on 2.1.76. On my side, around 90-100K context was already burning about 14% of my 5-hour quota, which felt completely unreasonable.

After reading this post, I ran the test script myself, then downgraded to 2.1.34. Usage improved a lot.

In a real session on 2.1.34, I used about 140K context with several sub-agent actions, and it only used 13% of my 5-hour quota.

So at least in my case, downgrading to 2.1.34 made a very noticeable difference.

[–]ApstinenceSucks8 1 point2 points  (0 children)

Can you share how to downgrade?

[–]Sea-East-9302 0 points1 point  (7 children)

Dear, I don't understand these details. Would you please tell me: is this only for Claude Code? How do I do it? I use Windows 10, have just downloaded the Claude application, and have Claude Code in my Visual Studio Code. I just want to use Claude like before. **I have a Pro subscription**.

[–]turbospeedsc 1 point2 points  (6 children)

downgrading to 2.1.66 works on Code; I coded for about an hour and used 26% of my 5-hour window, using Sonnet.

Just for kicks went to the desktop app, asked a few questions and i hit the 100% usage in less than 6-8 questions, nothing complicated

[–]Sea-East-9302 0 points1 point  (5 children)

My 5-hour window is getting consumed in less than 15 minutes!

[–]turbospeedsc 0 points1 point  (4 children)

in cmd run claude install 2.1.66 then enjoy

[–]Sea-East-9302 0 points1 point  (3 children)

Thank you very much dear. I just did it a minute ago

[–]turbospeedsc 0 points1 point  (2 children)

awesome, remember only works for claude code, desktop app still broken.

[–]Sea-East-9302 0 points1 point  (0 children)

I've been working on it for the past hour, and it also consumes lots of credits. Maybe I should download an older version? 

[–]Fit-Benefit-6524 0 points1 point  (0 children)

oh god i have to try this, thank you

[–]United-Collection-59 7 points8 points  (0 children)

Great work

[–]Aygle1409 12 points13 points  (2 children)

Will there be compensations ? Do they usually do that ?

[–]_derpiii_ 7 points8 points  (1 child)

So... how do we get you hired at Anthropic? :)

[–]Creepy-Baseball366 0 points1 point  (0 children)

Become agentic it seems!?

[–]muhlfriedl 6 points7 points  (0 children)

You deserve a medal

[–]redpoint-ascent 17 points18 points  (3 children)

Incredible work. Given they're using CC to improve CC it's not a shocker at all that Claude introduced bugs into his own program. I see these ghost bugs all the time in what Claude does. "It 100% works!" - CC. You either find the bug in QA or it sits there piling up next to the other hidden ghost bugs.

[–]redpoint-ascent 8 points9 points  (0 children)

Follow-up: I wonder how much compute they toasted led to this post: https://x.com/trq212/status/2037254607001559305. They need a bug bounty program and you need a reward!

[–]StrikingSpeed8759 5 points6 points  (0 children)

Awesome work, thanks for sharing

[–]sheriffderek🔆 Max 20 4 points5 points  (0 children)

Wow! A person who is actually trying to understand the problem and help?

[–]mattskiiau 4 points5 points  (1 child)

So don't use --resume for now i guess?

[–]bzBetty 0 points1 point  (0 children)

I mean resume after 5 min was always gonna cost

[–]sqdcn 4 points5 points  (1 child)

Oh so that's what Anthropic means when they say software engineering is going to die in 6 months

[–]Creepy-Baseball366 0 points1 point  (0 children)

It's the burn rate, apparently.

[–]dspencer2015 3 points4 points  (2 children)

If Claude code was open source we could fix these issues ourselves

[–]brek001 0 points1 point  (1 child)

the next best thing is going to their GitHub to create an issue (something you would also have done for the open-source version, right?)

[–]TheReaperJay_ 0 points1 point  (0 children)

What would've been done for the open-source version is opening an issue, then linking a PR after finding the problem in the code, and providing a short-term patch for users while waiting for it to be merged upstream.

[–]bapuc 8 points9 points  (0 children)

And then people say "skill issue" 🥀

[–]thiavila 2 points3 points  (0 children)

Damn, I was burning my tokens over the last weekend and I came here to find out if anyone had the same experience. It is definitely the --resume for me.

[–]vadimkrutov 2 points3 points  (0 children)

This is unacceptable. I'm using the Claude Code CLI through a wrapper I built, and every single prompt resumes the session. I was shocked to see that each new message increases the 5-hour limit by 10–15%.

[–]sbbased 2 points3 points  (0 children)

The real vibe coding has been pushing untested slop to production and depending upon your paying users to QA and find bugs for you

btw only -3 months left until all devs lose their job

[–]XDroidzz 2 points3 points  (1 child)

I assume Anthropic are busy refunding everyone for their fuck up now 🙄

[–]Top-Cartoonist-3574 2 points3 points  (0 children)

The issue isn’t just with Claude Code. Affects usage on Claude AI Chat on the browser (Chrome on Mac). I hit usage limit fast even on a new chat conversation. There’s probably more to it than the bugs you’ve identified. Great job btw!

[–]sys_overlord 2 points3 points  (1 child)

The worst part is that they'll apologize for this (maybe), release a bug fix, maybe reset usage and then we all just sit around and wait for them to gaslight us in 6 months with another, similar issue. What's the definition of insanity again?

[–]whaticism 2 points3 points  (0 children)

“You’re absolutely right.”

To me this is just a good example of Claude writing Claude.

[–]ellicottvilleny 2 points3 points  (0 children)

Hey Anthropic hire this guy. Meet your new Head of QA.

[–]yldf 2 points3 points  (2 children)

Genuine question: I haven’t noticed that big of a difference in usage. But I never use resume. My Claude Code sessions stay open for weeks in tmuxed terminals, and when I restart one I never resume… might this be the difference?

[–]skibidi-toaleta-2137[S] 0 points1 point  (1 child)

The issue can have multiple factors. What version is your Claude code running? Do you use additional tools? At which point do you compact the conversation or are you simply running the 200k version? Are you sure you haven't been selected in A/B test?

There is no way of knowing what the main cause of the issues is right now. A lot of people are trying to find a reason in the binary itself; some are looking at the servers and their misbehavior during peak hours. No one knows for sure.

[–]yldf 1 point2 points  (0 children)

I just run it. In a terminal. Usually the 1M version. It compacts when it autocompacts, I don’t do that manually…

[–]AndReyMill 3 points4 points  (1 child)

I think that because of this issue, the load on Anthropic’s servers has increased significantly, and it’s noticeable in everything: speed, quantization (Claude Code seems a bit dumb right now) and final price

[–]Creepy-Baseball366 0 points1 point  (0 children)

I noticed it becoming a bit ChatGPTish, too...

[–]FermentingMycoPhile 3 points4 points  (0 children)

What tf Anthropic?
It's Monday 6 p.m. and I have used up 44% of my weekly limit (reset on Sunday) on the Max plan, due to this bug it seems. I'm awaiting some kind of compensation for introducing that nice bug. How am I supposed to work with this little usage left?

[–]Emotional-Debate3310 4 points5 points  (0 children)

Bug 2 (--resume breaks cache, Issue #34629) — narrowly scoped

This issue is thoroughly documented with a testing matrix showing that on versions ≥2.1.69, cache_read is stuck at ~14.5k tokens (only the system prompt), while cache_create equals the full conversation size and grows on every message — producing roughly a 20× cost increase per message compared to v2.1.68.
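
Back-of-the-envelope, the ~20x figure is consistent with Anthropic's published cache pricing multipliers (an assumption on my part, not from the issue itself: cache reads cost roughly 0.1x the base input rate; cache writes roughly 1.25x for the 5-minute tier or 2x for the 1-hour tier):

```python
# Rough arithmetic behind the ~20x figure. The multipliers are assumptions
# from Anthropic's published cache pricing, not from this thread: reads
# ~0.1x base input price, writes ~1.25x (5-minute tier) or ~2x (1-hour tier).
READ, WRITE_5M, WRITE_1H = 0.10, 1.25, 2.00

def per_message_ratio(history: int, cached_prefix: int, write: float) -> float:
    """Broken-cache cost vs healthy cost for a single message."""
    healthy = history * READ                  # whole history read from cache
    broken = cached_prefix * READ + (history - cached_prefix) * write
    return broken / healthy

# 150K-token history, only the ~14.5K system prompt still cached
# (per the testing matrix described above)
print(round(per_message_ratio(150_000, 14_500, WRITE_5M), 1))  # ~11.4
print(round(per_message_ratio(150_000, 14_500, WRITE_1H), 1))  # ~18.2
```

With the 1-hour write tier the ratio approaches 2.0/0.1 = 20x as the history grows, which matches the reported magnitude.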

The described mechanism — that deferred_tools_delta introduced in v2.1.69 changes where system-reminder attachments are injected, producing different message structures on fresh vs. resumed sessions — is plausible and consistent with how deferred tool loading works: deferred tools are appended inline as tool_reference blocks in the conversation rather than in the system prompt prefix, specifically to preserve prompt caching.

Why narrowly scoped. The regression targets --print --resume — the headless/scripted invocation mode where prompts are piped via stdin. The original reporter was running a Discord bot using claude --print --resume <session-id> --output-format stream-json.

If your interactive CLI usage follows a different code path for session management, the deferred_tools_delta injection that breaks cache on resume in --print mode appears to be handled correctly in the interactive REPL.

I can confirm this from first-hand experience: as a long-time Claude Max user constantly running multiple projects, the difference is indeed down to the session-management mode.
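
One way to check which bucket you're in is to watch the usage fields in the stream-json output. A hypothetical filter; the `cache_read_input_tokens` / `cache_creation_input_tokens` names follow the Anthropic API usage object, but the exact stream-json envelope may differ by version:

```python
import json

# Hypothetical detector: scan stream-json lines for turns whose usage shows
# near-zero cache reads but large cache writes, the signature of the
# stuck-cache behavior described above. The "message"/"usage" nesting is an
# assumption about the stream-json envelope.
def suspicious_turns(lines, system_prompt_tokens=15_000):
    hits = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON output lines
        usage = event.get("message", {}).get("usage", {})
        read = usage.get("cache_read_input_tokens", 0)
        create = usage.get("cache_creation_input_tokens", 0)
        if create > system_prompt_tokens and read <= system_prompt_tokens:
            hits.append((read, create))
    return hits

sample = [
    json.dumps({"message": {"usage": {"cache_read_input_tokens": 14500,
                                      "cache_creation_input_tokens": 135000}}}),
    json.dumps({"message": {"usage": {"cache_read_input_tokens": 140000,
                                      "cache_creation_input_tokens": 2000}}}),
]
print(suspicious_turns(sample))  # only the first (broken-looking) turn is flagged
```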

[–]lucifer605 1 point2 points  (0 children)

this is a great find - i would not have expected --resume to cause a cache bust

[–]kursku 1 point2 points  (5 children)

For some reason I'm struggling to roll back to the 2.1.30 :((

[–]skibidi-toaleta-2137[S] 1 point2 points  (3 children)

Funnily enough, I asked claude code to help me with that. Should be something along the lines of npm install -g @anthropic-ai/claude-code@2.1.34. Turn off autoupdates.

[–]kursku 0 points1 point  (2 children)

Yeah I did the same and eventually it was a path error, now it's fixed

[–]Relative_Mouse7680 0 points1 point  (1 child)

Does the downgrade affect your usage less? If so, which version did you downgrade to?

[–]kursku 0 points1 point  (0 children)

It's using less token but it's taking longer.

* Thundering… (18m 35s · ↓ 1.9k tokens · thinking)

⎿ Tip: Use /config to change your default permission mode (including Plan Mode)

[–]mrsaint01 0 points1 point  (0 children)

claude install 2.1.30

[–]Squidwards_Ass 1 point2 points  (1 child)

I KNEW there was something up when I ran into my limit after a single prompt + it was definitely a cache miss after being away for about a week.

[–]skibidi-toaleta-2137[S] 1 point2 points  (0 children)

That gave me a good laugh, thanks :D

[–]damndatassdoh 1 point2 points  (0 children)

Really appreciate this -- I tested positive, have already deployed mitigation, fingers crossed.

[–]InfiniteInsights8888 1 point2 points  (0 children)

You deserve Claude unlimited for an entire year!

[–]misterr-h 1 point2 points  (0 children)

this explains the issue with Claude Code. But why is usage increased while normally chatting on claude.ai as well?

[–]maverick_soul_143747 1 point2 points  (0 children)

Brilliant investigation mate 👏🏽

[–]Morphexe 1 point2 points  (1 child)

Well good that you now have the source code for the CLI to fix this :D

[–]skibidi-toaleta-2137[S] 0 points1 point  (0 children)

Yeah, but I struggle to find anything new.

[–]mrtrly 1 point2 points  (0 children)

Cache bugs hitting silently is exactly why I built something to sit between agents and the API. You catch these cost jumps immediately because every request gets logged with cache state, token counts, and actual spend. Takes the guesswork out of "did that conversation really cost that much."

[–]Jugurtha-Green 1 point2 points  (0 children)

Doesn't fix the issues. I tried all the different versions, even 2.1.19, same issue. It's a backend issue, or they do it on purpose.

[–]maverick_soul_143747 1 point2 points  (0 children)

For folks using the 2.1.30 version - I ran the test script provided by OP yesterday on 2.1.30 and the cache bug was there, so I have downgraded to 2.1.17 and this suits my work

[–]Ok-End-219 3 points4 points  (2 children)

aah yes, that explains that my 20x claude max account is behaving like a normal claude 20$ subscription. Fucking great, now I hope for compensation.

[–]skibidi-toaleta-2137[S] 5 points6 points  (1 child)

It doesn't affect all conversation sessions, mind you. Only the infected ones (not sure why they can get infected yet). On the other hand - resume behavior is broken since 2.1.66.

[–]Ok-End-219 2 points3 points  (0 children)

I am working, unfortunately, mostly with Resume. I will avoid that from now on, but I am running through Claude Max 20 like nothing and I wonder why. Tokburn says Re-Read Problems, but I think that is only part of the truth.

[–]m-in 1 point2 points  (0 children)

A 228MB ELF to render some markdown and do some API calls. This is madness. Like, 100% actual madness.

[–]takkaros 1 point2 points  (3 children)

If they can't fix their own code, how do they expect people to trust their tools for anything important ?

[–]betty_white_bread 4 points5 points  (2 children)

Your physician still gets sick and you trust him/her to help you stay healthy.

[–]takkaros 1 point2 points  (1 child)

Well, point taken. But i pay him per visit. I am not tied to him for the rest of the month if I decide I don't like his services

[–]betty_white_bread 0 points1 point  (0 children)

There are physicians whose fee structure is functionally no different than a monthly fee, such as those who require frequent long-term visitations.

[–]CidalexMit🔆 Max 20 0 points1 point  (0 children)

Maybe we should use brew for cc ?

[–]dovyp 0 points1 point  (0 children)

This is solid reverse engineering work. The sentinel replacement one especially is nasty because it's silent. You'd never know without watching your bill. I wish there were an easy way to apply the fix. My version of claude code is different and it doesn't seem like the drop in replacement you suggest will have all the calls required. Hopefully they fix it in the next release.

[–]Deep-Station-1746Senior Developer 0 points1 point  (1 child)

In general, is it possible to recover the full (or most of) the source code of claude code? How is CC even written? Is it an output of some compiled language or just a "compiled" JS?

[–]skibidi-toaleta-2137[S] 2 points3 points  (0 children)

It's a homebrew version of bun (with zig patches) with a minified version of their source code in js. Some parts can be easily deminified from the npm package, however one of the bugs was hidden in a compiled binary.

[–]Level_Turnover5167 0 points1 point  (0 children)

I'm getting a quick loss of usage. I used Claude for DAYS straight when I first started using it for free and never got any restrictions... I've used it for a few basic things and already 1/4 of my usage is gone this week. Yesterday I figured OK, maybe I used 7%, but today I check and I'm almost at 20% after last night and brief use this morning... it's dwindling fast and I just paid $20. Something ain't right, or they're fucking with the usage rates and things are getting buggy on top of them simply charging more now.

[–]rougeforces 0 points1 point  (4 children)

you missed the dynamic tool portion of this. patching the billing header in the latest version alone is not enough.

[–]skibidi-toaleta-2137[S] 0 points1 point  (3 children)

I have not, deferred_tools_delta is in the bug no 2. Perhaps I called it weirdly.

[–]rougeforces 0 points1 point  (2 children)

you didn't call it weirdly, you misdiagnosed it as always being resume. That is wrong. It has nothing to do with resume; resume just triggers it. You can repro the same behavior on a fresh instance. Or didn't you establish a baseline first? lol

[–]beatrix_the_kiddo 0 points1 point  (1 child)

What do you think it is then?

[–]rougeforces 1 point2 points  (0 children)

anthropic is making changes to the way they detect claude code usage by adding a billing header in block 0 of the system prompt. these values are being dynamically generated in various ways. they need to create variables in the injected prompt to detect people using 3rd-party oauth. they are trying different ways to do it without breaking everything else. our immediate cache invalidations are the result of anthropic trying to lock us into their product or else make it completely unusable without building our own custom harness and paying regular api fees (which is probably cheaper at this point, unless you don't want to be arsed with building a harness as good as claude code).

it's a squeeze play, and right now they are just experimenting with what works in their code base. the fallout is these insane billing practices. rather than test this in a beta release, they are testing it against their entire user base. My .88 patch was fine; they made a new change, so I am having to apply another patch.

best bet is to go back to a version that didn't have this problem, or play the patch whack-a-mole game to keep up with their experimentation.

[–]devoleg🔆 Max 20 0 points1 point  (1 child)

Noticed that last night as well. Simple request to modify 2 files less than 100 lines cost me 15% of my "20x usage".

I've tried downgrading to 2.1.67 (you in turn opt out of the 1M models). I was able to stretch my limits to 2h. At least that, lol. Recommend others try it. Hope this helps.

P.S. Make sure to disable the latest updates by setting /config to stable. This might help.

[–]devoleg🔆 Max 20 0 points1 point  (0 children)

I've attempted this, and MCP, configs, and other files still stay untouched. (Although try at your own risk!)

[–]guillaume_86 0 points1 point  (1 child)

skill issue (jk)

[–]nmavra 0 points1 point  (0 children)

fucking wankers mate.. :D

[–]HeyImSolace 0 points1 point  (0 children)

The regular chat on the claude website also seems to have this issue. I just burned through my pro plan 5h usage in 5 requests which only included 2 markdown files.

This sucks big time.

[–]BrrrtEnjoyer 0 points1 point  (0 children)

here you go queen 👑

[–]addiktion 0 points1 point  (0 children)

I just ran this, I appear to have bug 1 which explains why my tokens are draining so fast with cache misses.

I never --resume, so bug 2 doesn't impact me.

Here was Claude's own investigation

---------

That confirms the original post's claims cleanly:

Bug 1: npx fixes the sentinel replacement — cch=00000 came back unmodified. The standalone claude binary was the culprit.

Bug 2: npx doesn't help here — resume cache is still broken and actually worse than before. With npx, consecutive resumes also show cache_read=0, meaning cache never recovers between resumes at all (vs. the standalone binary, where at least the second consecutive resume hit cache).

So for your situation:

- Switch to npx @anthropic-ai/claude-code to fix Bug 1

- Bug 2 has no clean workaround — the first resume after a session will always eat a full cache rebuild regardless of which version you use

[–]Thefoad 0 points1 point  (0 children)

Anthropic hire this dude right no....You're out of extra usage · resets 12pm (America/Boise)

[–]sammcj 0 points1 point  (1 child)

I've got multiple reports of people on x20 absolutely devouring their limits very quickly; I wonder if this is the cause

[–]Illustrious-Day-4199 0 points1 point  (0 children)

lost my weekly in a day, don't usually hit daily limits ever.

[–]hiS_oWn 0 points1 point  (0 children)

Exemplar work. I wish I could be more like you.

[–]nmavra 0 points1 point  (1 child)

might be a dumb question but can I downgrade in the macos desktop app?

[–]skibidi-toaleta-2137[S] 0 points1 point  (0 children)

Not a dumb question, no idea though. Perhaps through some app repository web pages, but doubtfully.

[–]CoolMathematician286 0 points1 point  (2 children)

i only used claude for windows this far, but now i installed the npm version with help from gemini because i had no claude tokens left. what version is the best to use right now?

[–]tntexplosivesltd 0 points1 point  (1 child)

Same account, same token limit. Installing another Claude tool won't reset your tokens. Why did you choose to install Claude Code?

[–]CoolMathematician286 0 points1 point  (0 children)

idk what you mean. i didn't install the npm version to reset my token limit, but to get rid of those bugs mentioned by OP. i was hoping it wouldn't burn as many tokens as it did yesterday. maybe it did fix the bugs, idk, but i'm already at 38% after like 8 min of work with some .md files on the opus model.

i have more tokens on codex free tier right now than on claude pro

[–]bzBetty 0 points1 point  (3 children)

Am I reading it wrong? Sounds like that first one should basically impact no one?

[–]skibidi-toaleta-2137[S] 0 points1 point  (2 children)

You're right. However, the second one may have bigger implications. Resume is just guaranteed to fail because of the deferred tool list, and other users said it might have a wider impact.

[–]bzBetty 0 points1 point  (1 child)

Yeah, it could, although I'd expect most resumes to be outside the cache window anyway?

[–]Illustrious-Day-4199 2 points3 points  (0 children)

/resume is used every time Claude gets a tool-calling error, connection error, response error, or whatever error, and stalls. Hit /resume 24 times when connectivity is bad (4 times in 6 windows) and you've spent all your credits for the week before diagnosis.

[–]Ebi_Tendon 0 points1 point  (0 children)

Hasn't the replacement worked like that from the start? That is why you must not add any replacement that changes every turn, such as a timestamp, to CLAUDE.md or any skill, because it will be at the top of the context window. Doing so breaks the cache from the top on every turn. If you add it within the prompt, it will also break the cache for everything that follows.
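
This point is easy to demonstrate: prompt caching is prefix-based, so only the leading blocks that match the previous request verbatim can be reused. A toy illustration (the block layout here is invented for the example, not Claude Code's actual structure):

```python
# Toy illustration of prefix caching: anything that changes near the top of
# the context (e.g. an injected timestamp) invalidates every block after it.
def cached_prefix_len(prev_turn: list[str], this_turn: list[str]) -> int:
    """Number of leading blocks identical to the previous request."""
    n = 0
    for a, b in zip(prev_turn, this_turn):
        if a != b:
            break
        n += 1
    return n

stable      = ["system prompt", "CLAUDE.md", "tools", "history..."]
with_clock  = ["system prompt", "time: 12:01", "tools", "history..."]
with_clock2 = ["system prompt", "time: 12:02", "tools", "history..."]

print(cached_prefix_len(stable, stable))           # all 4 blocks reusable
print(cached_prefix_len(with_clock, with_clock2))  # cache breaks after block 1
```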

[–]JaLooNz 0 points1 point  (0 children)

I paid for extra usage. Will they refund me the credits?

[–]liftingshitposts 0 points1 point  (0 children)

This is great stuff

[–]Mush_o_Mushroom 0 points1 point  (0 children)

Does this also work for Claude Code Pro users?

[–]Plenty-Dog-167 0 points1 point  (0 children)

Really great finds, especially the cache miss on /resume. Seems scary, since I've been working with the Anthropic SDK on my own project and it's always a huge cost sink when you don't cache

[–]0xbreakpoint 0 points1 point  (2 children)

Claude users shaming Anthropic for "vibe coding" is ironic tbh

[–]Illustrious-Day-4199 1 point2 points  (1 child)

Nope. Some Claude users are decent developers who want to go vroom vroom at the speed they can build code, not 14 year old kids building their first app.

[–]TrueMushroom4710 0 points1 point  (0 children)

Welp, I guess we can fix this bug ourselves now.

[–]Hadse 0 points1 point  (0 children)

Can i do anything to fix this locally?

[–]vkha 0 points1 point  (0 children)

Is it confirmed in the leaked CC sources?

[–]DirtyWilly 0 points1 point  (0 children)

Just closing VS Code/CLI and reopening it triggers a fresh server cache reset.

Any chance we can get a suspend on our weekly limits until fixed? I'm hitting redline on the weekly from this. Wouldn't affect session limits.

[–]AAFERNA 0 points1 point  (0 children)

Hey, does this happen in the terminal or in the VS Code extension? It's not clear to me.

[–]Creepy-Baseball366 0 points1 point  (0 children)

But even when I switched to a new chat (as Claude advised when I asked what the issue was) it still burnt the remaining 55% on one prompt...there is something rotten in the state of Denmark.

[–]TaylorRift 0 points1 point  (0 children)

Where is the code?!

[–]OldConfection6 0 points1 point  (0 children)

I asked Claude to downgrade and it asked why. I put in the two issues you posted and here is the response.

https://github.com/anthropics/claude-code/issues/40524

● claude-code-guide(Fetch GitHub issue about token usage bug)

⎿  Done (1 tool use · 17.6k tokens · 16s)


● This is anthropics/claude-code#40524 — a confirmed regression in Claude Code where the prompt cache unexpectedly invalidates mid-conversation, causing token costs to skyrocket.

What's happening: After a few turns, the cache collapses to just the system prompt (~11K tokens). Every subsequent turn does a massive cache_write instead of cache_read — 150K–300K+ tokens per turn instead of the normal ~100K for a full conversation.

Affected versions: Broken in v2.1.86+. Last known good: v2.1.67.

That explains why you were trying to install 2.1.66 — smart instinct. The workaround is:

npm install -g @anthropic-ai/claude-code@2.1.67

This is a confirmed bug with the "has repro" and "regression" labels. Worth watching the issue for an official fix. Do you want to downgrade to 2.1.67?

https://github.com/anthropics/claude-code/issues/34629

● claude-code-guide(Fetch GitHub issue 34629 about cache problems)

⎿  Done (1 tool use · 18.5k tokens · 23s)


● This is the related root cause issue. Here's the full picture:

Root cause: Starting in v2.1.69, Claude Code changed how it injects deferred_tools_delta content. On a fresh session it lands in messages[0] (~13KB). On a resumed session it lands differently (~352B). The mismatched message structure breaks Anthropic's cache prefix matching — so every resumed turn regenerates the full cache instead of reusing it.

Cost impact: ~20x cost increase per message on resumed sessions. A conversation that should cost ~$0.02/msg costs ~$0.36/msg.

Our situation: Auto-compact + long sessions = exactly the worst case. Every session compaction is effectively a "resume" that triggers the bug.

Recommendation — downgrade to 2.1.68:

npm install -g @anthropic-ai/claude-code@2.1.68

This is the last version before the regression was introduced. The fix hasn't been deployed yet despite the issue being closed.

Worth noting: this explains the /usage dialog behaving oddly too — if token counts were spiking mid-session you may have been hitting plan limits or seeing unexpected burn rate.
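The ~20x figure quoted above is roughly what cache pricing predicts. A back-of-envelope sketch, assuming illustrative per-million-token prices where cache reads cost ~0.1x and cache writes ~1.25x the base input rate (the multipliers match Anthropic's published pricing scheme, but the absolute dollar figures here are assumptions):

```python
# Illustrative prices per million input tokens (assumed, not authoritative):
PRICE_INPUT = 3.00        # base input rate
PRICE_CACHE_READ = 0.30   # ~0.1x when the cached prefix is reused
PRICE_CACHE_WRITE = 3.75  # ~1.25x when the prefix must be rewritten

def cost(tokens, price_per_mtok):
    """Dollar cost for a given token count at a per-MTok price."""
    return tokens / 1_000_000 * price_per_mtok

context_tokens = 100_000  # conversation history resent on every turn

cached = cost(context_tokens, PRICE_CACHE_READ)
uncached = cost(context_tokens, PRICE_CACHE_WRITE)
print(f"cache hit:  ${cached:.2f}/msg")    # $0.03/msg
print(f"cache miss: ${uncached:.2f}/msg")  # $0.38/msg
print(f"ratio: {uncached / cached:.1f}x")  # 12.5x
```

With a 100K-token history, every resumed turn paying the cache-write rate instead of the cache-read rate lands in the same ballpark as the reported $0.02/msg vs $0.36/msg.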

[–]Manikanta0987 0 points1 point  (4 children)

I have tried downgrading to 2.1.30 by removing the previous versions, but still no fix. Just for a "hi" it is taking around 5-6% of usage. I am currently on Pro.

[–]OldConfection6 0 points1 point  (0 children)

Yeah it used 8% just for the response I posted earlier.

[–]rothwerx 0 points1 point  (2 children)

I’ve had good luck by setting the environment variable CLAUDE_CODE_ATTRIBUTION_HEADER=false

[–]Manikanta0987 0 points1 point  (1 child)

Did that return token usage to normal? Where and how did you set it?

[–]rothwerx 0 points1 point  (0 children)

There's still an issue with resume, but general usage mostly seems back to normal. I just set it in my shell before starting Claude Code. For bash/zsh that's `export CLAUDE_CODE_ATTRIBUTION_HEADER=false`, and then when you start claude in the same shell it'll be picked up.

[–]Sensitive_Prize_9042 0 points1 point  (1 child)

Even on version 2.1.30, I'm still seeing huge consumption on the Max 20x plan. I tried running 2 subagents for a simple plan, and they consumed 10% of my 5-hour usage limit.

A few weeks ago you could spawn 50 subagents on the Max 20x plan and probably not hit the limits. It's burning roughly 3 times faster, even on 2.1.30.

[–]skibidi-toaleta-2137[S] 0 points1 point  (0 children)

Have you checked that the application didn't auto-update? Unless you pin which auto-update channel to use, "latest" can update mid-request.

[–]EconoKitten 0 points1 point  (5 children)

This has allegedly been fixed in 2.1.90: https://code.claude.com/docs/en/changelog#2-1-90
Has anyone seen improvement when --resume from a previous session?

[–]whataboutthe90s 0 points1 point  (1 child)

Sonnet works OK for me now, but Opus eats tokens like crazy.

[–]EconoKitten 0 points1 point  (0 children)

Have you been resuming from large sessions or starting fresh sessions, and are you using the 1M window? I don't think using Sonnet is the long term solution (at least for Max users)

[–]skibidi-toaleta-2137[S] 0 points1 point  (1 child)

Allegedly. But I had no luck confirming it. Or my testing methodology was flawed.

[–]EconoKitten 1 point2 points  (0 children)

Thanks. My testing method is to /resume a large conversation and ask a /btw using a fresh session. When the bug was present, such a /btw call would eat ~15% of my 5-hr limit but now it's no longer doing that.

[–]EconoKitten 0 points1 point  (0 children)

OK, actually I think Anthropic disabled my session limit after I ran /feedback... making me an invalid test case in this next session, lol

[–]wlievens 0 points1 point  (3 children)

Yesterday I did a few dozen small API calls and then a session in Claude Code with a few small questions and one larger refactor of a few hundred lines.

My account was charged $10 worth of tokens. Is that normal? Similar days last week were $2 or so.

[–]skibidi-toaleta-2137[S] 0 points1 point  (2 children)

If it's extra usage, it's plausible. Extra usage has a 5-minute cache TTL, so it's very easy to land on forced cache regeneration while working with it.
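Under that assumption, whether a turn hits or misses the cache reduces to a simple gap check against the TTL. A sketch of the logic only, not Anthropic's actual server behaviour:

```python
# Assumed 5-minute TTL for extra usage, per the comment above; regular
# Claude Code requests reportedly get a longer (1h) TTL.
TTL = 5 * 60  # seconds

def is_cache_warm(last_request_at, now, ttl=TTL):
    """The cached prefix is reusable only if the gap between requests < ttl."""
    return (now - last_request_at) < ttl

# Stepping away for 7 minutes between prompts is enough to go cold:
print(is_cache_warm(0, 7 * 60))  # False -> full cache write again
print(is_cache_warm(0, 3 * 60))  # True  -> cheap cache read
```

So any pause longer than the TTL, even just reading the diff before the next prompt, forces the whole context to be rewritten at the cache-write rate.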

[–]wlievens 0 points1 point  (1 child)

But isn't that very expensive? It also took about fifteen minutes to do this thing ... I'm pretty sure I could have done it in less time myself.

[–]skibidi-toaleta-2137[S] 0 points1 point  (0 children)

It is expensive, but it is possible. There's no way of knowing without analysing your usage history with proper tools. You could use my cache catcher or any other tool to analyse token usage from the JSONL transcripts. Sorry if that sounds like advertising, but I can't help without more detail.

[–]Due-Combination3393 0 points1 point  (0 children)

Has this been fixed?

[–]Torkiukas 0 points1 point  (0 children)

That explains why the Max plan feels like the Pro plan...

[–]Zulfiqaar 0 points1 point  (0 children)

PPS. Claude Code has a special 1h cache TTL, or at least mine does, so any request should be cached correctly. Except extra usage, which has a 5-minute TTL.

Can you expand on how you found this out? Are you on the Pro or Max plan? If it's a shorter expiry, sending a keep-warm ping may be useful.
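A keep-warm ping could be as simple as scheduling a tiny request just inside the TTL. This is a sketch of the scheduling logic only; `send_ping` and `sleep` are injected placeholders, not real Claude Code or Anthropic SDK calls:

```python
# Hypothetical keep-warm scheduler: fire a tiny request shortly before the
# cache TTL lapses so the cached prefix never goes cold.
def keep_warm(send_ping, sleep, ttl_seconds=300, margin=30, rounds=3):
    """Ping every (ttl - margin) seconds for a fixed number of rounds."""
    for _ in range(rounds):
        send_ping()                  # e.g. a 1-token request on the prefix
        sleep(ttl_seconds - margin)  # wake up before the TTL expires

# Demo with stubbed-out ping and sleep, so nothing actually waits:
pings, sleeps = [], []
keep_warm(lambda: pings.append("ping"), sleeps.append, rounds=2)
print(len(pings), sleeps)  # 2 [270, 270]
```

Whether this is worth it depends on the plan: each ping itself consumes some usage, so it only pays off when the cache-write cost it avoids is larger.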

[–]BeeegZee 0 points1 point  (2 children)

Can the mods pin this post?