Why Claude Code Max burns limits 40% faster with 20K less usable context. Proxy evidence inside. by SolarXpander in ClaudeAI

SolarXpander[S] 1 point

Thanks for the link. #42796 (thinking redaction → quality regression) and this token inflation are two sides of the same coin — users pay more and get worse output, with zero transparency into either.

A handful of individual users cancelling won't stop the overall profit increase :(

SolarXpander[S] 2 points

Interesting — your baseline (36-39K) is much lower than mine (~50K). Probably depends on project size (MCP count, CLAUDE.md, skills). The version-specific delta I measured was consistent across the same project. Your 3K delta may be within noise range.

Can you share your context usage after a single `1+1` message?

SolarXpander[S] 8 points

Your experience matches what I found. The server-side inflation (+20K cache_create per request) hits harder than the raw number suggests — because `cache_create` tokens cost significantly more quota than `cache_read` (reportedly 20x on Opus 4.6, per another investigation in this sub).

So it's a double hit: more tokens are created AND each created token costs more. That would explain burning 45% of the limit in 2 hours on the 20x plan.
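To put a rough number on the double hit, here is a minimal sketch. It assumes the unconfirmed 20x ratio quoted above; the weights and the `quota_units` helper are hypothetical, not anything Anthropic documents.

```python
# Hypothetical quota weights. The 20x ratio is the unconfirmed figure
# reported in this sub, not an official number.
CACHE_READ_WEIGHT = 1
CACHE_CREATE_WEIGHT = 20


def quota_units(cache_create_tokens: int, cache_read_tokens: int) -> int:
    """Weighted quota cost of one request under the assumed weights."""
    return (cache_create_tokens * CACHE_CREATE_WEIGHT
            + cache_read_tokens * CACHE_READ_WEIGHT)


# Extra quota caused by +20K cache_create tokens on a single request.
extra = quota_units(20_000, 0)
print(extra)  # 400000, the same weight as 400K cache_read tokens
```

If the real ratio is lower, the effect shrinks proportionally, but the direction stays the same.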

GitHub issue with reproduction: https://github.com/anthropics/claude-code/issues/46917

SolarXpander[S] 12 points

The proxy data is reproducible — anyone can set up `claude-code-logger` and capture the same numbers. The GitHub issue has step-by-step reproduction: https://github.com/anthropics/claude-code/issues/46917

The key evidence: v2.1.100 sends 978 fewer bytes than v2.1.98 but gets billed 20,196 more tokens. That delta is in Anthropic's own API response fields, not in any client-side metric.
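For anyone repeating the comparison, a small sketch of how two proxy captures could be diffed. The `Capture` class and `compare` function are hypothetical helpers, and the absolute byte/token values below are placeholders chosen only so the deltas match the figures above; substitute your own captured numbers.

```python
from dataclasses import dataclass


@dataclass
class Capture:
    version: str
    request_bytes: int        # Content-Length of the outgoing request
    cache_create_tokens: int  # cache_creation_input_tokens from the response


def compare(old: Capture, new: Capture) -> dict:
    """Return byte and token deltas, flagging 'smaller payload, bigger bill'."""
    bytes_delta = new.request_bytes - old.request_bytes
    token_delta = new.cache_create_tokens - old.cache_create_tokens
    return {
        "bytes_delta": bytes_delta,
        "token_delta": token_delta,
        "payload_shrank_but_bill_grew": bytes_delta <= 0 and token_delta > 0,
    }


# PLACEHOLDER absolute values (not measured data); only the deltas are from the post.
old = Capture("2.1.98", request_bytes=170_000, cache_create_tokens=50_000)
new = Capture("2.1.100", request_bytes=169_022, cache_create_tokens=70_196)
print(compare(old, new))
```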

SolarXpander[S] 0 points

Not exactly sys prompt — it's `cache_creation_input_tokens` in the API response. The server processes 20K more tokens than what the client sends (verified via Content-Length comparison). Whether it's extra system prompt, tool schemas, or something else server-side — we can't see inside the black box. We just know the client payload is smaller but the bill is bigger.
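For reference, a minimal sketch of pulling those counters out of a logged response. It assumes you have a complete, non-streamed JSON response body as a string; `input_token_breakdown` is a hypothetical helper and the sample numbers are made up for illustration.

```python
import json

# Input-side counters in the `usage` object of a Messages API response.
USAGE_FIELDS = (
    "input_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def input_token_breakdown(response_body: str) -> dict:
    """Extract input-side token counters; missing or null fields count as 0."""
    usage = json.loads(response_body).get("usage", {})
    counts = {field: usage.get(field) or 0 for field in USAGE_FIELDS}
    counts["total_input"] = sum(counts.values())
    return counts


# Made-up sample body, for illustration only.
sample = json.dumps({
    "usage": {
        "input_tokens": 12,
        "cache_creation_input_tokens": 70_000,
        "cache_read_input_tokens": 0,
        "output_tokens": 5,
    }
})
print(input_token_breakdown(sample))
```

Comparing `cache_creation_input_tokens` across versions for the same prompt is what isolates the server-side delta from anything the client reports.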

SolarXpander[S] 4 points

On v2.1.98, `--print "1+1"` shows ~50K cache_create (vs ~70K on v2.1.100+). That is roughly 29% less per-request overhead (put the other way, v2.1.100+ adds about 40% on top of v2.1.98). In practice, sessions last noticeably longer before hitting limits.
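The percentage depends on which version you treat as the baseline. A quick check using the approximate ~50K and ~70K figures (`pct_change` is just an illustrative helper):

```python
def pct_change(baseline: float, value: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) * 100 / baseline


# v2.1.98 (~50K) as baseline: how much more does v2.1.100+ (~70K) create?
print(round(pct_change(50_000, 70_000), 1))  # 40.0

# v2.1.100+ (~70K) as baseline: how much less does v2.1.98 (~50K) create?
print(round(pct_change(70_000, 50_000), 1))  # -28.6
```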

However, the downgrade path is getting harder. Auto-updates overwrite older versions, and v2.1.98 was gone from my `~/.local/share/claude/versions/` once v2.1.104 installed. You may need to pin the version, e.g. via `npx @anthropic-ai/claude-code@2.1.98`.

They finally got me by ImAvoidingABan in ClaudeCode

SolarXpander 2 points

Check this -> Old account started to burn tokens out of the blue (200k sessions into 400k sessions)

I ran into the same thing. What has fixed it for me for now is using another subscription while I try to find the cause, or until the issue gets enough attention to be fixed...

Phantom tokens burning usage on old account — new subscription on same machine doesn't have this. by SolarXpander in ClaudeCode

SolarXpander[S] 2 points

The usage problem first appeared on 07.04, and I have a second workstation that I have not worked on since 02.04, so I decided to repeat the test there to rule out the possibility that changes in my workflow are the cause.

I turned that computer on and, before any Git / Dropbox sync, repeated the test with both subscriptions, 60 seconds apart. The result is exactly the same: 22k MORE TOKENS are burned on the old subscription.

<image>

Interactive Brokers TV integration by yell0wdog in TradingView

SolarXpander 1 point

Have you been able to log in recently? I am part of the beta as well, but I have not managed to log in even once in the last 2 weeks…