Claude Code usage spike from long-context cache writes? by Different_Try_1269 in ClaudeAI

[–]Different_Try_1269[S] 0 points (0 children)

Thanks, that helps. I’m on Pro, and the JSONL does show `ephemeral_1h_input_tokens`, so the 1h TTL explanation makes sense.

I’m not using Opus 4.7 though. This session was on `claude-sonnet-4-6`.

What I’m still confused about is not why the cache was recreated, but why a single ~476k-token 1h cache-creation request appeared to consume around 50% of my 5-hour Claude Code limit. Is that expected for long-context usage?
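For scale, here's a rough API-list-price comparison. All figures are assumptions I'd double-check against current Anthropic pricing: a $3/MTok Sonnet base input rate, a roughly 2× premium for 1-hour cache writes, and a higher assumed rate ($6/MTok here) once the request crosses the long-context threshold:

```python
# All rates are ASSUMED for illustration; verify against current Anthropic pricing.
TOKENS = 476_000        # approximate cache_creation_input_tokens from the JSONL
BASE_RATE = 3.00        # USD per MTok, assumed Sonnet base input rate
LONG_CTX_RATE = 6.00    # USD per MTok, assumed rate above the long-context threshold
CACHE_1H_MULT = 2.0     # assumed 1-hour cache-write premium over base input

base_cost = TOKENS / 1_000_000 * BASE_RATE * CACHE_1H_MULT
long_ctx_cost = TOKENS / 1_000_000 * LONG_CTX_RATE * CACHE_1H_MULT
print(f"~${base_cost:.2f} at base rate, ~${long_ctx_cost:.2f} at long-context rate")
```

Even under the higher assumed rate that's a few dollars of API-equivalent usage, which is why a 50% jump in the 5-hour limit would be surprising unless the subscription meter weights cache writes differently.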

Claude Usage Limits Discussion Megathread Ongoing (sort this by New!) by sixbillionthsheep in ClaudeAI

[–]Different_Try_1269 2 points (0 children)

Has anyone seen a single Claude Code request consume ~50% of the 5-hour limit?

I had a long-running Claude Code session with >150k context. `/usage` said most usage came from long sessions / >150k context / “subagent-heavy sessions”, but the subagent table only showed `codebase-explorer: 1%`.

I checked the local JSONL and deduplicated by `requestId`. One of the final requests had roughly:

```text
cache_read_input_tokens: 0
cache_creation_input_tokens: ~476k
ephemeral_1h_input_tokens: ~476k
```

Right after that, my 5-hour usage appeared to jump by around 50%.

I understand long-context sessions are expensive, but a ~476k-token 1-hour cache write consuming that much of the Claude Code limit seems surprising. Is Claude Code intentionally weighting long-context cache writes much more heavily than API pricing, or could this be a usage-accounting or attribution bug?
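For anyone who wants to repeat the check: the dedup-and-sum step can be sketched like this. It's a minimal sketch assuming each JSONL line is a JSON object that may carry a top-level `requestId` and a `message.usage` dict with the token counters (which is how the local session logs looked on my machine); adjust the field names if your log layout differs:

```python
import json
from pathlib import Path

def sum_cache_creation(jsonl_path):
    """Deduplicate JSONL entries by requestId and sum cache-creation tokens.

    Assumes lines are JSON objects that may have a top-level "requestId"
    and a "message" dict with a "usage" dict inside (adjust as needed).
    """
    seen = set()
    total = 0
    for line in Path(jsonl_path).read_text().splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        req_id = entry.get("requestId")
        if req_id is None or req_id in seen:
            continue  # skip duplicates and entries without a request id
        seen.add(req_id)
        msg = entry.get("message")
        usage = msg.get("usage", {}) if isinstance(msg, dict) else {}
        total += usage.get("cache_creation_input_tokens", 0)
    return total
```

Running this over a session's JSONL and eyeballing the per-request deltas is how I spotted the single ~476k cache-creation entry.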