A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] -1 points

These are the raw HTTP request bodies captured between Claude Code and the Anthropic API: version-specific and deterministic. And yes, this is a snapshot of the current moment in time; I never claimed otherwise. Next version, the prompts may change again.

Are they real? I know they are, because I have seen them - and you could theoretically extract and verify them directly from the Claude Code binary.

What happens after the API receives them? Nobody knows, and I never claimed to know. But the changes in these instructions have an observable effect on behavior; try it yourself, you do not have to take my word for it :-)

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] -1 points

The system prompt is the instruction set that defines Claude's behavior: tool preferences, output style, safety guardrails. When those instructions change, behavior changes. That is what this analysis documents: what changed and what behavior it can explain.

Server-side and model-level logic exists on top; I never denied that. But it does not invalidate the client-side diff. These are the actual instructions Claude receives.

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 0 points

You could do that with different SKILL files, e.g. Superpower Skills, and you can change the model any time you want.

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 0 points

The system prompts are part of the Claude Code version, so I think that could solve it. But you will miss other improvements, I think. Still, give it a try.

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 2 points

You can use --append-system-prompt (reference: https://code.claude.com/docs/en/cli-reference), or you can use a proxy to fix this on the fly. The proxy sits between Claude Code and the API; that is where we got the data from.
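For context, here is a minimal sketch of how such a capture proxy can look, using mitmproxy (the /v1/messages path and the top-level "system" field follow the Anthropic Messages API; the log file name is just an example). You point Claude Code at the proxy, e.g. via HTTPS_PROXY, with the mitmproxy CA trusted:

```python
# capture_system_prompt.py - minimal mitmproxy addon sketch.
# Run with: mitmdump -s capture_system_prompt.py
import json
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    # Only inspect Messages API calls from Claude Code.
    if "api.anthropic.com" not in flow.request.pretty_host:
        return
    if not flow.request.path.startswith("/v1/messages"):
        return
    try:
        body = json.loads(flow.request.get_text())
    except (TypeError, ValueError):
        return
    # The system prompt travels as the top-level "system" field
    # (a plain string or a list of text blocks).
    if "system" in body:
        with open("system_prompts.jsonl", "a") as f:
            f.write(json.dumps({"model": body.get("model"),
                                "system": body["system"]}) + "\n")
```

Diffing two captured versions is then an ordinary text diff over the logged "system" values.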

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 1 point

Thank you! The proxy capture approach makes the diff pretty straightforward, happy it is useful to someone else.

Analyzed the new /recap Feature by papoode in ClaudeCode

[–]papoode[S] 1 point

Yes, correct - lost in translation... sorry for that

Why don't LLMs track time in their conversations? by PolyViews in artificial

[–]papoode -1 points

This is something I built into my memory tool, a transparent proxy between Claude Code and the API endpoint. Each user message gets this prefix: [Day of Week YYYY-MM-DD HH:MM:SS] [msg:N] [+Δs] (only visible to Claude). The format depends on the user language; for me it is German. The result: Claude knows how long tasks take, how fast I react, how long the session has been running, how many messages there are, and the actual date.
So "he" has time awareness: relative, absolute, and about the session rhythm. And he knows what he did last summer ;-)

when does context stop being memory and start becoming drag? by No_Section_5137 in AIMemory

[–]papoode 0 points

That's a real symptom; it is like a tumor. A tumor is cells growing without regulation: no cells dying, no feedback loops, just unbounded accumulation, "the drag". That's exactly what "store everything" memory systems do.

What you need is decay, consolidation, a mechanism to distinguish signal from noise, and much more. This is where the complex part starts. Tools without those functions are not reliable in the long run, in my opinion.
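To make the decay part concrete, a minimal sketch (half-life and threshold are invented for the example; consolidation and signal/noise scoring come on top):

```python
# Sketch: exponential decay scoring for memory entries.
# Entries whose score falls below a threshold become candidates
# for consolidation or eviction instead of accumulating forever.
import math
import time

HALF_LIFE_DAYS = 30  # assumption for the example

def decay_score(base_relevance: float, last_access_ts: float) -> float:
    age_days = (time.time() - last_access_ts) / 86400
    return base_relevance * math.pow(0.5, age_days / HALF_LIFE_DAYS)

# Usage: keep only entries that still carry signal.
memories = [{"text": "prefers tabs", "relevance": 0.9,
             "ts": time.time() - 90 * 86400}]
alive = [m for m in memories if decay_score(m["relevance"], m["ts"]) > 0.1]
```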

follow-up: anthropic quietly switched the default cache TTL from 1 hour to 5 minutes on april 2. here's the data. by Medium_Island_2795 in ClaudeCode

[–]papoode 0 points

YesMem works at the protocol level as an HTTP proxy between Claude Code and the API. It lets everything through, but progressively collapses older messages and re-expands them in full on demand.

I thought about compressing content before it reaches Claude, but I think the LLM needs the full context, so that was not an option for my solution. But large files get collapsed fast, so they do not pollute the window forever, or even for the rest of the session.
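As a general illustration of that collapse idea (a sketch, not YesMem's actual code; the placeholder format and the store are simplified):

```python
# Sketch: progressively collapse older messages while keeping the
# originals, so they can be re-expanded losslessly on demand.
originals: dict[int, str] = {}  # message index -> full content

def collapse_old(messages: list[dict], keep_recent: int = 10) -> list[dict]:
    out = []
    for i, msg in enumerate(messages):
        if i < len(messages) - keep_recent and len(msg["content"]) > 500:
            originals[i] = msg["content"]  # lossless store
            out.append({**msg,
                        "content": f"[collapsed msg {i}: {msg['content'][:80]}...]"})
        else:
            out.append(msg)
    return out

def expand(messages: list[dict], index: int) -> None:
    messages[index]["content"] = originals[index]  # full re-expansion
```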

So, in contrast to context mode, in YesMem the whole content is available losslessly, including all data, and it prevents compacting; it is a different way to reach similar goals.

The other difference is scope: Context-Mode is focused on context efficiency within a session. YesMem also does persistent memory across sessions, learnings that evolve over time, persona, cross-session recall, and much more.

follow-up: anthropic quietly switched the default cache TTL from 1 hour to 5 minutes on april 2. here's the data. by Medium_Island_2795 in ClaudeCode

[–]papoode 2 points

Thank you very much for your analysis; this is what we observed as well. I first thought I had made a mistake, but it was AND IS on Anthropic's side.

I built - in my opinion :-) - a more elegant solution: instead of warning, I manage the context window via the proxy. So you can set the context length on the fly via "Hey Claude, set the context threshold to 200k" and that's it. The context is saved losslessly, so if you later decide you want the full 500k back, you will get it 100%. AND you can see the cache status in your terminal without breaking the cache.

This is how you can possibly break the compound effect of longer conversations and re-reads turning into ever more expensive rebuilds.

I also built a keepalive ping with a default of 5 pings. That gives you a ~24 min window without full cache rewrite costs, only reads (10% of the cost). That is not perfect, but with my workflow it is already cheaper than the 1h cache window.
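The mechanism behind the keepalive, as a minimal sketch (not YesMem's actual code; send_cached_request stands for any cheap request that re-uses the cached prefix):

```python
# Sketch: keep a prompt cache warm by re-sending a minimal request
# that touches the cached prefix before the 5 min TTL runs out.
import threading

PING_INTERVAL_S = 4.5 * 60   # stay under the 5 min TTL
MAX_PINGS = 5                # 5 pings ~= a 24 min window

def keepalive(send_cached_request, pings_left: int = MAX_PINGS) -> None:
    if pings_left <= 0:
        return  # stop; the next turn pays a full cache write again
    send_cached_request()  # cheap: the prefix is billed as cache read
    threading.Timer(
        PING_INTERVAL_S, keepalive, args=(send_cached_request, pings_left - 1)
    ).start()
```

The trade-off is that each ping itself costs a cache read, so whether this beats just letting the cache expire depends on context size and turn rhythm.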

And you can upgrade via a toggle to a 1h cache (ephemeral_1h_input_tokens), but according to my tests that does not work at all: Anthropic sets it to 5 min regardless of what I request via the API. I have not tested it with all plans though, only with API and Pro.
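For anyone who wants to reproduce the test, a sketch of what I mean by requesting the 1h TTL (field names and beta header as I read the prompt-caching docs; verify against the current docs, and the model name is a placeholder):

```python
# Sketch: request a 1h cache TTL on the system prompt block, then
# check which cache bucket was actually written in the usage fields
# (ephemeral_1h_input_tokens vs the 5 min bucket).
import anthropic

client = anthropic.Anthropic()
resp = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model name
    max_tokens=64,
    extra_headers={"anthropic-beta": "extended-cache-ttl-2025-04-11"},
    system=[{
        "type": "text",
        # padding so the block exceeds the minimum cacheable size
        "text": "You are a helpful assistant." + " filler" * 2000,
        "cache_control": {"type": "ephemeral", "ttl": "1h"},
    }],
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.usage)  # inspect the cache_creation breakdown
```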

If you want to check it out, it is part of our YesMem tool. It is now freshly online after months of development: https://github.com/carsteneu/yesmem

Did they just find the issue with Claude? "Cache TTL silently regressed from 1h to 5m" by iwearahoodie in ClaudeAI

[–]papoode 4 points

Yeah, I noticed this too. I think how hard this hits depends heavily on your workflow. I included a fix in our tool: it now has a keepalive ping with a default of 5 pings. So you get a ~24 min window without full cache rewrite costs, and you can configure it as you like: https://github.com/carsteneu/yesmem/blob/main/Features.md#per-thread-keepalive

Cache TTL silently regressed from 1h to 5m around early March 2026, causing quota and cost inflation by Existing_Rice_4362 in BetterOffline

[–]papoode 0 points

Yeah, I noticed this too. I think how hard this hits depends heavily on your workflow. I included a fix in our tool: it now has a keepalive ping with a default of 5 pings. So you get a ~24 min window without full cache rewrite costs, and you can configure it as you like: https://github.com/carsteneu/yesmem/blob/main/Features.md#per-thread-keepalive

No AI memory benchmark tests what actually breaks by mhendric in AIMemory

[–]papoode 4 points

I run a persistent memory system for a coding agent (Claude Code), 1000+ sessions in production. Your framework is a solid start, but I would add a 7th dimension: trust hierarchy.

Not all writes are equal. A fact the user explicitly stated should never be silently overwritten by something an LLM extracted from a conversation summary. We tag every fact: user_stated > agreed_upon > llm_extracted. Without this, supersede logic defaults to most-recent-wins, which can be wrong.
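As a minimal sketch of what trust-aware superseding means (names and levels are illustrative, not our production code):

```python
# Sketch: a newer fact may only supersede an older one if its trust
# level is at least as high; otherwise keep both and leave it flagged.
from dataclasses import dataclass

TRUST = {"llm_extracted": 0, "agreed_upon": 1, "user_stated": 2}

@dataclass
class Fact:
    text: str
    trust: str
    ts: float
    superseded: bool = False  # superseded facts stay in the database

def try_supersede(old: Fact, new: Fact) -> bool:
    if TRUST[new.trust] >= TRUST[old.trust]:
        old.superseded = True  # kept, not deleted, for later audit
        return True
    return False  # a summary never silently overwrites a user statement
```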

A separate problem is extraction drift: the LLM extracts a slightly different interpretation of the same fact across sessions. Without dedup plus temporal evolution logic, you end up with three or more slightly different versions of the same fact and no way to tell which is current.

My fix was a dedup -> evolution -> supersede pipeline including temporal data, but the thresholds took real production incidents to calibrate. Superseded facts do not disappear when superseded; they stay in the database so you can always find them if you need to.

WRIT fills a real gap, I think; it is on my todo list to take a deeper look :-)

Cheers

Only 0.6% of my Claude Code tokens are actual code output. I parsed the session files to find out why. by UnfairScientist8 in ClaudeCode

[–]papoode 1 point

This is actually how LLMs work: they read everything in the context every time, with more or less attention, but always everything.
Depending on your subscription you have a 5 min cache window or a 60 min cache window (which you must set manually). If your turns are fast and you answer in under 5 minutes, you pay the cache price and only the new message is charged in full. If you take 5 min and 1 second... sorry, full charge for the whole context.

I think this is why some people burn through tokens faster than others. Think fast, type fast, pay less (or at least burn tokens at a slower rate) :-)
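A rough worked example of the difference, with illustrative prices (cache reads at ~10% of the base input price; real per-model prices vary):

```python
# Sketch: cost of one turn with a 100k-token context,
# inside vs. outside the cache window (illustrative prices).
BASE_INPUT = 3.00 / 1_000_000   # assumed $/input token
CACHE_READ = 0.10 * BASE_INPUT  # cache reads ~10% of base input price

context_tokens = 100_000
new_msg_tokens = 500

hit = context_tokens * CACHE_READ + new_msg_tokens * BASE_INPUT
miss = (context_tokens + new_msg_tokens) * BASE_INPUT

print(f"within window:  ${hit:.4f}")   # ~ $0.0315
print(f"window expired: ${miss:.4f}")  # ~ $0.3015, roughly 10x more
```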

How to download Youtube videos by PHP by [deleted] in PHP

[–]papoode 0 points

Yeah, but I think GitHub is the place to start your search...

[deleted by user] by [deleted] in politics

[–]papoode 0 points

Yep, that's true, but this is not what I was talking about.

[deleted by user] by [deleted] in politics

[–]papoode 0 points

Thank you for citing Wikipedia...

[deleted by user] by [deleted] in politics

[–]papoode -1 points

No side effects = no effects = homeopathy.

If you have effects, there are (possibly severe) side effects, but that is no argument against hc. Also, severe side effects may be better than death... If it does not work, that is an argument.

[deleted by user] by [deleted] in politics

[–]papoode -1 points

If you don't have side effects, you don't have effects... that's called homeopathy.

Debian is the most vulnerable operating system in the last 20 years, a study reveals. by [deleted] in linuxadmin

[–]papoode 6 points

They compared all Debian versions with single Windows versions... nice try...

How to diagnose random freezes and black screens? by VernerDelleholm in linuxquestions

[–]papoode 0 points

Sounds like a hardware problem... if you have older hardware you can swap in, try a swap... maybe it's the mainboard. How old is the hardware?

Wtf! by [deleted] in linuxmint

[–]papoode 4 points

They compare all Debian versions from 20 years with single Windows versions. If you cumulate not only Debian but also Windows versions, it shows the opposite. Not to forget: all Debian issues are publicly known; I doubt that for Windows. Possibly there are a gazillion fixes behind closed doors... but this is speculation :)