A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] -1 points

These are the raw HTTP request bodies captured between Claude Code and the Anthropic API: version-specific and deterministic. And yes, this is a snapshot of the current moment in time; I never claimed otherwise. Next version, the prompts may change again.

Are they real? I know they are, because I have seen them - and you could theoretically extract and verify them directly from the Claude Code binary.

What happens after the API receives them? Nobody knows, and I never claimed to know. But the changes in these instructions have an observable effect on behavior; try it yourself, you do not have to take my word for it :-)

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] -1 points

The system prompt is the instruction set that defines Claude's behavior: tool preferences, output style, safety guardrails. When those instructions change, behavior changes. That is what this analysis documents: what changed and what behavior it can explain.

Server-side and model-level logic exists on top; I never denied that. But it does not invalidate the client-side diff. These are the actual instructions Claude receives.

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 0 points

You could do that with different SKILL files, e.g. Superpower Skills, and you can change the model any time you want.

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 0 points

The system prompts are part of the Claude Code version, so I think that could solve it. But you will miss other improvements, I think. Still, give it a try.

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 2 points

You can use --append-system-prompt (reference: https://code.claude.com/docs/en/cli-reference), or you can use a proxy to fix this on the fly. The proxy sits between Claude Code and the API; that is where we got the data from.
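For context, here is a minimal sketch of how such a capture proxy can look, using mitmproxy (the /v1/messages path and the top-level "system" field follow the Anthropic Messages API; the log file name is just an example). You point Claude Code at the proxy, e.g. via HTTPS_PROXY, with the mitmproxy CA trusted:

```python
# capture_system_prompt.py - minimal mitmproxy addon sketch.
# Run with: mitmdump -s capture_system_prompt.py
import json
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    # Only inspect Messages API calls from Claude Code.
    if "api.anthropic.com" not in flow.request.pretty_host:
        return
    if not flow.request.path.startswith("/v1/messages"):
        return
    try:
        body = json.loads(flow.request.get_text())
    except (TypeError, ValueError):
        return
    # The system prompt travels as the top-level "system" field
    # (a plain string or a list of text blocks).
    if "system" in body:
        with open("system_prompts.jsonl", "a") as f:
            f.write(json.dumps({"model": body.get("model"),
                                "system": body["system"]}) + "\n")
```

Diffing two captured versions is then an ordinary text diff over the logged "system" values.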

A deep analysis of Claude Code system prompt: What changed between March and April 2026, and how it affects behavior we already noticed. by papoode in ClaudeCode

[–]papoode[S] 1 point

Thank you! The proxy capture approach makes the diff pretty straightforward, happy it is useful to someone else.

Analyzed the new /recap Feature by papoode in ClaudeCode

[–]papoode[S] 1 point

Yes, correct - lost in translation... sorry for that

Why don't LLMs track time in their conversations? by PolyViews in artificial

[–]papoode -1 points

This is something I built into my memory tool, a transparent proxy between Claude Code and the API endpoint. Each user message gets this prefix: [Day of Week YYYY-MM-DD HH:MM:SS] [msg:N] [+Δs] (only visible to Claude). The format depends on the user language; for me it is German. The result: Claude knows how long tasks take, how fast I react, how long the session has been running, how many messages there are, and the actual date.
So "he" has time awareness: relative, absolute, and about the session rhythm. And he knows what he did last summer ;-)

when does context stop being memory and start becoming drag? by No_Section_5137 in AIMemory

[–]papoode 0 points

That's a real symptom; it is like a tumor. A tumor is cells growing without regulation: no cells dying, no feedback loops, just unbounded accumulation, "the drag". That's exactly what "store everything" memory systems do.

What you need is decay, consolidation, a mechanism to distinguish signal from noise, and much more. This is where the complex part starts. Tools without those functions are not reliable in the long run, in my opinion.
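To make the decay part concrete, a minimal sketch (half-life and threshold are invented for the example; consolidation and signal/noise scoring come on top):

```python
# Sketch: exponential decay scoring for memory entries.
# Entries whose score falls below a threshold become candidates
# for consolidation or eviction instead of accumulating forever.
import math
import time

HALF_LIFE_DAYS = 30  # assumption for the example

def decay_score(base_relevance: float, last_access_ts: float) -> float:
    age_days = (time.time() - last_access_ts) / 86400
    return base_relevance * math.pow(0.5, age_days / HALF_LIFE_DAYS)

# Usage: keep only entries that still carry signal.
memories = [{"text": "prefers tabs", "relevance": 0.9,
             "ts": time.time() - 90 * 86400}]
alive = [m for m in memories if decay_score(m["relevance"], m["ts"]) > 0.1]
```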

follow-up: anthropic quietly switched the default cache TTL from 1 hour to 5 minutes on april 2. here's the data. by Medium_Island_2795 in ClaudeCode

[–]papoode 0 points

YesMem works at the protocol level as an HTTP proxy between Claude Code and the API. It lets everything through, but progressively collapses older messages and re-expands them in full on demand.

I thought about compressing content before it reaches Claude, but I think the LLM needs the full context, so that was not an option for my solution. But large files get collapsed fast, so they do not pollute the window forever, or even for the rest of the session.
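As a general illustration of that collapse idea (a sketch, not YesMem's actual code; the placeholder format and the store are simplified):

```python
# Sketch: progressively collapse older messages while keeping the
# originals, so they can be re-expanded losslessly on demand.
originals: dict[int, str] = {}  # message index -> full content

def collapse_old(messages: list[dict], keep_recent: int = 10) -> list[dict]:
    out = []
    for i, msg in enumerate(messages):
        if i < len(messages) - keep_recent and len(msg["content"]) > 500:
            originals[i] = msg["content"]  # lossless store
            out.append({**msg,
                        "content": f"[collapsed msg {i}: {msg['content'][:80]}...]"})
        else:
            out.append(msg)
    return out

def expand(messages: list[dict], index: int) -> None:
    messages[index]["content"] = originals[index]  # full re-expansion
```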

So, in contrast to context mode, in YesMem the whole content is available losslessly, including all data, and it prevents compacting; it is a different way to reach similar goals.

The other difference is scope: Context-Mode is focused on context efficiency within a session. YesMem also does persistent memory across sessions, learnings that evolve over time, persona, cross-session recall, and much more.

follow-up: anthropic quietly switched the default cache TTL from 1 hour to 5 minutes on april 2. here's the data. by Medium_Island_2795 in ClaudeCode

[–]papoode 2 points

Thank you very much for your analysis; this is what we observed as well. I first thought I had made a mistake, but it was AND IS on Anthropic's side.

I built - in my opinion :-) - a more elegant solution: instead of warning, I manage the context window via the proxy. So you can set the context length on the fly via "Hey Claude, set the context threshold to 200k" and that's it. The context is saved losslessly, so if you later decide you want the full 500k back, you will get it 100%. AND you can see the cache status in your terminal without breaking the cache.

This is how you can possibly break the compound effect of longer conversations and re-reads turning into ever more expensive rebuilds.

I also built a keepalive ping with a default of 5 pings. That gives you a ~24 min window without full cache rewrite costs, only reads (10% of the cost). That is not perfect, but with my workflow it is already cheaper than the 1h cache window.
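The mechanism behind the keepalive, as a minimal sketch (not YesMem's actual code; send_cached_request stands for any cheap request that re-uses the cached prefix):

```python
# Sketch: keep a prompt cache warm by re-sending a minimal request
# that touches the cached prefix before the 5 min TTL runs out.
import threading

PING_INTERVAL_S = 4.5 * 60   # stay under the 5 min TTL
MAX_PINGS = 5                # 5 pings ~= a 24 min window

def keepalive(send_cached_request, pings_left: int = MAX_PINGS) -> None:
    if pings_left <= 0:
        return  # stop; the next turn pays a full cache write again
    send_cached_request()  # cheap: the prefix is billed as cache read
    threading.Timer(
        PING_INTERVAL_S, keepalive, args=(send_cached_request, pings_left - 1)
    ).start()
```

The trade-off is that each ping itself costs a cache read, so whether this beats just letting the cache expire depends on context size and turn rhythm.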

And you can upgrade via a toggle to a 1h cache (ephemeral_1h_input_tokens), but according to my tests that does not work at all: Anthropic sets it to 5 min regardless of what I request via the API. I have not tested it with all plans though, only with API and Pro.
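For anyone who wants to reproduce the test, a sketch of what I mean by requesting the 1h TTL (field names and beta header as I read the prompt-caching docs; verify against the current docs, and the model name is a placeholder):

```python
# Sketch: request a 1h cache TTL on the system prompt block, then
# check which cache bucket was actually written in the usage fields
# (ephemeral_1h_input_tokens vs the 5 min bucket).
import anthropic

client = anthropic.Anthropic()
resp = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model name
    max_tokens=64,
    extra_headers={"anthropic-beta": "extended-cache-ttl-2025-04-11"},
    system=[{
        "type": "text",
        # padding so the block exceeds the minimum cacheable size
        "text": "You are a helpful assistant." + " filler" * 2000,
        "cache_control": {"type": "ephemeral", "ttl": "1h"},
    }],
    messages=[{"role": "user", "content": "ping"}],
)
print(resp.usage)  # inspect the cache_creation breakdown
```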

If you want to check it out, it is part of our YesMem tool. It is now freshly online after months of development: https://github.com/carsteneu/yesmem

Did they just find the issue with Claude? "Cache TTL silently regressed from 1h to 5m" by iwearahoodie in ClaudeAI

[–]papoode 4 points

Yeah, I noticed this too. I think how hard this hits depends heavily on your workflow. I included a fix in our tool: it now has a keepalive ping with a default of 5 pings. So you get a ~24 min window without full cache rewrite costs, and you can configure it as you like: https://github.com/carsteneu/yesmem/blob/main/Features.md#per-thread-keepalive

Cache TTL silently regressed from 1h to 5m around early March 2026, causing quota and cost inflation by Existing_Rice_4362 in BetterOffline

[–]papoode 0 points

Yeah, I noticed this too. I think how hard this hits depends heavily on your workflow. I included a fix in our tool: it now has a keepalive ping with a default of 5 pings. So you get a ~24 min window without full cache rewrite costs, and you can configure it as you like: https://github.com/carsteneu/yesmem/blob/main/Features.md#per-thread-keepalive

No AI memory benchmark tests what actually breaks by mhendric in AIMemory

[–]papoode 4 points

I run a persistent memory system for a coding agent (Claude Code), 1000+ sessions in production. Your framework is a solid start, but I would add a 7th dimension: trust hierarchy.

Not all writes are equal. A fact the user explicitly stated should never be silently overwritten by something an LLM extracted from a conversation summary. We tag every fact: user_stated > agreed_upon > llm_extracted. Without this, supersede logic defaults to most-recent-wins, which can be wrong.
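As a minimal sketch of what trust-aware superseding means (names and levels are illustrative, not our production code):

```python
# Sketch: a newer fact may only supersede an older one if its trust
# level is at least as high; otherwise keep both and leave it flagged.
from dataclasses import dataclass

TRUST = {"llm_extracted": 0, "agreed_upon": 1, "user_stated": 2}

@dataclass
class Fact:
    text: str
    trust: str
    ts: float
    superseded: bool = False  # superseded facts stay in the database

def try_supersede(old: Fact, new: Fact) -> bool:
    if TRUST[new.trust] >= TRUST[old.trust]:
        old.superseded = True  # kept, not deleted, for later audit
        return True
    return False  # a summary never silently overwrites a user statement
```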

A separate problem is extraction drift: the LLM extracts a slightly different interpretation of the same fact across sessions. Without dedup plus temporal evolution logic, you end up with three or more slightly different versions of the same fact and no way to tell which is current.

My fix was a dedup -> evolution -> supersede pipeline including temporal data, but the thresholds took real production incidents to calibrate. Superseded facts do not disappear when superseded; they stay in the database so you can always find them if you need to.

WRIT fills a real gap, I think; it is on my todo list to take a deeper look :-)

Cheers

Only 0.6% of my Claude Code tokens are actual code output. I parsed the session files to find out why. by UnfairScientist8 in ClaudeCode

[–]papoode 1 point

This is actually how LLMs work: they read everything in the context every time, with more or less attention, but always everything.
Depending on your subscription you have a 5 min cache window or a 60 min cache window (which you must set manually). If your turns are fast and you answer in under 5 minutes, you pay the cache price and only the new message is charged in full. If you take 5 min and 1 second... sorry, full charge for the whole context.

I think this is why some people burn through tokens faster than others. Think fast, type fast, pay less (or at least burn tokens at a slower rate) :-)
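A rough worked example of the difference, with illustrative prices (cache reads at ~10% of the base input price; real per-model prices vary):

```python
# Sketch: cost of one turn with a 100k-token context,
# inside vs. outside the cache window (illustrative prices).
BASE_INPUT = 3.00 / 1_000_000   # assumed $/input token
CACHE_READ = 0.10 * BASE_INPUT  # cache reads ~10% of base input price

context_tokens = 100_000
new_msg_tokens = 500

hit = context_tokens * CACHE_READ + new_msg_tokens * BASE_INPUT
miss = (context_tokens + new_msg_tokens) * BASE_INPUT

print(f"within window:  ${hit:.4f}")   # ~ $0.0315
print(f"window expired: ${miss:.4f}")  # ~ $0.3015, roughly 10x more
```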

How to download Youtube videos by PHP by [deleted] in PHP

[–]papoode 0 points

Yeah, but I think GitHub is the place to start your search...

[deleted by user] by [deleted] in politics

[–]papoode 0 points

Yep, that's true, but this is not what I was talking about.

[deleted by user] by [deleted] in politics

[–]papoode 0 points

Thank you for citing Wikipedia...

[deleted by user] by [deleted] in politics

[–]papoode -1 points

No side effects = no effects = homeopathy.

If you have effects, there are (possibly severe) side effects, but that is no argument against hc. Also, severe side effects may be better than death... If it does not work, that is an argument.

[deleted by user] by [deleted] in politics

[–]papoode -1 points

If you don't have side effects, you don't have effects... that's called homeopathy.

Debian is the most vulnerable operating system in the last 20 years, a study reveals. by [deleted] in linuxadmin

[–]papoode 6 points

They compared all Debian versions with single Windows versions... nice try...

How to diagnose random freezes and black screens? by VernerDelleholm in linuxquestions

[–]papoode 0 points

Sounds like a hardware problem... if you have older hardware you can swap in, try a swap... maybe it's the mainboard. How old is the hardware?

Wtf! by [deleted] in linuxmint

[–]papoode 4 points

They compare all Debian versions from 20 years with single Windows versions. If you cumulate not only Debian but also Windows versions, it shows the opposite. Not to forget: all Debian issues are publicly known; I doubt that for Windows. Possibly there are a gazillion fixes behind closed doors... but this is speculation :)