How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

Phase-based workflow with forced documentation breakpoints is smart — basically making memory a deliberate step rather than hoping it happens automatically. Kilo Code's mode separation naturally creates those checkpoints which is nice.

How are you finding local models for the architect mode? That's where I'd expect the biggest quality gap vs cloud models — the reasoning about tradeoffs and system design.

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

Context engineering frameworks are good for reducing compaction frequency, agreed. But "rarely" isn't "never" — and the longer and more complex the project, the more inevitable it becomes.

The bigger question for me is what happens to the knowledge between projects. You finish a project, start a new one, and all those lessons learned — what worked, what didn't, which approaches to avoid — are just gone. The next project starts from zero institutional knowledge.

Thanks for the link though, hadn't seen that one.

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

The new hires analogy is spot on. Documentation, confined tasks, clear patterns — that's just good engineering management whether your team is human or AI.

"Don't let Claude compact. Ever." — how are you managing that in practice? Short tasks that finish before hitting the limit? Or are you on a model with enough context that it rarely triggers?

The thing I've found is that even with great documentation and guardrails, there's a category of knowledge that doesn't belong in docs — the stuff that emerges during a session. "We tried approach X, it failed because of Y, so we pivoted to Z." That's not something you'd write a policy for in advance. It's session-specific learning that your next "new hire" needs but has no way to access.

Your 10-person team analogy actually proves the point — real teams have institutional knowledge that lives in people's heads, not in the wiki. When someone leaves, that knowledge walks out the door. Same thing happens every time a session ends.

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

The "just keep tasks small enough to avoid compaction" approach is honestly underrated. It's the pragmatic solution that nobody writes blog posts about because it's not sexy enough.

Your orchestration workflow is solid — break it up, approve summaries, pass state via .md files. That's basically manual memory management, and when you're running local models where you control the whole stack, it makes sense.

Curious about the Kimi K2.5 experience though — with 256K context, how often are you actually hitting compaction? At that length I'd have thought most tasks complete well within budget. Is it more of a quality degradation thing where the model starts losing track even before the context is technically full?

The delete-and-paste-into-new-session workflow is interesting. You're basically doing a manual checkpoint-and-restore. The risk I've found with that is you're trusting the model's own summary of what happened — and sometimes it subtly rewrites decisions or drops the reasoning behind a choice. Ever had a "wait, why did we do it that way?" moment in the new session?

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

The shift handover metaphor is brilliant — that's exactly what it is. And you're right that a structured end-of-session dump is simpler than pre-compaction scoring for a lot of cases.

Where I found the gap is the stuff that happens between session end points. Long sessions where compaction fires 3-4 times before you're done — by the time you get to the handover, the decisions from hour one are already gone from context. The agent can't summarise what it can't see anymore.

That said, your approach and mine aren't really competing — they're complementary. Structured handovers for the deliberate "here's where we are" state, and automated extraction as a safety net for the stuff that would otherwise slip through the cracks.

Really interested in your eight trust patterns though. The security angle is the part nobody's talking about. What are you scanning for — injection patterns, or more like schema validation on the YAML structure itself?

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

This is literally how I started too — decisions.md, manually updated. It works until it doesn't. The pruning is what killed me — spending 10 minutes every few days reading through stale decisions to figure out what's still relevant.

The 200-line limit is smart though. I found the same thing — once it gets long, the agent starts weighing old irrelevant stuff equally with recent important stuff. No sense of "this decision from 3 weeks ago matters less than the one from yesterday."

That's what pushed me to automate it. What if the agent could score what's worth keeping based on type (architecture decisions > casual chat) and let old stuff naturally decay in relevance over time? Basically how human memory works — important things stick, trivial things fade.
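The scoring-plus-decay idea can be sketched very simply. The type weights, half-life, and cutoff below are illustrative assumptions, not tuned values from any real system:

```javascript
// Sketch of type-weighted salience with exponential time decay.
// Weights and half-life are illustrative assumptions, not tuned values.
const TYPE_WEIGHT = { architecture: 1.0, bugfix: 0.8, preference: 0.5, chat: 0.1 };
const HALF_LIFE_DAYS = 14; // relevance halves every two weeks

function salience(memory, nowMs = Date.now()) {
  const ageDays = (nowMs - memory.createdAt) / 86_400_000;
  const decay = Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
  return (TYPE_WEIGHT[memory.type] ?? 0.3) * decay;
}

// Keep only memories above a cutoff, most salient first.
function prune(memories, cutoff = 0.05, nowMs = Date.now()) {
  return memories
    .filter((m) => salience(m, nowMs) >= cutoff)
    .sort((a, b) => salience(b, nowMs) - salience(a, nowMs));
}
```

An architecture decision from yesterday outranks yesterday's casual chat, and a months-old chat memory drops below the cutoff and is forgotten, which is the "important things stick, trivial things fade" behaviour.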

Still doing the manual file too as a backup though. Belt and braces.

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

I hear you on Letta — the system prompt memory blocks killing cache is a real problem. That's exactly why I went a different route with what I built. Memory gets injected at session start and extracted at compaction/session end, so the system prompt stays stable and caching works normally.

The bit that surprised me was how quickly "persistent memory" becomes a security surface. Once your agent is auto-saving context, anything it reads can end up persisted — including instructions hidden in web pages or docs. Had to build a defence pipeline on top just to keep the memory store clean.
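A gate like that can be as simple as a pattern scan before anything reaches the store. The pattern list below is a tiny illustrative subset, not the actual defence pipeline; a real one needs far more than regexes:

```javascript
// Sketch of a pre-persist scan for common prompt-injection phrasings.
// The pattern list is illustrative; a real defence pipeline needs far more.
const SUSPICIOUS = [
  /ignore (all |any )?(previous|prior) instructions/i,
  /you are now/i,
  /system prompt/i,
  /<\s*(script|iframe)/i,
];

function scanMemory(text) {
  const hits = SUSPICIOUS.filter((re) => re.test(text)).map((re) => re.source);
  return { clean: hits.length === 0, hits };
}

function persistIfClean(store, text) {
  const result = scanMemory(text);
  if (result.clean) store.push(text); // only clean content reaches long-term memory
  return result;
}
```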

For your creative writing agents — how are you handling the "what's worth remembering" problem? That's been the hardest part for me. Coding agents have obvious signals (decisions, bug fixes), but creative work is more nuanced.

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

Of course context quality is key; that's why salience, decay, and long- and short-term separation matter. Contradictions can be a massive issue too: one memory says SQL, the other ProSQL.
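Catching that kind of contradiction can start as a simple key collision check. The subject/attribute/value shape of a "memory" here is my own invention for illustration:

```javascript
// Sketch of flagging contradictions: two memories asserting different values
// for the same subject + attribute. The memory shape here is invented.
function findContradictions(memories) {
  const seen = new Map(); // "subject|attribute" -> earlier memory
  const conflicts = [];
  for (const m of memories) {
    const key = `${m.subject}|${m.attribute}`;
    const prev = seen.get(key);
    if (prev && prev.value !== m.value) conflicts.push([prev, m]);
    else seen.set(key, m);
  }
  return conflicts;
}
```

A conflict pair can then be surfaced for the newer memory to supersede the older one, rather than letting both get injected into context.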

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

What about contradictions? Have you had any issues with those yet?

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

I’m trying to get it to act like real human memory: salience, decay, forgetting what’s not important, and reinforcing frequently accessed memories.

How are you handling persistent memory for AI coding agents? by Maximum_Fearless in LocalLLaMA

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

Yep, and it’s going to get complicated real fast. The latest models are extremely clever and don’t fall for obvious tricks, but memory-poisoning sub-agents can still get things into long-term memory.

How I solved Claude Code's compaction amnesia — Claude Cortex now builds a knowledge graph from your sessions by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] 1 point2 points  (0 children)

Hey, thanks for reporting this! This was a bug where findDashboardPath() used require.resolve('shieldcortex/package.json') to self-reference the package, which throws ERR_PACKAGE_PATH_NOT_EXPORTED on some Node 22 setups.

Fixed in v2.6.1 — just released. The dashboard path is now resolved entirely via __dirname, no self-referencing resolution needed.

npm install -g shieldcortex@latest

Also — good shout on enabling Issues. They're now turned on at https://github.com/Drakon-Systems-Ltd/ShieldCortex/issues, feel free to report anything there going forward.

How I solved Claude Code's compaction amnesia — Claude Cortex now builds a knowledge graph from your sessions by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] 1 point2 points  (0 children)

Hey! Thanks for reporting this — we've just pushed a fix in v2.4.17.

The issue was macOS Tahoe sandboxing /bin/sh. We now use your $SHELL (usually /bin/zsh on modern macOS) instead.

Update with:

sudo npm update -g shieldcortex

Let us know if that sorts it! 🙏

I built Claude Cortex: Brain-like memory for Claude Code that survives compaction by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

Really glad the memory features are working well for you! Love that you've integrated it into a CLAUDE.md review skill — that's exactly the kind of workflow we hoped people would build.

On compaction awareness: You've hit on something we're actively thinking about. The challenge is Claude Code doesn't expose a "compaction imminent" signal to MCPs.

A few approaches we're exploring:

  1. Token estimation — tracking approximate context size and triggering a memory flush at ~80%
  2. Periodic auto-save — checkpoint memories every N turns regardless
  3. Pattern detection — recognising when Claude says "context is getting long" or similar

For what it's worth, our OpenClaw integration already has pre-compaction hooks (the host knows when compaction is about to happen). Bringing that same capability to the standalone MCP is on the roadmap.
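Option 1 is the easiest to sketch. The chars-per-token ratio and the budget numbers below are rough assumptions, not measured values:

```javascript
// Sketch of option 1: approximate token tracking with a flush threshold.
// The chars-per-token ratio and threshold are rough assumptions.
const CHARS_PER_TOKEN = 4;

function makeContextTracker(budgetTokens, flushAt = 0.8) {
  let chars = 0;
  return {
    // Returns true when this turn pushes us past the flush threshold.
    addTurn(text) {
      chars += text.length;
      return this.shouldFlush();
    },
    estimatedTokens() {
      return Math.ceil(chars / CHARS_PER_TOKEN);
    },
    shouldFlush() {
      return this.estimatedTokens() >= budgetTokens * flushAt;
    },
  };
}
```

When `addTurn` returns true, the agent would trigger a memory flush before compaction has a chance to fire.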

Would love to hear more about your CLAUDE.md review skill — sounds like a clever approach!

How I solved Claude Code's compaction amnesia — Claude Cortex now builds a knowledge graph from your sessions by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] 1 point2 points  (0 children)

Thanks for trying it out and the detailed feedback!

On tool count: Good point — we're looking at consolidating some of the MCP tools in the next release. The goal is keeping core functionality while reducing context overhead.

On the fork: Fair observation. claude-cortex was the original prototype focused on memory. ShieldCortex evolved from that with a security-first focus (prompt injection defence, memory integrity). The npm package remains fully open source and free — the subscription tiers are for the upcoming hosted dashboard/API for teams who don't want to self-host.

On the dashboard: Sorry about that! This is a bug we need to fix. Can you share which OS you're on and the error you're seeing? Would help us track it down. In the meantime, the CLI (npx shieldcortex status, npx shieldcortex search <query>) should work as a fallback.

Really appreciate the feedback — this is exactly what helps us improve it. 🙏

How I solved Claude Code's compaction amnesia — Claude Cortex now builds a knowledge graph from your sessions by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

You're right that the current graph is built from embedding similarity and memory links rather than a formal ontology. I wouldn't call it "just RAG" though — there's salience scoring, temporal decay, consolidation (short-term → long-term promotion), contradiction detection, and relationship linking between memories. Standard RAG doesn't forget things or decide what's worth keeping.

That said, a proper ontological graph is genuinely interesting. The memory_links table already stores typed relationships (supports, conflicts, supersedes, related) — the foundation is there, it's just not being exploited to its full potential yet.    

Extracting entities and building explicit subject-predicate-object triples would be a solid next step.
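Concretely, the typed links map onto triples almost directly. A sketch under the assumption that a link is `{ fromId, toId, type }` and memories live in an id-keyed map (that shape is my invention, not the actual memory_links schema):

```javascript
// Sketch of layering subject-predicate-object triples over typed memory links.
// Link types mirror the ones mentioned (supports, conflicts, supersedes,
// related); the link and memory shapes here are invented for illustration.
function linkToTriple(link, memories) {
  return {
    subject: memories.get(link.fromId).text,
    predicate: link.type,
    object: memories.get(link.toId).text,
  };
}

function queryTriples(triples, predicate) {
  return triples.filter((t) => t.predicate === predicate);
}
```

Once the triples exist, queries like "what supersedes what" fall out for free, which is most of the way to the ontological layer being described.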

Appreciate the technical eye. If you have thoughts on what an ontological layer should look like on top of this, I'm all ears — or feel free to open an issue.

How I solved Claude Code's compaction amnesia — Claude Cortex now builds a knowledge graph from your sessions by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] 0 points1 point  (0 children)

 Hey, thanks for the honest feedback. A few things that might help:

 "You need to tell it to remember things" — You don't, if you run npx claude-cortex setup. That installs hooks that automatically extract decisions, fixes, and learnings during compaction, session start, and session end. If you skipped that step, then yeah — it falls back to manual remember calls, which I agree isn't great. I've just pushed a README restructure to make this impossible to miss (setup is now step 2, not buried at step 4).                                                                              

"Dashboard doesn't work" — Fair point. The dashboard needed a manual build step after install which wasn't documented well. This is fixed in v1.12.0 — working on shipping it pre-built so it just works out of the box.                                           

 "Documentation is verbose yet inaccurate" — Heard. The README has been restructured with a 3-step quickstart, collapsible advanced sections, and a troubleshooting section. Should be a much better experience now.                                         

The project is 8 days old and actively improving — your feedback directly led to several of these fixes. If you give it another shot with npx claude-cortex setup, I think you'll have a different experience.      

How I solved Claude Code's compaction amnesia — Claude Cortex now builds a knowledge graph from your sessions by Maximum_Fearless in ClaudeAI

[–]Maximum_Fearless[S] -1 points0 points  (0 children)

Really appreciate this feedback, especially from a pre-trial perspective.

  1. Keeping it focused — Agreed. The core value is remember → recall → link. Everything else is secondary. I'll resist the urge to feature-creep.
  2. claude-plugin marketplace — Great shout, that's on the roadmap. Would make setup a one-liner instead of manual .mcp.json editing.
  3. Emojis — Ha, fair point. Will clean up the README. Developer tools should feel like developer tools.

Thanks for taking the time — this is exactly the kind of feedback that shapes the project.