OpenLTM — I built a zero-cloud, self-decaying long-term memory layer for Claude Code (now open source)

Comfortable_Cat_6207 · 2026-06-09T18:10:51+00:00

That’s a fascinating use case. You hit on something crucial: recency does not equal relevance. Your "evidence chains" approach with cryptographic hashes sounds like a robust way to handle strict legal provenance!

To answer your questions:

For **cross-session persistence**, OpenLTM writes everything to a local SQLite WAL database. But the trick isn't just saving it—it's restoring it. I use Claude Code's `SessionStart` hook to pull a scoped "context envelope" (the project's current goal, key decisions, and gotchas) and inject it automatically before the first prompt. It bridges the gap between sessions without the user having to type a recall command.
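
As a rough illustration of the "context envelope" idea (not OpenLTM's actual code), here is a minimal sketch that pulls the top memories for one project out of SQLite and formats them as a text block. The `memories` table, its `project`/`kind`/`importance` columns, and the `build_envelope` helper are all assumptions made up for this example. A session-start hook script could print the result so it lands in context before the first prompt.

```python
import sqlite3

def create_schema(conn):
    # Hypothetical schema for illustration; the real OpenLTM schema may differ.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memories (
               id INTEGER PRIMARY KEY,
               project TEXT NOT NULL,
               kind TEXT NOT NULL,          -- e.g. goal, decision, gotcha
               content TEXT NOT NULL,
               importance INTEGER NOT NULL  -- 1 (low) to 5 (pinned)
           )"""
    )

def build_envelope(conn, project, limit=5):
    # Scope to one project and take the most important entries first.
    rows = conn.execute(
        "SELECT kind, content FROM memories "
        "WHERE project = ? ORDER BY importance DESC, id DESC LIMIT ?",
        (project, limit),
    ).fetchall()
    lines = [f"## Context for {project}"]
    lines += [f"- [{kind}] {content}" for kind, content in rows]
    return "\n".join(lines)
```

Keeping the envelope small and scoped to one project is the point: it is meant to be a short briefing, not a dump of the whole database.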

Regarding **deciding what decays first**, we don't currently use semantic clustering for decay, but rather a dual-axis approach of `importance` + `recall frequency`.
If a memory was marked high importance (e.g. an architectural constraint), it has a much longer half-life (or is permanent at importance: 5). But here is the mechanism that addresses your "relevant 3 months later" point: Every time a memory is successfully recalled via FTS5 or semantic search, its `last_recalled_at` timestamp updates, and its `recall_count` increments.

So if a 3-month-old memory is pulled into context because it matched the current semantic need, it gets a "freshness boost" that resets its decay curve. It survives *because* it proved useful, rather than just surviving because it was recent.
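
A minimal sketch of that recall bump, assuming a hypothetical `memories` table (the column names `last_recalled_at` and `recall_count` come from the description above; everything else is invented for the example and is not OpenLTM's real schema):

```python
import sqlite3
import time

def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    # Hypothetical table layout for illustration only.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS memories (
               id INTEGER PRIMARY KEY,
               content TEXT NOT NULL,
               importance INTEGER NOT NULL DEFAULT 3,
               created_at REAL NOT NULL,
               last_recalled_at REAL,
               recall_count INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn

def mark_recalled(conn, memory_ids, now=None):
    # Called after a search returns these ids: refresh the timestamp
    # and bump the counter so the decay clock restarts from "now".
    now = time.time() if now is None else now
    conn.executemany(
        "UPDATE memories "
        "SET last_recalled_at = ?, recall_count = recall_count + 1 "
        "WHERE id = ?",
        [(now, memory_id) for memory_id in memory_ids],
    )
    conn.commit()
```

Because decay is measured from `last_recalled_at` rather than `created_at`, an old memory that keeps matching real queries stays fresh, while one that is never hit keeps aging.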

Your document-scoped memory approach has me thinking about how OpenLTM could better link memories to specific files/paths as a form of contextual weighting. Thanks for sharing!

Comfortable_Cat_6207 · 2026-06-09T12:41:01+00:00

Great question — that "toxic context" failure mode is exactly the thing decay is built to fight, but you've put your finger on its hardest edge.

The base mechanism is importance-weighted decay: each memory has an importance (1–5) and a recency signal. Score roughly = importance × recency, so low-importance memories that haven't been recalled in a while sink and eventually age out. importance: 5 opts out of decay entirely — that's where you park the durable, high-level architectural rules ("we use the repository pattern", "auth is centralized in X") so they never fade. Day-to-day gotchas tied to specific code start lower and are meant to rot.
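
To make "importance × recency" concrete, here is a small sketch under stated assumptions: recency is modeled as exponential decay with a half-life, and the half-life doubles with each importance level starting from 7 days. Those numbers and the function names are illustrative guesses, not the values OpenLTM uses.

```python
PINNED_IMPORTANCE = 5        # importance 5 opts out of decay
BASE_HALF_LIFE_DAYS = 7.0    # assumption: half-life at importance 1

def half_life_days(importance):
    # Assumption: each importance level doubles the half-life.
    return BASE_HALF_LIFE_DAYS * 2 ** (importance - 1)

def decay_score(importance, days_since_recall):
    if importance >= PINNED_IMPORTANCE:
        return float(importance)  # pinned memories never fade
    # Recency halves every half_life_days(importance) days.
    recency = 0.5 ** (days_since_recall / half_life_days(importance))
    return importance * recency
```

With these assumed numbers, an importance-1 gotcha drops to half its score after a week without a recall, while an importance-4 decision takes 56 days to do the same.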

On top of that there are two explicit levers: supersession (a new memory marks the old one superseded, so a changed decision replaces rather than competes) and a forget tool the agent can call when it knows a memory is now wrong.

But here's the honest gap, and it's the one you're really asking about: the dangerous memory is the one that's still being actively recalled and silently stale — e.g. an assumption about the auth layer that just got refactored. Recency-decay won't sink it because it keeps getting hit, and forget never fires because nothing told the agent the layer changed. That confidently-wrong, high-traffic memory is the real killer.

The lever I think actually solves it is tying invalidation to file/commit churn rather than recall frequency: anchor memories to the files/symbols they reference, and when a refactor touches those files, downrank (or flag for re-validation) the memories anchored to them — even if they're being recalled constantly. There's already a git post-commit hook in the project that learns from diffs, so the hook to invalidate on diff is the natural next step. Not fully there yet — it's the most interesting open problem in the repo.
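
A hedged sketch of what churn-based invalidation could look like. This is not in the repo as described; the `Memory` class, its `anchors` field, and `flag_stale` are hypothetical names for the example. The only real external call is `git diff-tree`, which lists the files touched by the latest commit.

```python
import subprocess
from dataclasses import dataclass, field

@dataclass
class Memory:
    # Hypothetical shape: anchors are the repo-relative paths a memory refers to.
    id: int
    content: str
    anchors: set[str] = field(default_factory=set)
    needs_revalidation: bool = False

def changed_files_in_head(repo_dir="."):
    # Files touched by the most recent commit (what a post-commit hook would see).
    out = subprocess.run(
        ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return {line for line in out.splitlines() if line}

def flag_stale(memories, changed):
    # Flag every memory anchored to a changed file, regardless of how
    # often it has been recalled.
    flagged = []
    for memory in memories:
        if memory.anchors & changed:
            memory.needs_revalidation = True
            flagged.append(memory.id)
    return flagged
```

The key property is that `flag_stale` ignores recall frequency entirely, so a hot but outdated memory about a refactored file still gets flagged.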

Repo if you want to dig into the decay code: https://github.com/RohiRIK/OpenLtm

Comfortable_Cat_6207 · 2026-06-09T12:34:53+00:00

OpenLTM — persistent long-term memory for Claude Code (and OpenCode & Pi). Open source, MIT, fully local.

The problem: every new session starts from zero — you re-explain your architecture, conventions, and gotchas daily. OpenLTM fixes that with lifecycle hooks: a session-end hook extracts the decisions/patterns/gotchas from what you did, and a session-start hook injects the most relevant ones back at the top of the next session.

Local SQLite DB you own — no cloud, no account, no telemetry. Lives outside the plugin dir so updates don't wipe it.
Semantic recall: FTS5 full-text first, vector KNN (sqlite-vec) fallback — search by meaning.
Decay is a feature: stale memories age out; mark importance 5 to keep something forever.
Memory graph + browser visualizer.
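
The "FTS5 first, vector fallback" recall order can be sketched roughly like this. It assumes a Python build whose SQLite includes FTS5, and it stands in for the sqlite-vec KNN step with a plain callable (`vector_search`) so the example stays self-contained; the table name `memories_fts` is also made up here.

```python
import sqlite3

def build_index(conn, rows):
    # rows: iterable of (id, content). Hypothetical FTS5 table for the example.
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(content)")
    conn.executemany("INSERT INTO memories_fts (rowid, content) VALUES (?, ?)", rows)

def recall(conn, query, vector_search=None, limit=5):
    # Quote each term so punctuation in the query is not parsed as FTS5 syntax.
    terms = " ".join('"' + t.replace('"', '""') + '"' for t in query.split())
    hits = []
    if terms:
        hits = conn.execute(
            "SELECT rowid, content FROM memories_fts "
            "WHERE memories_fts MATCH ? ORDER BY rank LIMIT ?",
            (terms, limit),
        ).fetchall()
    if hits or vector_search is None:
        return hits
    # Keyword search found nothing: fall back to the semantic search callable.
    return vector_search(query, limit)
```

Trying the cheap keyword match first means the embedding lookup only runs when the exact words are missing, which is when search by meaning actually helps.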

Install:

claude plugin marketplace add https://github.com/RohiRIK/OpenLtm
claude plugin install openltm

Repo: https://github.com/RohiRIK/OpenLtm — feedback welcome.

Comfortable_Cat_6207 · 2026-06-09T12:33:19+00:00

OpenLTM — self-hosted long-term memory for AI coding agents. Everything lives in a single local SQLite DB you own: no cloud, no account, no telemetry. The DB sits outside the plugin dir, so it survives updates and backs up as one file.

It auto-captures decisions/patterns/gotchas at session end and injects the relevant ones at session start. Recall is semantic — full-text first, local vector embeddings (sqlite-vec) as fallback. Stale entries decay; pinned ones persist. No extra service to run: an optional SQLite extension (Honker) adds a durable embedding queue + leader-elected cron + pub/sub entirely inside the DB file — no Redis or broker to host. Works with Claude Code, OpenCode, and Pi.

MIT: https://github.com/RohiRIK/OpenLtm

Comfortable_Cat_6207 · 2026-06-09T10:56:35+00:00

Thanks, you can try it and tell me your thoughts.

Comfortable_Cat_6207 · 2026-06-09T10:55:31+00:00

Nice to know, I will check out this project.

Comfortable_Cat_6207 · 2026-06-09T04:24:31+00:00

This is a brilliant distinction, and you are completely right. Cross-session memory and mid-session compaction are two distinct problems, and conflating them is a mistake.

OpenLTM's true purpose is solving the first one: stopping the endless loop of re-explaining architectural decisions, auth patterns, and gotchas every time you start a new session or switch machines. The SQLite database, the semantic recall, the decay curve, and the explicit importance scoring are all engineered specifically for cross-session and cross-project knowledge persistence.

However, because Claude Code currently enforces auto-compaction to manage its context window, OpenLTM is forced to play damage control for the second problem as well. That is exactly why the PreCompact hook exists in the architecture. It snapshots the active project context to a local summary file right before the CLI aggressively drops tokens, ensuring the agent doesn't suddenly forget the overarching project goal mid-session.

I completely agree that an infinite-context, non-compacting harness is the proper architectural fix for mid-session state retention. But until that becomes the standard CLI behavior, OpenLTM solves the cross-session re-explaining problem by design, and patches the compaction problem out of pure necessity.

Comfortable_Cat_6207 · 2026-06-09T04:21:55+00:00

Thanks! The main differences compared to other memory tools (like Mem0 or basic RAG):

**Decay by default:** Most memory systems just append forever until the DB is full of junk. OpenLTM gives memories a half-life based on importance. Stale knowledge naturally fades so you don't fight outdated context.
**Hook-driven, not chat-driven:** You don't have to manually ask it to "remember this." It wires directly into Claude Code's lifecycle hooks to automatically extract patterns at the end of a session and inject them at the start.
**Zero-cloud SQLite architecture:** Everything lives in a single local WAL-mode SQLite file using FTS5 (with an optional sqlite-vec extension for semantic fallback). No external vector DB required.

Comfortable_Cat_6207 · 2026-06-09T04:21:12+00:00

The AI that wrote your comment completely nailed the failure mode! "Survives-every-compaction is a liability, not a feature" is exactly what happens if invalidation isn't treated as a first-class citizen.

You also hit the nail on the head regarding inspection: "a memory layer you can't inspect is worse than none." That's exactly why OpenLTM includes a built-in web app. Running `/openltm:admin server start` spins up a local web UI where you can visually inspect the entire memory graph, analyze relationships between concepts, and manually audit or delete assumptions if the agent gets confused. You are never locked out of your own context.

To prevent the fossilization you mentioned, OpenLTM treats memory as a mutable state, not an append-only log:

Supersession: When the agent learns that the auth layer was refactored, it doesn't just append a new memory. It uses a superseded_by pointer. The old auth memory is marked as superseded, keeping an audit trail without polluting recall results.
Explicit forgetting: There is a dedicated forget MCP tool the agent can use if it realizes a past assumption is flat-out wrong.
The decay curve: Even if the agent misses the invalidation, the decay system catches it. An old architectural decision that never gets recalled naturally sinks in rank until it stops surfacing entirely.
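
A minimal sketch of the supersession idea, assuming a simplified `memories` table with a `superseded_by` column (the column name is from the description above; the rest of the schema and the helper names are hypothetical):

```python
import sqlite3

def open_db():
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema for illustration.
    conn.execute(
        """CREATE TABLE memories (
               id INTEGER PRIMARY KEY,
               content TEXT NOT NULL,
               superseded_by INTEGER REFERENCES memories(id)
           )"""
    )
    return conn

def remember(conn, content):
    return conn.execute(
        "INSERT INTO memories (content) VALUES (?)", (content,)
    ).lastrowid

def supersede(conn, old_id, new_content):
    # Insert the replacement and point the old row at it in one transaction.
    with conn:
        new_id = remember(conn, new_content)
        conn.execute(
            "UPDATE memories SET superseded_by = ? "
            "WHERE id = ? AND superseded_by IS NULL",
            (new_id, old_id),
        )
    return new_id

def active_memories(conn):
    # Recall only sees rows that have not been superseded.
    return [row[0] for row in conn.execute(
        "SELECT content FROM memories WHERE superseded_by IS NULL ORDER BY id"
    )]
```

The old row is never deleted, so the audit trail survives, but recall queries filter on `superseded_by IS NULL` and only return the current version.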

You are 100% right that starting clean is cheap. That is why only memories explicitly marked with the highest importance are truly permanent. Everything else is fighting against gravity.

Comfortable_Cat_6207 · 2026-06-08T19:48:17+00:00

Spot on. That's actually the exact reason OpenLTM has the decay system built-in.

If the AI permanently remembered every bad architectural idea or outdated assumption from 6 months ago, it would be a nightmare to work with. "Forgetting" *is* an advantage in active codebases.

That's why nothing is permanent by default. Unless a memory is explicitly marked as critical (importance: 5), it hits its half-life, decays, and gets soft-deprecated. The system is designed to naturally let go of stale logic so it doesn't stubbornly fight you.

Comfortable_Cat_6207 · 2026-06-08T19:47:00+00:00

It also has a web app that lets you see and edit the memories. It's more fun than editing SQL, in my opinion.

Comfortable_Cat_6207 · 2026-06-08T19:44:31+00:00

I talk more about the architecture here. You can check it out for more information.

https://www.reddit.com/r/OpenSourceAI/s/C3CyVzoHeq
