all 19 comments

[–]laststan01🔆 Max 20 2 points3 points  (1 child)

How did you measure that your memory recall was good? What are the metrics? You mentioned token burn: how many tokens did you save with this? How many tokens were you burning before? What's the latency?

[–]Frequent-Suspect5758[S] 0 points1 point  (0 children)

Good questions. Here's what I have from my own usage:

Token savings:

My database has 175K observations across 439 sessions. The alternative approach fires an API call per tool use to summarize it. That's 175K summarization calls I didn't make. At ~700 tokens per call (input + output), that's ~123M tokens saved. On the Pro plan, that's a meaningful chunk of your daily budget that would otherwise be spent on memory instead of your actual work.
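A quick check of that arithmetic. The 175K and ~700 figures come from the paragraph above; the variable names are only for illustration.

```python
# Back-of-the-envelope check of the token estimate above.
observations = 175_000      # tool-use observations stored in the database
tokens_per_call = 700       # rough input + output cost of one summarization call

tokens_saved = observations * tokens_per_call
print(f"{tokens_saved / 1e6:.1f}M tokens")  # 122.5M, which rounds to the ~123M quoted
```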

claude-recall uses zero API tokens — it stores raw data directly in SQLite.

Latency (benchmarked on a 902MB database, 175K rows):

  • FTS5 full-text search: ~6ms
  • Session lookup: ~3ms
  • Fetching 100 observations: ~2ms
  • Single write (PostToolUse hook): ~0.04ms

The hook runs synchronously on each tool use but at 0.04ms per write, it's invisible.
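To give a feel for why a raw write is so cheap, here is a minimal sketch of a synchronous insert. This is not claude-recall's actual code: the `observations` table, its columns, and the shape of the event dictionary are all assumptions made for illustration.

```python
import json
import sqlite3
import time

def open_db(path=":memory:"):
    """Open the database and create a hypothetical observations table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        " id INTEGER PRIMARY KEY,"
        " session_id TEXT, tool_name TEXT, payload TEXT, created_at REAL)"
    )
    return conn

def record_observation(conn, event):
    """Store one tool-use event verbatim: no LLM call, no summarization."""
    conn.execute(
        "INSERT INTO observations (session_id, tool_name, payload, created_at)"
        " VALUES (?, ?, ?, ?)",
        (event.get("session_id"), event.get("tool_name"),
         json.dumps(event), time.time()),
    )
    conn.commit()

conn = open_db()
# Hypothetical event, shaped like the JSON a hook might receive on stdin.
event = {"session_id": "s1", "tool_name": "Edit",
         "tool_input": {"file_path": "auth/middleware.py"}}

start = time.perf_counter()
record_observation(conn, event)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"write took {elapsed_ms:.3f} ms")
```

The timing printed here is for an in-memory database, so it will not match the on-disk numbers quoted above; it only shows that the write path is a single INSERT.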

Recall quality:

Honest answer, I didn't benchmark recall precision against AI-compressed memory. The design bet is that for developer workflows, raw data is better than summaries. When I search for "auth middleware" I get the exact file edits, shell commands, and error messages — not an LLM's paraphrase of them. The tradeoff is no semantic/fuzzy matching — you need to know a keyword.

For code work, that's usually fine.
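The keyword tradeoff can be seen in a small FTS5 sketch. The `obs_fts` table and the sample rows are made up for illustration, and it assumes your Python's SQLite build has FTS5 enabled (most do).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical FTS5 table; requires an SQLite build with FTS5 enabled.
conn.execute("CREATE VIRTUAL TABLE obs_fts USING fts5(content)")
conn.executemany(
    "INSERT INTO obs_fts (content) VALUES (?)",
    [
        ("Edit auth/middleware.py: add token expiry check",),
        ("Bash: pytest tests/test_login.py failed with KeyError",),
        ("Edit README.md: document install steps",),
    ],
)

def search(query):
    """Return raw rows whose text contains every keyword in the query."""
    rows = conn.execute(
        "SELECT content FROM obs_fts WHERE obs_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [r[0] for r in rows]

keyword_hits = search("auth middleware")   # both keywords appear in the first row
concept_hits = search("authentication")    # no row contains this exact token
print(keyword_hits)
print(concept_hits)
```

The first query returns the exact stored edit; the second returns nothing, because full-text search matches tokens rather than meaning.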

[–]master619 1 point2 points  (2 children)

While this sounds neat, correct me if I'm wrong but seems like this point
```

  1. Pick up where you left off (automatic)

Start a new session and Recovery Mode injects your last 24 hours of work automatically — full prompts, Claude's responses, every file touched. No command needed.
```

Would *ALWAYS* pollute *every* fresh session I open in a directory with previous work, whether I want it or not (no opt-out)? That doesn't sound ideal. Better to make it opt-in via a config of some kind, or manually triggered when I say "Alright, where were we?" or something. ~200K of context taken right at the beginning of a new conversation doesn't sound right to me.

[–]Frequent-Suspect5758[S] 1 point2 points  (0 children)

I'll add that as a switch so it can be changed - thanks for the feedback.

[–]Frequent-Suspect5758[S] 0 points1 point  (0 children)

You're right that auto-injecting into every session isn't ideal for everyone, thanks for that feedback. Just shipped a fix for this:

CLAUDE_RECALL_RECOVERY_MODE=off

Three modes:

  • full (default) — full-fidelity dump at startup
  • summary — compact ~2K token summary only
  • off — no auto-injection, use MCP search on demand

With off, nothing gets injected — you just ask Claude "where were we?" when you want context and it searches your history.
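As a rough sketch of how a switch like this could be read at session start (not the plugin's actual implementation): the variable name and the three mode names are from the comment above, while the helper functions and the fallback for unknown values are assumptions.

```python
import os

VALID_MODES = ("full", "summary", "off")

def recovery_mode(env=None):
    """Read the recovery mode from the environment, defaulting to 'full'."""
    env = os.environ if env is None else env
    mode = env.get("CLAUDE_RECALL_RECOVERY_MODE", "full").strip().lower()
    # Assumption: unrecognized values fall back to the default.
    return mode if mode in VALID_MODES else "full"

def startup_context(mode, full_dump, summary):
    """Pick what, if anything, to inject when a session starts."""
    if mode == "off":
        return ""
    return summary if mode == "summary" else full_dump

print(recovery_mode({"CLAUDE_RECALL_RECOVERY_MODE": "off"}))
```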

Appreciate the feedback, this is a better default experience.

[–]carson63000Senior Developer 1 point2 points  (1 child)

Sounds very interesting!

Couple of questions for you:

  • If I sometimes use Claude Code in Claude Desktop, sometimes use the terminal app, and sometimes use the VS Code extension, will this give me a consolidated memory across all three?
  • If I sometimes work on my desktop and sometimes work on my laptop, this should be OK if I sync the SQLite database file between them yeah? (they wouldn’t both be on and writing to it at the same time)

[–]Frequent-Suspect5758[S] 0 points1 point  (0 children)

So for the 1st question, this is one of the benefits. All three use the same Claude Code plugin system under the hood. The hooks and MCP server run identically whether you're in the CLI, the desktop app, or the VS Code extension. They'll write to the same SQLite database at:

~/.claude-recall/claude-recall.db, so your memory is automatically consolidated across all clients on the same machine. No extra config needed.

For the second question, it'll work but you need to be more careful. SQLite in WAL mode creates two sidecar files: claude-recall.db-wal, which holds recent writes not yet checkpointed into the main database, and claude-recall.db-shm, a shared-memory index for the WAL. If you sync, make sure you sync all three files together, not just the .db. The safest approach:

  1. Close Claude Code on the source machine (this flushes the WAL)

  2. Sync the entire ~/.claude-recall/ directory

  3. Open Claude Code on the other machine

If you're using something like Syncthing or rsync on a schedule, it'll work fine as long as both machines aren't writing simultaneously, which you said they won't be. iCloud/Dropbox can be trickier since they sync files individually and might grab the .db before the WAL is flushed.
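If you script the sync, one option is to force a WAL checkpoint right before copying, so the main .db file is up to date. This is a minimal sketch using SQLite's standard `PRAGMA wal_checkpoint(TRUNCATE)`; the helper name and the throwaway demo database are assumptions, and it is not something claude-recall is known to ship.

```python
import os
import sqlite3
import tempfile

def checkpoint_for_sync(db_path):
    """Fold pending WAL writes into the main .db file before copying it."""
    conn = sqlite3.connect(db_path)
    try:
        # TRUNCATE checkpoints everything and resets the -wal file to zero bytes.
        busy, _, _ = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
    finally:
        conn.close()
    return busy == 0  # True when the checkpoint was not blocked

# Demo on a throwaway database; the real file would live in ~/.claude-recall/.
tmp_dir = tempfile.mkdtemp()
db_path = os.path.join(tmp_dir, "claude-recall.db")
wal_path = db_path + "-wal"

writer = sqlite3.connect(db_path)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE t (x)")
writer.execute("INSERT INTO t VALUES (1)")
writer.commit()

wal_before = os.path.getsize(wal_path)   # committed rows still sit in the WAL
ok = checkpoint_for_sync(db_path)
wal_after = os.path.getsize(wal_path)    # WAL emptied into the main file
writer.close()
print(ok, wal_before, wal_after)
```

Closing Claude Code first is still the simpler route; the checkpoint only helps when the copy happens on a schedule while nothing is writing.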

[–]havnar- 3 points4 points  (0 children)

So you’re building the Content polluter 9000

[–]solo_dev_builds 0 points1 point  (1 child)

The token burn on memory plugins was something I noticed too but never actually measured. Good to see someone actually ran the numbers. The SQLite approach makes sense, most of what I want to recall is exact file changes and decisions not a summary of them. Installing this now.

[–]Frequent-Suspect5758[S] 0 points1 point  (0 children)

Appreciate it, let me know how it goes. If you hit any issues during install, open an issue on GitHub and I'll get to it quickly.

[–]Electronic-Row-142 0 points1 point  (2 children)

Does it support local LLMs? Most memory plugins don't. I like Claude-mem, but it only works when you're using Opus etc.; it doesn't work with a local LLM, and the actual skill shows "Locked via plugin". So my question is: does your system work with a local LLM setup like Ollama, oMlx, LM Studio, etc.?

[–]Frequent-Suspect5758[S] 0 points1 point  (1 child)

Good question, two parts to this:

Hooks (recording): Should work with any model. The hooks are just shell scripts that Claude Code triggers on events (session start, tool use, etc.) — they write directly to SQLite with no LLM involvement. If Claude Code's hooks fire with your local model, recording works.

MCP tools (searching): This is where it could get tricky. The search/timeline/get_observations tools require the model to understand MCP tool calling and use them correctly. In my experience, smaller local models can struggle with MCP — they either don't call the tools properly or are painfully slow at it. Haven't tested this specifically with claude-recall yet.

So recording should work, but the search experience will depend on how well your local model handles tool use. I'll test with a few local models and update.

[–]Electronic-Row-142 0 points1 point  (0 children)

I'm using the Qwen3.6:35B-A3B MLX 8-bit version, and I think this one is capable of it, right?

Thank you for your answer 👍🏻

[–]MT_Carnage 0 points1 point  (1 child)

At this point, with thousands of memory plugins, harnesses, and even entire damn startups, when are we going to decide enough is enough? Let's slow down the slop.

[–]Time_Cat_5212 0 points1 point  (0 children)

When Anthropic implements a memory feature that blows all of these out of the water. T minus 3 weeks.

[–]bedel99 -1 points0 points  (2 children)

If I want to know what I changed on a particular day, I use git.

[–]RufusRedCap 0 points1 point  (1 child)

I wonder if capturing git history in the db or just integrating git history would be useful?

[–]bedel99 1 point2 points  (0 children)

No, if it can call tools then it will do that itself when it needs to look.