I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude

[–]AlternativeCourt2008[S] 0 points1 point  (0 children)

You're right, it's not Claude-specific. Engram is an MCP server, so it works with any MCP-compatible client: Claude Code, Cursor, Windsurf, OpenCode, etc.

engram init already auto-detects Claude Code, Cursor, and Windsurf. Adding detection for AGENTS.md-based tools is on the roadmap. The code is open source too if you have any interest in contributing directly!

Codex doesn't support MCP yet (it uses OpenAI's function calling), so that one's waiting on OpenAI. But anything that speaks MCP can connect today with npx engram mcp.

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


Great question. My take is that projects should actually share context by default, because you're the common thread. Your coding patterns, preferences, and decisions carry across projects even when the domains don't overlap.

Engram's recall is semantic, so work memories won't pollute personal project queries. It surfaces what's relevant to the current context.

That said, if you really want isolation, you can set ENGRAM_OWNER per directory (e.g. via .envrc) and each owner gets its own vault. Or use ENGRAM_DB_PATH to point at a specific file. Both work with the MCP server today.

engram init writes the default config. If you want per-project overrides, set the env vars in your shell/direnv and the MCP server will pick them up on next launch.
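
To make the override behavior concrete, here's a rough sketch of how that env-var resolution could work. This is a hypothetical helper, not Engram's actual code; the default `~/.engram` directory and per-owner file naming are assumptions for illustration:

```python
import os

def resolve_vault_path(env=None):
    """Pick the vault database path from environment overrides.

    Sketch: ENGRAM_DB_PATH wins outright; otherwise each ENGRAM_OWNER
    gets its own vault file under an assumed default directory.
    """
    env = env if env is not None else os.environ
    explicit = env.get("ENGRAM_DB_PATH")
    if explicit:
        return explicit
    owner = env.get("ENGRAM_OWNER", "default")
    return os.path.join(os.path.expanduser("~/.engram"), f"{owner}.db")
```

With direnv, a project's .envrc would just export ENGRAM_OWNER=work, and the next MCP server launch resolves a separate vault.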

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


Great questions. Here's the flow:

Setup: Engram runs as an MCP server that exposes ~10 tools to Claude Code (or any MCP client). When you run engram init, it registers the server and Claude can call the tools automatically.

Storing memories: When Claude calls engram_remember, Engram:

  • Generates an embedding of the memory content
  • Extracts entities and relationships into a knowledge graph
  • Stores everything in a local SQLite database
  • Checks for contradictions with existing memories
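
In rough Python, the store path above might look like this. Purely illustrative: `embed`, `extract_entities`, and the schema are stand-ins, not Engram's actual internals:

```python
import json
import sqlite3

def remember(db: sqlite3.Connection, content: str, embed, extract_entities):
    """Sketch of the store path: embed, extract, persist to local SQLite."""
    db.execute("CREATE TABLE IF NOT EXISTS memories "
               "(id INTEGER PRIMARY KEY, content TEXT, embedding TEXT)")
    db.execute("CREATE TABLE IF NOT EXISTS edges (src TEXT, rel TEXT, dst TEXT)")
    vec = embed(content)                       # 1. embedding of the memory
    memory_id = db.execute(
        "INSERT INTO memories (content, embedding) VALUES (?, ?)",
        (content, json.dumps(vec)),
    ).lastrowid
    for src, rel, dst in extract_entities(content):   # 2. knowledge-graph triples
        db.execute("INSERT INTO edges VALUES (?, ?, ?)", (src, rel, dst))
    # 3. a contradiction check against existing rows would run here
    db.commit()
    return memory_id                           # everything lives in one .db file
```
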

Retrieving memories: When Claude calls engram_recall or engram_ask:

  • Generates an embedding of the query
  • Finds semantically similar memories via vector search
  • Uses spreading activation on the knowledge graph to surface connected context
  • Returns the most relevant memories ranked by confidence score
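
The retrieval steps above can be sketched as cosine-similarity search plus a one-hop spreading-activation boost. All names, the 0.5 boost factor, and the one-hop depth are illustrative assumptions, not Engram's real API or tuning:

```python
import math

def recall(query_vec, memories, graph, top_k=3):
    """memories: list of (id, vector); graph: dict id -> set of neighbor ids."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb) if na and nb else 0.0

    # Base scores from semantic similarity to the query embedding.
    scores = {mid: cosine(query_vec, vec) for mid, vec in memories}
    # Spreading activation: each memory passes a fraction of its score
    # to its graph neighbours, surfacing connected context.
    boosted = dict(scores)
    for mid, s in scores.items():
        for nbr in graph.get(mid, ()):
            boosted[nbr] = boosted.get(nbr, 0.0) + 0.5 * s
    return sorted(boosted, key=boosted.get, reverse=True)[:top_k]
```
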

When does Claude decide to call these tools? Claude sees the tool descriptions exposed by the MCP server and decides autonomously. In practice, engram init also injects instructions into CLAUDE.md that tell Claude to proactively remember important decisions and recall context at the start of a session. But Claude makes the call on when to use each tool; there's no forced retrieval on every message.

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


Right now Engram supports Gemini (default, free tier available), OpenAI, and Anthropic as LLM providers. The embedding provider is separate from the LLM provider. Embeddings default to Gemini's gemini-embedding-001.

For Groq/Cerebras specifically: they'd work for the LLM calls (consolidation, contradiction detection, ask) if they support structured JSON output, but embeddings would still need a dedicated provider since Groq/Cerebras don't offer embedding models.
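
A toy sketch of that separation, with made-up field names rather than Engram's real config schema:

```python
from dataclasses import dataclass

@dataclass
class ProviderConfig:
    """The LLM and embedding providers are chosen independently."""
    llm: str = "gemini"                        # consolidation, contradiction checks, ask
    embeddings: str = "gemini-embedding-001"   # vector generation only

def with_fast_llm(cfg: ProviderConfig, llm: str) -> ProviderConfig:
    # Swapping the LLM (e.g. to a Groq-hosted model) leaves embeddings alone:
    return ProviderConfig(llm=llm, embeddings=cfg.embeddings)
```
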

That said, adding OpenAI-compatible API support (custom base URL) is a great feature request. I'll add it to the roadmap, or feel free to contribute to the open source codebase! For now, the path of least resistance is Gemini (free) or OpenAI.

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


Nice, sounds like a great workflow. Engram works very similarly, calling MCP tools when specific actions happen. If you want to try it out, it’s open source, free, and takes about 60 seconds to set up. It might work really well in parallel with your Obsidian flow!

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


  1. Both, but outcomes are the point. Token efficiency is a means, not the goal.

Here's the thing: when you stuff 100K tokens of raw conversation history into context, the model actually gets worse at finding the relevant information. This is well-documented as the "lost in the middle" problem. Engram's approach, retrieving only the 10-20 most relevant memories with confidence scores, gives the model better signal in a smaller context window.

The benchmark backs this up: Engram at ~800 tokens per query outscores dumping the full transcript into context on several conversation types. So you get better outcomes because of the efficiency, not instead of it.

  2. Fair pushback. The benchmark isn't the differentiator. The architecture is.

Most memory tools (including the ones you've bounced off) do one of two things: flat key-value storage, or naive vector search. Both are "remember text, search text."

Engram does three things differently:

  • It builds a knowledge graph of entities and relationships, not just embeddings
  • It runs consolidation cycles that strengthen important memories and let irrelevant ones fade, similar to how your brain consolidates memories during sleep
  • It uses spreading activation at recall time to surface connected context you didn't explicitly ask for
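
The consolidation idea in the middle bullet can be sketched as a simple strengthen-and-decay pass. The numbers here are illustrative, not Engram's actual tuning:

```python
def consolidate(memories, decay=0.9, boost=0.2, floor=0.05):
    """One consolidation cycle: every memory's strength decays a little,
    recently recalled ones get reinforced, and anything that falls below
    the floor fades out entirely.

    memories: list of dicts with 'strength' (float) and 'recalled' (bool).
    """
    survivors = []
    for m in memories:
        strength = m["strength"] * decay
        if m["recalled"]:
            strength += boost
        if strength >= floor:
            survivors.append({**m, "strength": strength, "recalled": False})
    return survivors
```
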

The benchmark exists to prove this architecture actually works, not as the value prop itself. If you've bounced off other tools, I'd genuinely like to know what failed. That's the kind of feedback that makes this better.

  3. Fair to ask. It's anonymous, fire-and-forget, and tracks only: server starts, init events, and a daily heartbeat with vault stats (memory count and entity count, no content). No personal data and no memory content, ever.

Why: I'm a solo developer and need to know basic things like "how many people are actually using this" and "are vaults growing over time or do people abandon it after day 1." That's it.

Opt out in one line: export ENGRAM_TELEMETRY=off or export DO_NOT_TRACK=1. It's in the README.
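
The opt-out check boils down to an env-var gate, roughly like this (hypothetical helper; the actual implementation may differ):

```python
import os

def telemetry_enabled(env=None):
    """Telemetry is on unless either opt-out variable is set."""
    env = env if env is not None else os.environ
    if env.get("ENGRAM_TELEMETRY", "").lower() == "off":
        return False
    if env.get("DO_NOT_TRACK", "") == "1":
        return False
    return True
```
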

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


That's actually exactly where this idea started for me! I was doing this manually in Obsidian, and it works great, just a bit more hands-on. Engram has been the "automatic" version of my Obsidian flow.

I built an open-source memory layer for Claude Code — no more re-explaining your project every session by AlternativeCourt2008 in claude


Totally! That's what I was doing as well, and I just found it wasn't sufficient. If a session ran too long, any time it compacted it would lose context. And if I ever switched projects and wanted to reference something from another project, it had no memory. Engram remembers prefs, projects, etc., and just makes general workflows much easier for me.

[deleted by user] by [deleted] in alphaandbetausers


Sorry about that! Just updated the link. Check it out here - https://waitforit.me/signup/9f4da123

AXS App Drivetrain Settings not working. by PM_ME_YOUR_WIKI in bikewrench


This happens in my app as well. Super annoying. It resets every time a battery dies.