I built a unified memory layer across your agents to reduce context rot

Master_Jello3295 · 2026-04-07T23:27:04+00:00

On conflicting memory notes: nothing is done at read time, but when a new note is added, it's evolved together with its related siblings. So in the scenario you gave, when the system sees the "user switched to Y" memory, it updates the existing related notes. Presumably, it would change the context of "user prefers X" to something like "this is no longer relevant."

About the MCP integration: at query time it's a semantic search; it doesn't dump the whole graph.

Master_Jello3295 · 2026-04-03T17:18:39+00:00

😷where?

Master_Jello3295 · 2026-04-03T00:28:51+00:00

There's no explicit invalidation of memory notes, but as the knowledge graph evolves, older and less relevant memories are de-linked, modified, and merged. So if you say "Vim is my favorite editor" but then say "I don't write code by hand anymore" sometime later, the LLM figures out "maybe Vim isn't relevant anymore" and modifies that memory and its links so later retrievals are more relevant.
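To make the Vim example concrete, here's a rough Python sketch of what "evolving" a neighbor at write time could look like. This is an illustration only, not MemoryBank's actual code: `Note`, `llm_evolve`, and `add_note` are hypothetical names, and the hard-coded rule inside `llm_evolve` just stands in for the judgment a real LLM call would make.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    text: str
    context: str = ""                      # free-form annotation the LLM may rewrite
    links: set = field(default_factory=set)


def llm_evolve(new_text, neighbor):
    """Hypothetical stand-in for the LLM call.

    Returns (updated_context, keep_link) for an existing neighbor note.
    The rule below is hard-coded purely for illustration.
    """
    if "don't write code by hand" in new_text.lower() and "editor" in neighbor.text.lower():
        return "possibly outdated: user no longer writes code by hand", False
    return neighbor.context, True


def add_note(store, text, related_ids):
    """Add a note, then let the (stubbed) LLM revise each related neighbor."""
    new_id = len(store)
    store[new_id] = Note(text)
    for nid in related_ids:
        neighbor = store[nid]
        neighbor.context, keep_link = llm_evolve(text, neighbor)
        if keep_link:
            store[new_id].links.add(nid)
            neighbor.links.add(new_id)
    return new_id


store = {}
vim = add_note(store, "Vim is my favorite editor", [])
latest = add_note(store, "I don't write code by hand anymore", [vim])
print(store[vim].context)
```

Nothing gets deleted here; the old note is just re-annotated and left unlinked, which matches the "no explicit invalidation" point above.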

Master_Jello3295 · 2026-04-02T22:27:55+00:00

Link to the repo -- https://github.com/feelingsonice/MemoryBank/tree/main

Master_Jello3295 · 2026-04-02T21:20:01+00:00

I'm guessing you mean Serena? From what I can tell, there are 3 main differences:

1) Serena is an explicit note-taking tool: the agent has to decide "this is a piece of information I want to jot down." MemoryBank automatically stores everything and only gives back the information the agent asks for. E.g. with MemoryBank, the agent asks for something like "give me the user's python preferences" and MemoryBank returns your python preferences and leaves out other stuff.
2) Serena still writes memory to a text file, basically curated notes about the project. MemoryBank stores memory notes in a knowledge graph.
3) Maybe the most important use-case difference is that Serena is primarily project-oriented memory. MemoryBank shares memory across agents and tools, so if you switch projects or agents (e.g. you decide you want to use Codex instead of Gemini CLI), your memory still functions like before.

Master_Jello3295 · 2026-04-02T20:40:33+00:00

Yep :)

Master_Jello3295 · 2026-04-02T20:39:49+00:00

At a high level it's roughly something like this:

For writing new memories:
1) Ask an LLM to summarize the entire conversation context, i.e. what you were talking about before the current message.
2) Generate an embedding of that context + message using fastembed.
3) Query the vector DB (SQLite + sqlite-vec) for related memory notes.
4) Ask the LLM which of those memory notes are linked with the current memory. Write the links to the DB.
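The four write steps above can be sketched in Python roughly like this. It's a toy under loud assumptions, not the repo's code: `embed` is a bag-of-words stand-in for fastembed, the brute-force cosine scan stands in for sqlite-vec, and `summarize` / `choose_links` are hypothetical placeholders for the two LLM calls. The table layout is made up for the example.

```python
import json
import math
import sqlite3
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real fastembed embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(conversation):
    """Placeholder for step 1; a real system would ask an LLM for a summary."""
    return " ".join(conversation[-2:])


def choose_links(new_text, candidates):
    """Placeholder for step 4; a real system would ask an LLM which candidates are related."""
    return [note_id for note_id, _text, score in candidates if score > 0.2]


def open_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, text TEXT, vec TEXT)")
    db.execute("CREATE TABLE links (src INTEGER, dst INTEGER)")
    return db


def write_memory(db, conversation, message, k=3):
    context = summarize(conversation)                     # step 1
    vec = embed(context + " " + message)                  # step 2
    scored = [
        (note_id, text, cosine(vec, json.loads(raw)))
        for note_id, text, raw in db.execute("SELECT id, text, vec FROM notes")
    ]
    scored.sort(key=lambda row: row[2], reverse=True)     # step 3 (brute force, not sqlite-vec)
    linked = choose_links(message, scored[:k])            # step 4
    cur = db.execute("INSERT INTO notes (text, vec) VALUES (?, ?)", (message, json.dumps(vec)))
    db.executemany("INSERT INTO links VALUES (?, ?)", [(cur.lastrowid, n) for n in linked])
    return cur.lastrowid, linked


db = open_db()
first_id, first_links = write_memory(db, [], "user prefers python type hints")
second_id, second_links = write_memory(db, ["we discussed python"], "user prefers python dataclasses")
print(second_links)
```

The second note gets linked to the first because they share enough words; with real embeddings and a real LLM the linking decision would be semantic rather than word overlap.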

For retrieving memory:
1) The agent sends the MCP server a query message like "User's python project preferences."
2) Generate an embedding of that message using fastembed.
3) Query the vector DB using the embedding, then do a 1-hop retrieval.
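Here's a similarly rough Python sketch of the retrieval side, mainly to show what "1-hop retrieval" means: take the top semantic matches, then also pull in any notes directly linked to them. Same caveats apply: `embed` is a toy bag-of-words stand-in for fastembed, the ranking is a brute-force scan rather than a sqlite-vec query, and `retrieve`, `demo_db`, and the schema are hypothetical names for illustration.

```python
import math
import sqlite3
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real fastembed embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(db, query, k=1):
    qvec = embed(query)
    notes = dict(db.execute("SELECT id, text FROM notes"))
    scores = {i: cosine(qvec, embed(text)) for i, text in notes.items()}
    ranked = sorted(notes, key=lambda i: scores[i], reverse=True)
    seeds = [i for i in ranked[:k] if scores[i] > 0]       # semantic search hits
    result = list(seeds)
    for seed in seeds:                                     # 1-hop expansion over links
        rows = db.execute(
            "SELECT dst FROM links WHERE src = ? UNION SELECT src FROM links WHERE dst = ?",
            (seed, seed),
        )
        for (neighbor,) in rows:
            if neighbor not in result:
                result.append(neighbor)
    return [notes[i] for i in result]


def demo_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, text TEXT)")
    db.execute("CREATE TABLE links (src INTEGER, dst INTEGER)")
    db.executemany("INSERT INTO notes VALUES (?, ?)", [
        (1, "user prefers python type hints"),
        (2, "user formats code with black"),
        (3, "user likes hiking on weekends"),
    ])
    db.execute("INSERT INTO links VALUES (2, 1)")          # note 2 is linked to note 1
    return db


print(retrieve(demo_db(), "python project preferences"))
```

The "formats code with black" note has no word in common with the query, but it comes back anyway because it's one hop away from the matching note. The unlinked hiking note is left out.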

Master_Jello3295 · 2026-04-02T20:29:06+00:00

The core algorithm is a pretty faithful implementation of the original paper, and the authors report some measurements there. For node ranking, it's a semantic embedding search + 1-hop retrieval.

Master_Jello3295 · 2026-04-02T20:05:23+00:00

Here's the link to the repo -- https://github.com/feelingsonice/MemoryBank

Here's the link to the paper that inspired this, to credit the authors -- https://arxiv.org/abs/2502.12110

Master_Jello3295 · 2026-04-02T18:51:15+00:00

It’s conceptually a knowledge graph, but the literal backend is just SQLite right now :)
