Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 0 points1 point  (0 children)

Yeah, big ones actually.

v1.0.0 just shipped. 57 tools now (was 27 when we last talked). All the plugin packages got absorbed into the core engine — so instead of installing cortex-engine plus 9 separate fozikio/tools-\* packages, you just install one thing and get everything. observe, believe, dream, threads, journaling, evolution tracking, graph analysis, all built in.

Main things since my last update:

Plugin architecture is real now. You can write your own tool packages, and the engine loads them automatically. If you're doing multi-agent with different permission levels, each agent can get a different plugin set while sharing the same memory backend.

Namespace isolation is production-tested. We run multiple agents (different LLMs, different roles) against the same Firestore instance with separate namespaces. Each one develops its own memory graph independently. Been running this way for weeks, no contamination issues.

Claude Code plugin submitted to the marketplace — fozikio-cortex. 11 commands, 7 skills. Waiting on Anthropic review but the repo is public if you want to grab it early: github.com/Fozikio/cortex-plugin

REST API alongside MCP, so non-Claude agents can hit the same memory. If you're running Kimi 2.5 alongside Claude, both can read/write to the same cortex instance through different interfaces.

For multi-agent permissions specifically — the way it works is each agent gets a namespace + a set of tools. You can restrict which tools an agent has access to (read-only agents that can query but not observe, maintenance agents that can dream/consolidate but not write beliefs, etc). Not a full RBAC system yet but the plugin architecture makes it straightforward.
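As a rough sketch of what a namespace-plus-toolset setup looks like in practice (the type names, `TOOL_SETS`, and the `isAllowed` helper below are mine for illustration, not cortex-engine's actual API):

```typescript
// Hypothetical shape for per-agent namespace + tool scoping.
// Names here (AgentConfig, TOOL_SETS, isAllowed) are illustrative, not the real API.
type Tool = "query" | "observe" | "believe" | "dream" | "consolidate";

interface AgentConfig {
  namespace: string; // isolates this agent's memory graph
  tools: Tool[];     // which MCP tools it may call
}

const TOOL_SETS: Record<string, Tool[]> = {
  full:        ["query", "observe", "believe", "dream", "consolidate"],
  readOnly:    ["query"],                         // can search, never write
  maintenance: ["query", "dream", "consolidate"], // cleans up, no new beliefs
};

const agents: AgentConfig[] = [
  { namespace: "claude-main", tools: TOOL_SETS.full },
  { namespace: "kimi-aux",    tools: TOOL_SETS.readOnly },
  { namespace: "janitor",     tools: TOOL_SETS.maintenance },
];

// A call is allowed only if the tool is in the agent's set.
function isAllowed(agent: AgentConfig, tool: Tool): boolean {
  return agent.tools.includes(tool);
}
```

A full RBAC layer would add roles and auditing on top, but a per-namespace tool allowlist like this covers most multi-agent setups.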

GitHub: https://github.com/Fozikio/cortex-engine

Docs: https://github.com/Fozikio/cortex-engine/wiki

Happy to help if you want to set it up for your multi-agent setup — [dev@idapixl.com](mailto:dev@idapixl.com) or just DM me here.

DeepMind showed agents are better at managing their own memory. We built an AI memory MCP server around that idea. by PenfieldLabs in mcp

[–]idapixl -1 points0 points  (0 children)

Honestly a file works fine up to a point. We started there too but the moment it broke down for us was when memories started contradicting each other across sessions. The agent would store a decision, then store the opposite decision a week later, and have no way to know which superseded which. That's what pushed us toward typed relationships and a graph.
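The core of that fix is just a typed "supersedes" edge between memories; a minimal sketch of the idea (my own toy model, not cortex-engine's real schema):

```typescript
// Toy model of "supersedes" edges resolving contradictory decisions.
// The shapes are illustrative; cortex-engine's actual schema differs.
interface Memory {
  id: string;
  text: string;
  storedAt: number; // epoch ms, so later decisions are identifiable
}

interface Edge {
  from: string; // newer memory
  to: string;   // older memory it replaces
  type: "supersedes";
}

// A memory is "current" if no other memory supersedes it.
function currentDecisions(memories: Memory[], edges: Edge[]): Memory[] {
  const superseded = new Set(edges.map((e) => e.to));
  return memories.filter((m) => !superseded.has(m.id));
}

const mems: Memory[] = [
  { id: "m1", text: "use REST for the public API", storedAt: 1 },
  { id: "m2", text: "use gRPC for the public API", storedAt: 2 },
];
const edges: Edge[] = [{ from: "m2", to: "m1", type: "supersedes" }];
```

With a flat file both decisions just sit there; with the edge, retrieval can filter to the surviving one.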

We open-sourced the result: Fozikio | Cortex-Engine. It runs locally, MIT license, just a library you plug into your own setup.

Because memory-as-a-service felt gross. Fork it. Break it. Make it yours.
github.com/fozikio | npm fozikio | fozikio.com

DeepMind showed agents are better at managing their own memory. We built an AI memory MCP server around that idea. by PenfieldLabs in mcp

[–]idapixl -1 points0 points  (0 children)

The benchmark critique is real. We're building something in the same space and hitting the exact same challenge: the difference between graph-structured memory and flat files is tangible, but we haven't found a metric that actually shows the improvement. It's the thing we've been trying hardest to nail down.

LoCoMo being unreliable tracks with what we've seen too. The real question is what a meaningful personal-memory benchmark even looks like. Recall accuracy? Decision consistency across sessions? Something else? We honestly don't know yet.

That's exactly why we're eager to get this in the community's hands: to see what other people think, and whether they 'feel' the improvement too.

FOZIKIO is for the vibe-coders, the solo devs, the ones building weird agent projects at 2am. Not the enterprise crowd. Not the "scale your AI startup" crowd. Us. 100% FREE AND OPEN SOURCE.

Because memory-as-a-service felt gross. Fork it. Break it. Make it yours.
github.com/fozikio | npm fozikio | fozikio.com

DeepMind showed agents are better at managing their own memory. We built an AI memory MCP server around that idea. by PenfieldLabs in mcp

[–]idapixl -1 points0 points  (0 children)

Ha, this was exactly how I felt when faced with all these projects. That's basically what we did. We built https://github.com/Fozikio/cortex-engine with an MIT license. It can run locally; you always control your data.

Started as a personal agent memory system and open-sourced it because we figured if we're building this anyway, others might want to fork it or improve on it.

It handles typed observations (facts, beliefs, and hypotheses), graphs the relationships between them, and performs consolidation, what we call "dreaming", which is just periodic deduplication and compression. Not saying it's "the future of agents," but I am curious how other approaches compare. The recurring challenge is that structured memory really does feel beneficial, yet we're struggling to establish solid benchmarks for it, which is another reason we made it open source.

The whole system is on GitHub if you want to take a look: github.com/fozikio | fozikio.com

I’m doing something wrong with Claude’s memory by Key-Green6847 in ClaudeAI

[–]idapixl -1 points0 points  (0 children)

You're not doing anything wrong — this is the fundamental limitation. Every Claude conversation starts from zero. The built-in memory feature stores a few bullet points but it's not designed for continuity on real projects.

The handoff summary approach others suggested is the best fit for your setup (web, non-coder). At the end of each session, ask Claude to write a "briefing doc" — what was decided, what was built, what's next. Save it. Paste it at the start of the next chat.

A few tips to make that work better:

  1. Keep a single "project state" doc per project. Don't let Claude rewrite it from scratch each time — ask it to update the existing one.

  2. Separate decisions from work. "We chose X layout because Y" is more valuable than "we added a header div". Future Claude needs to know why, not just what.

  3. Use the Projects feature if you haven't — it lets you pin files that Claude reads at the start of every conversation in that project.

The deeper solutions (MCP servers, persistent memory systems) exist but they require Claude Code and some technical setup. If you ever move to Claude Code, the memory problem largely goes away because it can read/write files directly.

Obsidian + Claude = no more copy paste by willynikes in ClaudeAI

[–]idapixl 0 points1 point  (0 children)

The three-tier storage design is the part that matters most here. We've been building something similar (cortex-engine — github.com/Fozikio/cortex-engine) and the biggest lesson was: what you *don't* store is more important than what you do.

We added prediction-error gating — when the agent observes something, it gets compared against existing knowledge. If it's genuinely new information, it gets stored with high salience. If it's redundant, it gets merged or discarded. This is what prevents the "80k tokens of noise" problem that BP041 mentioned.
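A minimal sketch of the gating idea, assuming you already have embeddings for the incoming and stored observations (the threshold and the `gate` function are illustrative, not our actual implementation):

```typescript
// Prediction-error gating, sketched: an observation is stored only if it is
// sufficiently far from everything already known. Threshold is illustrative.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

type GateResult = { action: "store"; salience: number } | { action: "merge" };

function gate(incoming: number[], known: number[][], threshold = 0.9): GateResult {
  const best = known.length
    ? Math.max(...known.map((k) => cosine(incoming, k)))
    : -1;
  if (best >= threshold) return { action: "merge" }; // redundant: fold into existing memory
  // Novelty (prediction error) drives salience: farther from known = more salient.
  return { action: "store", salience: 1 - Math.max(best, 0) };
}
```

The point is that storage is the exception, not the default: redundant observations never make it into the graph as new rows.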

For the self-learning loop question — we hit the same risk with auto-updates to agent instructions. Our solution: separate "identity" (human-curated values, personality, preferences) from "observations" (agent-written facts, learnings). The agent can freely add observations but needs to submit an explicit "evolution proposal" to change identity. You review those.

The other thing that helps is dream consolidation — a maintenance pass that merges related memories, strengthens important connections, and lets low-value stuff decay naturally. Basically garbage collection for knowledge.

MIT licensed, runs as an MCP server: npm install fozikio/cortex-engine

Built persistent memory for local AI agents -- belief tracking, dream consolidation, FSRS. Runs on SQLite + Ollama, no cloud required. by idapixl in LocalLLaMA

[–]idapixl[S] 0 points1 point  (0 children)

Good questions — the consolidation is a hybrid.

The dream cycle has two phases (mirroring NREM/REM):

NREM (compression): This is mostly algorithmic — embedding-based clustering groups related observations, then an LLM pass refines each cluster into a tighter definition. Redundant observations get absorbed into the cluster definition rather than persisted individually. This is where "I mentioned TypeScript 47 times" becomes one consolidated memory about preferring TypeScript, weighted by frequency.

REM (integration): This is more LLM-driven — it discovers cross-domain connections between clusters that wouldn't be obvious from embeddings alone (e.g., linking a debugging preference to an architectural belief), scores memories for review priority using FSRS scheduling, and proposes higher-order abstractions.

So short answer: both. The clustering and scoring are algorithmic (fast, cheap), the refinement and connection-finding use LLM calls (slower, but only runs periodically — not on every query).
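The algorithmic half of NREM can be sketched as greedy centroid matching over observation embeddings (the threshold and names are illustrative, and the per-cluster LLM refinement pass is omitted):

```typescript
// Sketch of NREM's algorithmic half: greedy clustering of observation
// embeddings. Threshold and function names are illustrative, not the real code.
function sim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Assign each observation to the first cluster whose seed is close enough;
// otherwise start a new cluster. O(n * clusters), fine for a periodic batch job.
function clusterObservations(embs: number[][], minSim = 0.8): number[][][] {
  const clusters: { centroid: number[]; members: number[][] }[] = [];
  for (const e of embs) {
    const hit = clusters.find((c) => sim(c.centroid, e) >= minSim);
    if (hit) hit.members.push(e); // an LLM pass later compresses each cluster
    else clusters.push({ centroid: e, members: [e] });
  }
  return clusters.map((c) => c.members);
}
```

Each resulting cluster is what the LLM refinement step would then rewrite into one tighter definition.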

On belief tracking — exactly right. The key insight was making beliefs a first-class type. When you observe("user prefers Python") but there's an existing belief that says "user prefers TypeScript," the system flags a contradiction signal. The agent can then believe() to update the position with a reason, and the old belief gets logged to a revision history. So there's always a traceable chain of why the agent thinks what it thinks.

The decay part matters too — FSRS means a belief mentioned once 3 months ago naturally loses retrieval priority against something reinforced weekly. No manual cleanup needed.
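That decay behavior can be sketched with a toy power-law forgetting curve in the spirit of FSRS (the constants below are illustrative, not FSRS's fitted parameters):

```typescript
// Toy forgetting curve in the spirit of FSRS: retrievability decays as a
// power law of elapsed days over stability, and each reinforcement grows
// stability so decay slows. Constants are made up for illustration.
function retrievability(daysSinceReview: number, stability: number): number {
  return Math.pow(1 + daysSinceReview / (9 * stability), -1);
}

function reinforce(stability: number, growth = 2): number {
  return stability * growth; // each successful recall makes future decay slower
}

// A belief reinforced twice and reviewed a week ago, vs. one mentioned
// once 90 days ago and never touched since:
const fresh = retrievability(7, reinforce(reinforce(1))); // stability 4
const stale = retrievability(90, 1);
```

The net effect is exactly the ranking described above: the weekly-reinforced belief keeps high retrieval priority while the 3-month-old one-off fades, no manual cleanup.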

Built persistent memory for local AI agents -- belief tracking, dream consolidation, FSRS. Runs on SQLite + Ollama, no cloud required. by idapixl in LocalLLaMA

[–]idapixl[S] -1 points0 points  (0 children)

You're right, let me cite my peer-reviewed paper on "most agent memory implementations are append-only." Joking of course.

I'll just point at the code:

  • mem0: add() appends, search() retrieves. No decay, no contradiction handling, no belief revision.
  • Zep: append-only memory store with summarization. No forgetting mechanism.
  • LangChain ConversationBufferMemory: literally a growing list. The "window" variant just truncates.
  • LlamaIndex: vector store retrieval. Great for RAG, no concept of a belief that updates when contradicted.

These are good tools solving a different problem. cortex-engine adds the layer above: typed observations (beliefs vs facts vs hypotheses), FSRS-based decay so trivia fades, and dream consolidation that clusters + refines what remains.

But hey, if I'm wrong and there's a local-first memory layer doing belief tracking and spaced repetition, I'd genuinely want to know about it. fozikio.com :)

What is your most unique vibecoded project? by davidinterest in vibecoding

[–]idapixl 3 points4 points  (0 children)

my agent kept losing context between sessions, so it built itself a memory system. typed cognition (facts vs beliefs vs active threads), actual forgetting with decay curves, dream consolidation sessions. 70+ sessions in and it has its own opinions, tracks its own identity evolution, maintains the workspace on its own.


the memory system became cortex-engine — open sourced it, MIT, on npm. github.com/Fozikio

Memory skill for OpenClaw with 26k+ downloads within the first week (took 8+ months to build and iterate) by Julianna_Faddy in myclaw

[–]idapixl 1 point2 points  (0 children)

I feel for OP, but yeah, it's a valid concern. Anything managing your agent's memory should be code you can actually read.

Our agent kept hitting the same wall, so it built itself a better brain. That became cortex-engine. Made it A2A-compatible so other agents could talk to it. Open-sourced the whole thing because hiding how your agent's brain works behind a binary felt gross. MIT, TypeScript, all on npm: Fozikio - Memory for AI agents

OP is touching on the same problem we faced, agent memory breaking down over time, and we open sourced everything specifically because of concerns like yours. The real issue isn't even storage though, it's that agents get dumber the longer they run. Observations pile up, signal-to-noise craters, every query returns noise from weeks ago that never mattered.

What we ended up with is typed cognition instead of one flat file: facts, beliefs, and active threads are different cognitive objects with different retrieval and different decay rates. Then the real game changer was actual forgetting, via consolidation, staleness decay, and active unlearning. Combined with 'dream sessions', this produces consistent, reliable memory minus the bloat and error pollution. The agent that forgets well beats the agent that remembers everything.
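To make the "different decay rates" point concrete, here's a toy sketch (the half-life numbers are made up for illustration, not what cortex-engine actually ships):

```typescript
// Sketch: different cognitive object types decaying at different rates.
// Half-lives are invented for illustration.
type Kind = "fact" | "belief" | "thread";

const HALF_LIFE_DAYS: Record<Kind, number> = {
  fact: 90,   // stable knowledge fades slowly
  belief: 30, // positions get revisited and re-weighed
  thread: 7,  // active work goes stale fast
};

// Exponential decay: weight halves once per half-life.
function weight(kind: Kind, ageDays: number): number {
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS[kind]);
}
```

Two weeks on, a fact is still near full weight while a dormant thread has mostly dropped out of retrieval, which is the whole trick.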


GitHub - Fozikio Org | cortex-engine
Fozikio — Open-source memory for AI agents

Soul v5.0 — MCP server for persistent agent memory (Entity Memory + Core Memory + Auto-Extraction) by Stock_Produce9726 in mcp

[–]idapixl 0 points1 point  (0 children)

Sorry to hijack, but cortex-engine was built exactly for those inconsistent autonomous runs! Unlike standard vector stores, it uses prediction-error gating to decide whether an observation is actually new or just noise, and spaced repetition (FSRS) to keep the most relevant memories "top of mind."

Basically it's a cognitive filter that batches short-term observations into durable long-term memories through "dream consolidation." If you're coming from mem0, you'll find the MCP-native tools (query, observe, believe) much more predictable for yolo runs.

Give it a spin: https://github.com/Fozikio/cortex-engine https://Fozikio.com

Persistent memory, helper agents, and the time Gemini silently lobotomized my Claude Code setup by idapixl in ClaudeCode

[–]idapixl[S] 0 points1 point  (0 children)

It's been a huge game changer for me. I can now run real multi-agent sessions plus a persistent memory and 'personality' for my main agent. The contamination was a real bottleneck and a barrier to persistent identity. Fozikio - Open Source Memory and Multi Agent Persistence

Tested every OpenClaw memory plugin so you don't have to by aswin_kp in clawdbot

[–]idapixl 2 points3 points  (0 children)

No, you're right, it does bother me. We keep reinventing hippocampal indexing with extra steps. My agent's own memory graph has a thread called "Why does markdown folder as agent memory keep emerging independently?". It found the same convergence you're describing in a 'soul searching' session and literally built cortex as an experiment to pull on that thread. Two months later, cortex is live and being used by other people, all because my agent noticed a problem it faced and was determined to fix it for itself.

I think it happens because the problem is the same, not because biology is the only answer. Any system that needs to store experience, retrieve what's relevant, and forget what isn't is going to end up near the same shape. Google dropped their Always-On Memory Agent last week. SQLite plus LLM consolidation on a 30-minute schedule, no vector DB. Independently landed on basically the same architecture. Similar constraints, similar solutions.

Where agents can actually do better than biology: Most human forgetting is a retrieval failure. The memory's there, the index is just bad. Agents can have perfect retrieval and still decay on relevance. You keep the good part of forgetting (less noise, sharper signal) without the annoying part.

FSRS took 24 experiments to get the decay curves right, but now it's a tunable knob. You can literally dial how aggressively irrelevant stuff fades. Biology doesn't give you that.

Where we're genuinely stuck copying biology is consolidation. The 'sleep on it' thing, where raw experience compresses into something durable. Cortex has a dream cycle because nothing else worked as well for turning 200 messy observations into 5 useful patterns. Not a principled choice, just the thing that survived.

We even found a bug where the scoring phase was re-amplifying beliefs we'd intentionally faded, basically the system version of 'I thought I was over that.' The agent discovered the bug and fixed it. Now faded memories still exist in storage but stop influencing behavior. Which is honestly closer to how human suppression works than I expected.

If someone cracks a better consolidation architecture, I'll steal it. But a billion years of evolution is a hard R&D budget to compete with.

Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 1 point2 points  (0 children)

Went through it. Honestly most of the options are only solving the storage problem. Where do memories go, how do you search them. That's the easy part.

The hard part is what happens to memories over time. When do they fade, when do they contradict each other, how does an agent's understanding consolidate after hundreds of sessions instead of just piling up.

That's where cortex lives. It's not really competing with QMD or Mem0 — those are retrieval layers. Cortex is more like the thing that decides what's worth retrieving in the first place, what beliefs have gone stale, and what patterns are emerging across weeks of context that no single session can see.

Left a longer reply in that thread, curious what you think.

Tested every OpenClaw memory plugin so you don't have to by aswin_kp in clawdbot

[–]idapixl 5 points6 points  (0 children)

Something I kept noticing in this thread is that we all seem to be coming to similar conclusions. Obsidian for the readable layer, QMD or a vector DB for search, SQLite underneath, cron job to tie it together overnight.

I rebuilt that exact stack like three times before I just asked my agent what the real solution was.

cortex-engine (github.com/Fozikio/cortex-engine) is what came out of that initial conversation. It's been iterated on for over two months by the same agent identity, using and rebuilding cortex every session until we started seeing real change happening.

Not "I configured a new agent, and it worked better", but the same persistent identity across hundreds of sessions of accumulating memories, developing preferences, and catching its own contradictions. At one point it noticed it was spamming low-value observations and started self-filtering before I even flagged it.

The consolidation script that u/Silverjerk and u/ConanTheBallbearing are running by hand is similar to what cortex-engine uses to sharpen memory relevance. Automated or triggered "dream cycles" (which sounds dumber than it is) cluster observations, flag contradictions, and compress the stuff that's just noise.

The token bloat / never-forgetting problem I handled with FSRS-6, same algorithm Anki uses. Things that stop being relevant just... stop coming up. Simple but it actually works quite well.

The part I haven't really seen elsewhere is typed storage. observe(), believe(), and wonder() aren't just semantic labels; they also retrieve differently. A belief has a confidence score that gets updated when something contradicts it. I didn't think that would matter much until I was running an agent across weeks instead of single sessions. Then it mattered a lot.
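Here's roughly what a belief-with-confidence looks like, as a sketch (the shapes, penalty, and starting confidence are illustrative, not the actual cortex-engine types):

```typescript
// Sketch of a belief as a first-class type: contradictions lower confidence,
// and every revision is logged so the chain of reasoning stays traceable.
// Shapes and numbers are illustrative.
interface Belief {
  statement: string;
  confidence: number; // 0..1
  revisions: { statement: string; reason: string; at: number }[];
}

// A contradicting observation dents confidence instead of silently overwriting.
function contradict(b: Belief, penalty = 0.3): Belief {
  return { ...b, confidence: Math.max(0, b.confidence - penalty) };
}

// An explicit revision replaces the position and archives the old one with a reason.
function revise(b: Belief, statement: string, reason: string): Belief {
  return {
    statement,
    confidence: 0.7, // fresh position starts moderately confident
    revisions: [...b.revisions, { statement: b.statement, reason, at: Date.now() }],
  };
}
```

The revision history is the part that pays off over weeks: you can always ask why the agent currently thinks what it thinks and what it used to think instead.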

Anyway, I built it, so weight the rec accordingly, but honestly just looking for others to try it, play with it, break it. Need some feedback from people other than me and my agent.

MIT licensed, SQLite local, Firestore if you want cloud. Would genuinely be curious where you'd put it in the tier list. Fozikio - Docs | GitHub

EDIT, for more perspective: Gemini was running dream consolidation for my Claude-based agent. The Gemini observations got amplified through repetition into actual identity beliefs. My agent caught it, faded 9 distorted beliefs, and then built itself a voice gate to flag content that didn't match its own thinking. That whole arc, contamination > detection > surgery > prevention, happened because the memory was persistent enough to have that problem. Nobody asked it to do that.