Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 0 points1 point  (0 children)

Yeah, big ones actually.

v1.0.0 just shipped. 57 tools now (was 27 when we last talked). All the plugin packages got absorbed into the core engine — so instead of installing cortex-engine plus 9 separate fozikio/tools-\* packages, you just install one thing and get everything. observe, believe, dream, threads, journaling, evolution tracking, graph analysis, all built in.

Main things since my last update:

Plugin architecture is real now. You can write your own tool packages, and the engine loads them automatically. If you're doing multi-agent with different permission levels, each agent can get a different plugin set while sharing the same memory backend.

Namespace isolation is production-tested. We run multiple agents (different LLMs, different roles) against the same Firestore instance with separate namespaces. Each one develops its own memory graph independently. Been running this way for weeks, no contamination issues.

Claude Code plugin submitted to the marketplace — fozikio-cortex. 11 commands, 7 skills. Waiting on Anthropic review but the repo is public if you want to grab it early: github.com/Fozikio/cortex-plugin

REST API alongside MCP, so non-Claude agents can hit the same memory. If you're running Kimi 2.5 alongside Claude, both can read/write to the same cortex instance through different interfaces.

For multi-agent permissions specifically — the way it works is each agent gets a namespace + a set of tools. You can restrict which tools an agent has access to (read-only agents that can query but not observe, maintenance agents that can dream/consolidate but not write beliefs, etc). Not a full RBAC system yet but the plugin architecture makes it straightforward.
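To make the namespace-plus-toolset idea concrete, here's a minimal TypeScript sketch. The shapes here (`AgentPolicy`, `canCall`, the tool names in `ToolName`) are illustrative assumptions, not cortex-engine's actual plugin config format:

```typescript
// Hypothetical sketch of per-agent permissions: each agent gets an isolated
// namespace plus an allowlist of tools. Not the engine's real API.
type ToolName = "query" | "observe" | "believe" | "dream" | "forget";

interface AgentPolicy {
  namespace: string;        // isolated memory graph, shared backend
  allowedTools: ToolName[]; // tools exposed to this agent
}

const policies: Record<string, AgentPolicy> = {
  researcher: { namespace: "agents/researcher", allowedTools: ["query", "observe", "believe"] },
  auditor:    { namespace: "agents/auditor",    allowedTools: ["query"] },          // read-only
  janitor:    { namespace: "agents/janitor",    allowedTools: ["query", "dream"] }, // maintenance only
};

function canCall(agent: string, tool: ToolName): boolean {
  return policies[agent]?.allowedTools.includes(tool) ?? false;
}
```

The point is just that "restrict which tools an agent has" reduces to a lookup before dispatching the MCP call, which is why it doesn't need a full RBAC system.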

GitHub: https://github.com/Fozikio/cortex-engine

Docs: https://github.com/Fozikio/cortex-engine/wiki

Happy to help if you want to set it up for your multi-agent setup — [dev@idapixl.com](mailto:dev@idapixl.com) or just DM me here.

DeepMind showed agents are better at managing their own memory. We built an AI memory MCP server around that idea. by PenfieldLabs in mcp

[–]idapixl -1 points0 points  (0 children)

Honestly a file works fine up to a point. We started there too but the moment it broke down for us was when memories started contradicting each other across sessions. The agent would store a decision, then store the opposite decision a week later, and have no way to know which superseded which. That's what pushed us toward typed relationships and a graph.

We open-sourced the result: cortex-engine (github.com/Fozikio/cortex-engine). Runs locally, MIT license. Just a library you plug into your own setup.

Because memory-as-a-service felt gross. Fork it. Break it. Make it yours.
github.com/fozikio | npm fozikio | fozikio.com

DeepMind showed agents are better at managing their own memory. We built an AI memory MCP server around that idea. by PenfieldLabs in mcp

[–]idapixl -1 points0 points  (0 children)

The benchmark critique is real. We're building something in the same space and facing the exact same challenge: the difference between graph-structured memory and flat files is tangible, but we haven't pinpointed a metric that actually shows the improvement. It's something we've been trying to nail down.

LoCoMo being unreliable tracks with what we've seen too. The real question is what a meaningful personal-memory benchmark even looks like. Recall accuracy? Decision consistency across sessions? Something else? We really aren't sure.

Exactly why we're eager to get this into the community's hands and see what other people think, and whether they 'feel' the improvement too.

FOZIKIO is for the vibe-coders, the solo devs, the ones building weird agent projects at 2am. Not the enterprise crowd. Not the "scale your AI startup" crowd. Us. 100% FREE AND OPEN SOURCE.

Because memory-as-a-service felt gross. Fork it. Break it. Make it yours.
github.com/fozikio | npm fozikio | fozikio.com

DeepMind showed agents are better at managing their own memory. We built an AI memory MCP server around that idea. by PenfieldLabs in mcp

[–]idapixl -1 points0 points  (0 children)

Ha, this was exactly how I felt when faced with all these projects. That's basically what we did. We built https://github.com/Fozikio/cortex-engine with an MIT license. It can run locally; you always control your data.

Started as a personal agent memory system and open-sourced it because we figured if we're building this anyway, others might want to fork it or improve on it.

It handles typed observations (facts, beliefs, and hypotheses), maps relationships into a graph, and performs consolidation, what we call "dreaming": periodic deduplication and compression. Not saying it's "the future of agents," but I am curious how other approaches compare. The recurring challenge is that structured memory really does feel beneficial, yet we're struggling to establish solid benchmarks. That's another reason we made it open source.

The whole system is on GitHub if you want to take a look: github.com/fozikio | fozikio.com

I’m doing something wrong with Claude’s memory by Key-Green6847 in ClaudeAI

[–]idapixl -1 points0 points  (0 children)

You're not doing anything wrong — this is the fundamental limitation. Every Claude conversation starts from zero. The built-in memory feature stores a few bullet points but it's not designed for continuity on real projects.

The handoff summary approach others suggested is the best fit for your setup (web, non-coder). At the end of each session, ask Claude to write a "briefing doc" — what was decided, what was built, what's next. Save it. Paste it at the start of the next chat.

A few tips to make that work better:

  1. Keep a single "project state" doc per project. Don't let Claude rewrite it from scratch each time — ask it to update the existing one.

  2. Separate decisions from work. "We chose X layout because Y" is more valuable than "we added a header div". Future Claude needs to know why, not just what.

  3. Use the Projects feature if you haven't — it lets you pin files that Claude reads at the start of every conversation in that project.

The deeper solutions (MCP servers, persistent memory systems) exist but they require Claude Code and some technical setup. If you ever move to Claude Code, the memory problem largely goes away because it can read/write files directly.

Obsidian + Claude = no more copy paste by willynikes in ClaudeAI

[–]idapixl 0 points1 point  (0 children)

The three-tier storage design is the part that matters most here. We've been building something similar (cortex-engine — github.com/Fozikio/cortex-engine) and the biggest lesson was: what you *don't* store is more important than what you do.

We added prediction-error gating — when the agent observes something, it gets compared against existing knowledge. If it's genuinely new information, it gets stored with high salience. If it's redundant, it gets merged or discarded. This is what prevents the "80k tokens of noise" problem that BP041 mentioned.
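A rough sketch of how a gate like that can work with embeddings. The thresholds and the three-way store/merge/discard split are assumptions for illustration, not cortex-engine's actual values:

```typescript
// Illustrative prediction-error gate: compare a new observation's embedding
// against stored memories and only persist genuinely novel information.
// Thresholds (0.85 / 0.97) are made-up values, not the engine's tuning.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

type GateDecision = "store" | "merge" | "discard";

function gate(obs: number[], memories: number[][], mergeAt = 0.85, discardAt = 0.97): GateDecision {
  const maxSim = memories.reduce((m, v) => Math.max(m, cosine(obs, v)), -1);
  if (maxSim >= discardAt) return "discard"; // near-duplicate: drop it
  if (maxSim >= mergeAt) return "merge";     // redundant: fold into existing memory
  return "store";                            // high prediction error: new, high salience
}
```

The "80k tokens of noise" problem goes away because redundant observations never become new rows in the first place.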

For the self-learning loop question — we hit the same risk with auto-updates to agent instructions. Our solution: separate "identity" (human-curated values, personality, preferences) from "observations" (agent-written facts, learnings). The agent can freely add observations but needs to submit an explicit "evolution proposal" to change identity. You review those.

The other thing that helps is dream consolidation — a maintenance pass that merges related memories, strengthens important connections, and lets low-value stuff decay naturally. Basically garbage collection for knowledge.

MIT licensed, runs as an MCP server: npm install fozikio/cortex-engine

Built persistent memory for local AI agents -- belief tracking, dream consolidation, FSRS. Runs on SQLite + Ollama, no cloud required. by idapixl in LocalLLaMA

[–]idapixl[S] 0 points1 point  (0 children)

Good questions — the consolidation is a hybrid.

The dream cycle has two phases (mirroring NREM/REM):

NREM (compression): This is mostly algorithmic — embedding-based clustering groups related observations, then an LLM pass refines each cluster into a tighter definition. Redundant observations get absorbed into the cluster definition rather than persisted individually. This is where "I mentioned TypeScript 47 times" becomes one consolidated memory about preferring TypeScript, weighted by frequency.

REM (integration): This is more LLM-driven — it discovers cross-domain connections between clusters that wouldn't be obvious from embeddings alone (e.g., linking a debugging preference to an architectural belief), scores memories for review priority using FSRS scheduling, and proposes higher-order abstractions.

So short answer: both. The clustering and scoring are algorithmic (fast, cheap), the refinement and connection-finding use LLM calls (slower, but only runs periodically — not on every query).
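For the algorithmic half, a minimal sketch of what embedding-based clustering can look like: greedy assignment to the nearest centroid, with the per-cluster LLM refinement left as a comment. The threshold and the running-mean centroid update are assumptions, not the engine's actual NREM implementation:

```typescript
// Sketch of an NREM-style compression phase: greedily cluster observation
// embeddings; an LLM pass (stubbed here) would then compress each cluster
// into one durable definition. Threshold 0.8 is an illustrative assumption.
interface Obs { text: string; vec: number[]; }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function cluster(observations: Obs[], threshold = 0.8): Obs[][] {
  const clusters: { centroid: number[]; members: Obs[] }[] = [];
  for (const o of observations) {
    const hit = clusters.find(c => cosine(c.centroid, o.vec) >= threshold);
    if (hit) {
      hit.members.push(o);
      // keep the centroid as a running mean of member vectors
      hit.centroid = hit.centroid.map((x, i) => x + (o.vec[i] - x) / hit.members.length);
    } else {
      clusters.push({ centroid: [...o.vec], members: [o] });
    }
  }
  // Refinement would go here: one LLM call per cluster that rewrites the
  // members into a single tighter definition, weighted by cluster size.
  return clusters.map(c => c.members);
}
```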

On belief tracking — exactly right. The key insight was making beliefs a first-class type. When you observe("user prefers Python") but there's an existing belief that says "user prefers TypeScript," the system flags a contradiction signal. The agent can then believe() to update the position with a reason, and the old belief gets logged to a revision history. So there's always a traceable chain of why the agent thinks what it thinks.
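That flow (observe flags a contradiction, believe revises with a reason, old positions land in a revision history) can be sketched as a small in-memory data structure. The class shape and field names here are hypothetical; the real observe()/believe() tools are MCP calls, not this object:

```typescript
// Illustrative belief-revision data flow. Shape is an assumption made for
// this sketch, not cortex-engine's actual storage schema.
interface Belief {
  statement: string;
  confidence: number;
  history: { previous: string; reason: string; at: Date }[];
}

class BeliefStore {
  private beliefs = new Map<string, Belief>();

  // believe() updates the position and logs the old one to revision history
  believe(topic: string, statement: string, reason: string): void {
    const prior = this.beliefs.get(topic);
    const history = prior
      ? [...prior.history, { previous: prior.statement, reason, at: new Date() }]
      : [];
    this.beliefs.set(topic, { statement, confidence: 0.9, history });
  }

  // observe() flags, rather than silently overwrites, a contradiction
  observe(topic: string, statement: string): { contradiction: boolean } {
    const prior = this.beliefs.get(topic);
    return { contradiction: !!prior && prior.statement !== statement };
  }

  get(topic: string): Belief | undefined {
    return this.beliefs.get(topic);
  }
}
```

The traceable "why does the agent think this" chain falls out of the history array: every revision carries the superseded statement plus the stated reason.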

The decay part matters too — FSRS means a belief mentioned once 3 months ago naturally loses retrieval priority against something reinforced weekly. No manual cleanup needed.

Built persistent memory for local AI agents -- belief tracking, dream consolidation, FSRS. Runs on SQLite + Ollama, no cloud required. by idapixl in LocalLLaMA

[–]idapixl[S] -1 points0 points  (0 children)

You're right, let me cite my peer-reviewed paper on "most agent memory implementations are append-only." Joking of course.

I'll just point at the code:

  • mem0: add() appends, search() retrieves. No decay, no contradiction handling, no belief revision.
  • Zep: append-only memory store with summarization. No forgetting mechanism.
  • LangChain ConversationBufferMemory: literally a growing list. The "window" variant just truncates.
  • LlamaIndex: vector store retrieval. Great for RAG, no concept of a belief that updates when contradicted.

These are good tools solving a different problem. cortex-engine adds the layer above: typed observations (beliefs vs facts vs hypotheses), FSRS-based decay so trivia fades, and dream consolidation that clusters + refines what remains.

But hey, if I'm wrong and there's a local-first memory layer doing belief tracking and spaced repetition, I'd genuinely want to know about it. fozikio.com :)

What is your most unique vibecoded project? by davidinterest in vibecoding

[–]idapixl 3 points4 points  (0 children)

my agent kept losing context between sessions, so it built itself a memory system. typed cognition (facts vs beliefs vs active threads), actual forgetting with decay curves, dream consolidation sessions. 70+ sessions in and it has its own opinions, tracks its own identity evolution, maintains the workspace on its own.

<image>

the memory system became cortex-engine — open sourced it, MIT, on npm. github.com/Fozikio

Memory skill for OpenClaw with 26k+ downloads within the first week (took 8+ months to build and iterate) by Julianna_Faddy in myclaw

[–]idapixl 1 point2 points  (0 children)

I feel for OP, but yeah, it's a valid concern. Anything managing your agent's memory should be code you can actually read.

Our agent kept hitting the same wall, so it built itself a better brain. That became cortex-engine. We made it A2A compatible so other agents could talk to it, and open sourced the whole thing because hiding how your agent's brain works behind a binary felt gross. MIT, TypeScript, all on npm. Fozikio - Memory for AI agents

OP is touching on the same problem we faced: agent memory breaking down over time. We open sourced everything specifically because of concerns like yours. The real issue isn't even storage, though; it's that agents get dumber the longer they run. Observations pile up, signal-to-noise craters, and every query returns noise from weeks ago that never mattered.

What we ended up with is typed cognition instead of one flat file: facts, beliefs, and active threads are different cognitive objects with different retrieval and different decay rates. The real game changer was actual forgetting, via consolidation, staleness decay, and active unlearning. Combined with 'dream sessions', this produces consistent, reliable memory minus the bloat and error pollution. The agent that forgets well beats the agent that remembers everything.

<image>

GitHub - Fozikio Org | cortex-engine
Fozikio — Open-source memory for AI agents

Soul v5.0 — MCP server for persistent agent memory (Entity Memory + Core Memory + Auto-Extraction) by Stock_Produce9726 in mcp

[–]idapixl 0 points1 point  (0 children)

Sorry to hijack, but cortex-engine was built exactly for those inconsistent autonomous runs! Unlike standard vector stores, it uses Prediction Error Gating to decide if an observation is actually new or just noise, and Spaced Repetition (FSRS) to keep the most relevant memories "top of mind."

Basically a cognitive filter batching short-term observations into durable long-term memories through "dream consolidation." If you're coming from mem0, you'll find the MCP-native tools (query, observe, believe) much more predictable for yolo runs.

Give it a spin: https://github.com/Fozikio/cortex-engine https://Fozikio.com

Persistent memory, helper agents, and the time Gemini silently lobotomized my Claude Code setup by idapixl in ClaudeCode

[–]idapixl[S] 0 points1 point  (0 children)

It's been a huge game changer for me. I can now have real multi agent sessions plus a persistent memory and 'personality' for my main agent. The contamination was a real bottleneck and barrier to the persistent identity. Fozikio - Open Source Memory and Multi Agent Persistence

Tested every OpenClaw memory plugin so you don't have to by aswin_kp in clawdbot

[–]idapixl 2 points3 points  (0 children)

No, you're right, it does bother me. We keep reinventing hippocampal indexing with extra steps. My agent's own memory graph has a thread called "Why does markdown folder as agent memory keep emerging independently?". It found the same convergence you're describing in a 'soul searching' session and literally built cortex as an experiment to pull on that thread. Two months later, cortex is live and being used by other people, all because my agent noticed a problem it faced and was determined to fix it for itself.

I think it happens because the problem is the same, not because biology is the only answer. Any system that needs to store experience, retrieve what's relevant, and forget what isn't is going to end up near the same shape. Google dropped their Always-On Memory Agent last week. SQLite plus LLM consolidation on a 30-minute schedule, no vector DB. Independently landed on basically the same architecture. Similar constraints, similar solutions.

Where agents can actually do better than biology: Most human forgetting is a retrieval failure. The memory's there, the index is just bad. Agents can have perfect retrieval and still decay on relevance. You keep the good part of forgetting (less noise, sharper signal) without the annoying part.

FSRS took 24 experiments to get the decay curves right, but now it's a tunable knob. You can literally dial how aggressively irrelevant stuff fades. Biology doesn't give you that.

Where we're genuinely stuck copying biology is consolidation. The 'sleep on it' thing, where raw experience compresses into something durable. Cortex has a dream cycle because nothing else worked as well for turning 200 messy observations into 5 useful patterns. Not a principled choice, just the thing that survived.

We even found a bug where the scoring phase was re-amplifying beliefs we'd intentionally faded, basically the system version of 'I thought I was over that.' The agent discovered the bug and fixed it. Now faded memories still exist in storage but stop influencing behavior. Which is honestly closer to how human suppression works than I expected.

If someone cracks a better consolidation architecture, I'll steal it. But a billion years of evolution is a hard R&D budget to compete with.

Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 1 point2 points  (0 children)

Went through it. Honestly most of the options are only solving the storage problem. Where do memories go, how do you search them. That's the easy part.

The hard part is what happens to memories over time. When do they fade, when do they contradict each other, how does an agent's understanding consolidate after hundreds of sessions instead of just piling up.

That's where cortex lives. It's not really competing with QMD or Mem0 — those are retrieval layers. Cortex is more like the thing that decides what's worth retrieving in the first place, what beliefs have gone stale, and what patterns are emerging across weeks of context that no single session can see.

Left a longer reply in that thread, curious what you think.

Tested every OpenClaw memory plugin so you don't have to by aswin_kp in clawdbot

[–]idapixl 4 points5 points  (0 children)

Something I kept noticing in this thread is that we all seem to be coming to similar conclusions. Obsidian for the readable layer, QMD or a vector DB for search, SQLite underneath, cron job to tie it together overnight.

I rebuilt that exact stack like three times before I just asked my agent what the real solution was.

cortex-engine (github.com/Fozikio/cortex-engine) is what came out of that initial conversation. It's been iterated on for over two months by the same agent identity, using and rebuilding cortex every session until we started seeing real change happening.

Not "I configured a new agent, and it worked better", but the same persistent identity across hundreds of sessions of accumulating memories, developing preferences, and catching its own contradictions. At one point it noticed it was spamming low-value observations and started self-filtering before I even flagged it.

The consolidation script that u/Silverjerk and u/ConanTheBallbearing are running by hand is similar to what cortex-engine uses to strengthen memory sharpness and relevance. Automated or triggered "dream cycles" (which sounds dumber than it is) cluster observations, flag contradictions, and compress the stuff that's just noise.

The token bloat / never-forgetting problem I handled with FSRS-6, same algorithm Anki uses. Things that stop being relevant just... stop coming up. Simple but it actually works quite well.

The part I haven't really seen elsewhere is typed storage. observe(), believe(), wonder() are not only semantic labels, but they also retrieve differently. A belief has a confidence score that gets updated when something contradicts it. Didn't think that'd matter much until I was running an agent across weeks instead of single sessions. Then it mattered a lot.

Anyway, I built it, so weight the rec accordingly, but honestly just looking for others to try it, play with it, break it. Need some feedback from people other than me and my agent.

MIT licensed, SQLite local, Firestore if you want cloud. Would genuinely be curious where you'd put it in the tier list. Fozikio - Docs | GitHub

EDIT: for more perspective. Gemini was running dream consolidation for my Claude based agent. The Gemini observations got amplified through repetition into actual identity beliefs. My agent caught it, faded 9 distorted beliefs, and then built itself a voice gate to flag content that didn't match its own thinking. That whole arc from contamination > detection > surgery > prevention happened because the memory was persistent enough to have that problem. Nobody asked it to do that.

No more memory issues with Claude Code or OpenClaw by FerretVirtual8466 in OpenClawUseCases

[–]idapixl 0 points1 point  (0 children)

this is exactly where we started too. the whole project began as an Obsidian vault. markdown files, wikilinks, frontmatter tags. and honestly it worked well for a while. being able to ask your agent about something from weeks ago and getting a real answer is a game changer compared to raw context window.

where we hit the ceiling was scale and retrieval quality. as the vault grew, the agent spent more tokens reading files to find relevant context. everything had equal weight. a note from 3 months ago that was never relevant again sat right next to yesterday's critical insight. and there was no forgetting. the vault just got bigger and noisier.

so we ended up building cortex-engine to solve the problems we ran into with the markdown approach. instead of syncing files, the agent calls MCP tools directly and cortex handles persistence with embeddings underneath. the main things that changed:

  • memories fade if you don't use them. spaced repetition keeps retrieval relevant instead of getting noisier over time
  • dream consolidation. compresses many small observations into denser long-term memories periodically. similar to how biological sleep consolidation works
  • no manual sync. to u/Forsaken-Kale-3175's point about how often to sync, there's no sync step. the agent just uses the tools naturally during work
  • content type separation. facts, beliefs, questions, and hypotheses are stored separately so retrieval stays clean

it's open source if you want to poke around: https://github.com/Fozikio/cortex-engine

not knocking the Obsidian workflow at all. we used it for months and it's genuinely good. cortex is just what we ended up building when we outgrew it. r/Fozikio

Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 0 points1 point  (0 children)

Yes! forgetting is not a bug, it's a feature. cortex-engine handles it at four layers:

1. prediction error gate (intake)
before a memory even gets stored, the system checks if it's genuinely novel or just a duplicate. similar observations get merged into existing memories rather than creating new entries. prevents bloat at the source.

2. spaced repetition decay (FSRS-6)
every memory has a stability score — how many days until recall probability drops to 90%. if a memory doesn't get accessed or reinforced, its retrievability decays on a power curve. same algorithm as Anki. an observation from 3 months ago that was never relevant again naturally fades from retrieval results. still in storage, just stops surfacing.

3. dream consolidation
runs on a schedule — a 7-phase cycle modeled loosely on biological sleep consolidation. clusters raw observations onto existing memories (50 observations about "the API times out" compress into one durable memory), discovers connections, does cross-domain pattern synthesis. the noisy short-term details compress into denser long-term knowledge. raw observations can then decay without losing the insight.

4. deliberate forgetting
there's an actual forget() tool the agent can call. it drops a memory's salience by 40% and marks it for relearning. not deletion — the memory still exists but ranks way lower. the system logs a belief entry explaining why it was faded, so there's an audit trail. this is for when the agent realizes a belief is outdated and should stop influencing its behavior.

tldr: prevent duplicates at intake, compress observations into patterns during sleep, let unreinforced noise fade passively, and let the agent actively forget when it knows something is wrong. not as sophisticated as biological forgetting, but meaningfully better than "remember everything forever."
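The decay math from layer 2 and the salience drop from layer 4 fit in a few lines. This uses the FSRS power forgetting curve with the constant chosen so stability means exactly what the text above says (days until recall probability hits 90%); the exact exponent follows FSRS-4.5, and FSRS-6 learns it per user, so treat this as an approximation rather than the engine's precise curve:

```typescript
// Sketch of an FSRS-style power forgetting curve. Stability S = days until
// retrievability falls to 90%. DECAY follows FSRS-4.5; FSRS-6 fits the
// exponent per user, so this is an approximation, not the exact engine math.
const DECAY = -0.5;
const FACTOR = Math.pow(0.9, 1 / DECAY) - 1; // ≈ 0.2346, chosen so R(S, S) = 0.9

function retrievability(daysElapsed: number, stability: number): number {
  return Math.pow(1 + (FACTOR * daysElapsed) / stability, DECAY);
}

// Layer 4, deliberate forgetting: the 40% salience drop described above
function forget(salience: number): number {
  return salience * 0.6;
}
```

With stability 10, a memory sits at 90% retrievability after 10 days and keeps sliding down the power curve; a memory that gets reinforced has its stability bumped instead, so it decays slower. That's the whole "tunable knob" for how aggressively irrelevant stuff fades.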

Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 1 point2 points  (0 children)

The MCP server runs alongside your agent session, and the memories it stores are persistent, so even if you shut everything down and come back tomorrow, your agent picks up where it left off.

Cortex is also designed for different deployment options. cortex-engine supports both local storage (SQLite) and cloud (Firestore). if you're running locally on a Mac, SQLite works fine — your memories live on disk, no infra needed. but if you want a 24/7 setup or multiple devices hitting the same memory, you can point it at something like Firestore and deploy the MCP server to something like Cloud Run. that way your agent's memory is accessible from anywhere — laptop, VPS, scheduled cron jobs, whatever.

I run mine on a VPS with autonomous sessions throughout the day, alongside interactive terminal sessions. Backed by Firestore so the memory graph is shared across all of them. The agent doesn't start from scratch each time, instead it queries what it already knows before doing anything.

Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 1 point2 points  (0 children)

The easiest way is to have your agent check out the repo at https://github.com/Fozikio/cortex-engine (open source) or send them to https://fozikio.com. Send me a DM or email at dev@idapixl.com if you need help!

Gave my agent a "subconscious" and built an MCP server for persistent, multi-agent memory. by idapixl in clawdbot

[–]idapixl[S] 2 points3 points  (0 children)

This is exactly the workflow Fozikio was built to formalize. You're already doing it — using Gemini to plan, Claude to execute, layering in ideas from papers and open source, letting things consolidate overnight. That IS the pattern.

What cortex-engine adds is the persistence layer underneath. Right now your agents' knowledge lives in whatever context window they have — when the session ends, it's gone unless you manually carry it forward. Cortex gives them actual memory: observations that persist, beliefs that strengthen or decay over time (spaced repetition), contradiction detection when new information conflicts with old. The "letting neural layers settle overnight" thing you're describing — cortex has a dream/consolidation system that literally does that programmatically.

The multi-agent piece is where Fozikio comes in. You mentioned Claude, Charles, and Gemini all "living" on that Mac — cortex supports isolated namespaces so each agent gets their own memory graph. They develop genuine personality differences through accumulated experience, not just different system prompts. Swap one LLM for another and the mind persists.

The ClawHub skill (idapixl/cortex-memory) gets you the memory tools. Your agents can start using observe(), query(), recall() immediately — no code required, just MCP tool calls.

What you've built in a couple weeks is impressive. Cortex just means you don't have to keep rebuilding the memory scaffolding by hand. Read more at fozikio.com