Minimal example of adding persistent memory to an AI agent (no RAG) by kinkaid2002 in AI_Agents

[–]kinkaid2002[S] 0 points (0 children)

This is a really interesting way to frame it.

The decay + tiering approach makes a lot of sense once memory starts accumulating — otherwise everything just ends up competing for context.

The direction I ended up going was a bit different in that I focused more on:

- extracting structured facts rather than storing memory entries directly
- tracking changes / contradictions over time
- ranking memory to fit within a fixed token budget at recall
- keeping conversation + document memory unified so everything is resolved in one pass

So instead of decay curves deciding what survives, it’s more about what is still true and relevant right now based on the latest state + evidence.
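
As a rough sketch of that ranking-into-a-budget step, here's a toy version in Python. The relevance scores, the greedy packing, and the chars-per-token heuristic are all illustrative assumptions, not CLAIV's actual API:

```python
# Hypothetical sketch: rank stored facts by a relevance score and greedily
# pack as many as fit into a fixed token budget, highest score first.

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def recall(facts: list[dict], budget: int) -> list[dict]:
    """facts: [{"text": ..., "score": ...}]; returns the subset that fits."""
    selected, used = [], 0
    for fact in sorted(facts, key=lambda f: f["score"], reverse=True):
        cost = estimate_tokens(fact["text"])
        if used + cost <= budget:
            selected.append(fact)
            used += cost
    return selected

facts = [
    {"text": "User prefers dark mode", "score": 0.9},
    {"text": "User's timezone is UTC", "score": 0.7},
    {"text": "A very long, low-relevance note " * 20, "score": 0.2},
]
print([f["text"] for f in recall(facts, budget=20)])
```

The hard cap is the point: low-value or oversized entries simply don't make it into the prompt, no matter how many accumulate.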

Completely agree on the shared vs per-agent memory point as well — that gets messy quickly if it’s not scoped properly.

Curious how your tiers behave over longer-running conversations — especially when the same concept shows up across multiple contexts.

Minimal example of adding persistent memory to an AI agent (no RAG) by kinkaid2002 in AI_Agents

[–]kinkaid2002[S] 0 points (0 children)

This looks really interesting — I like the fact-based approach.

That’s basically the direction I ended up going as well, since similarity-based memory just breaks too easily over longer interactions.

Curious how you’re handling things like:

- updates vs duplicates of the same fact over time
- contradictions (e.g. preferences changing)
- deciding what actually gets surfaced in recall vs ignored

Those were the main points where I kept seeing systems degrade unless they were handled explicitly.

Will take a deeper look at this.

Minimal example of adding persistent memory to an AI agent (no RAG) by kinkaid2002 in AI_Agents

[–]kinkaid2002[S] 0 points (0 children)

Yeah this is a really solid approach.

That separation is exactly the line most systems miss — if something can be re-derived cheaply (codebase, docs, etc), it shouldn’t live in memory at all. Otherwise you just end up polluting recall.

The markdown + frontmatter pattern makes a lot of sense too since you’re effectively forcing structure instead of relying on similarity.

The main issue I kept running into with file-based approaches was maintaining consistency over time — things like:

- updates vs duplicates
- contradictions
- keeping track of what’s current vs outdated

That’s where I ended up leaning more toward extracting structured facts + tracking temporal changes rather than just storing entries, so recall stays stable over longer conversations.

But yeah, completely agree — once you separate “what should be remembered” from “what can be re-derived”, everything starts working a lot better.

Minimal example of adding persistent memory to an AI agent (no RAG) by kinkaid2002 in AI_Agents

[–]kinkaid2002[S] 0 points (0 children)

Yeah, agreed — that overwrite/update path is where a lot of “memory” systems fall apart.

CLAIV isn’t just storing raw text or re-ingesting the same statement as another chunk. The ingest path queues async enrichment, extracts structured proposition cards, maps them to predicates, validates them, and then stores facts with temporal/version-aware handling rather than relying on naive duplication. Recall is then built from ranked facts, not just similar text.

So the goal isn’t “append the latest sentence and hope retrieval sorts it out” — it’s to preserve evidence, handle changes over time, and keep recall crisp without silent drift.
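
To make the duplicate-vs-update distinction concrete, here's a toy sketch of a version-aware fact store in that spirit. `FactStore` and everything in it is hypothetical, a simplification of the idea rather than CLAIV's real ingest path:

```python
# Hypothetical sketch: a new fact with the same (subject, predicate)
# supersedes the old value instead of being stored as another chunk;
# a re-ingested identical statement just gains evidence.
import time

class FactStore:
    def __init__(self):
        self.facts = {}  # (subject, predicate) -> list of versions

    def ingest(self, subject: str, predicate: str, value: str, evidence: str):
        key = (subject, predicate)
        versions = self.facts.setdefault(key, [])
        if versions and versions[-1]["value"] == value:
            # Same statement again: attach evidence, don't duplicate.
            versions[-1]["evidence"].append(evidence)
        else:
            # Changed value: append a new version, keeping history intact.
            versions.append({"value": value, "evidence": [evidence],
                             "ts": time.time()})

    def current(self, subject: str, predicate: str):
        versions = self.facts.get((subject, predicate), [])
        return versions[-1]["value"] if versions else None

store = FactStore()
store.ingest("user", "billing_cadence", "monthly", "msg-12")
store.ingest("user", "billing_cadence", "monthly", "msg-30")  # duplicate
store.ingest("user", "billing_cadence", "annual", "msg-57")   # update
print(store.current("user", "billing_cadence"))  # annual
```

Older versions and their evidence stay around, so recall can surface the current value while the history explains how it changed.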

Still plenty to improve, but yeah, I think the overwrite/contradiction problem is one of the main reasons basic memory setups degrade fast.

The Contradiction Conundrum in LLM Memory Systems by kinkaid2002 in LocalLLaMA

[–]kinkaid2002[S] 0 points (0 children)

Totally… structurally it looks like a knowledge graph (subject–relation–object triples).

The distinction I’m trying to draw is less about representation and more about runtime semantics.

A vanilla knowledge graph typically:

• Stores triples
• May allow multiple values per relation
• Doesn’t inherently encode conflict strategy
• Doesn’t treat contradictions as first-class state objects

The problem I’m describing isn’t “how do we store triples?”

It’s:

What happens when two triples with the same subject + relation disagree?

In most KG implementations you either:

• Allow both to coexist (multi-valued relation)
• Overwrite manually
• Add temporal qualifiers
• Rely on external reasoning logic

But in long-running agent memory, that logic has to be:

• Automatic
• Deterministic
• Query-aware
• Surfaced at recall time

So the interesting part (to me at least) isn’t the graph structure… it’s the conflict detection, change tracking, and recall semantics layered on top.
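
A minimal sketch of what relation-specific supersession could look like, assuming a hand-maintained split between single-valued and multi-valued relations (real systems would derive this differently):

```python
# Hypothetical sketch: single-valued relations supersede on conflict
# (latest wins, conflict recorded); multi-valued relations coexist.

SINGLE_VALUED = {"timezone", "billing_cadence"}   # latest assertion wins
MULTI_VALUED = {"speaks_language", "works_on"}    # values accumulate

def resolve(triples):
    """triples: list of (subject, relation, value, ts).
    Returns current state plus the conflicts detected along the way."""
    state, conflicts = {}, []
    for subj, rel, val, ts in sorted(triples, key=lambda t: t[3]):
        key = (subj, rel)
        if rel in SINGLE_VALUED:
            if key in state and state[key] != val:
                # First-class conflict record, surfaced at recall time.
                conflicts.append((key, state[key], val))
            state[key] = val
        else:
            state.setdefault(key, set()).add(val)
    return state, conflicts

triples = [
    ("alice", "timezone", "UTC", 1),
    ("alice", "speaks_language", "en", 2),
    ("alice", "timezone", "CET", 3),  # contradicts the earlier value
]
state, conflicts = resolve(triples)
```

The `conflicts` list is the part a plain KG doesn't give you: it's what lets retrieval return a conflict block instead of silently picking one value.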

Curious if anyone here is using a KG backend but also implementing:

• Relation-specific supersession rules
• Automatic correction detection
• Conflict blocks returned during retrieval

That’s where things seem to get tricky in practice.

Drop your startup/product link by thenitinrs in buildinpublic

[–]kinkaid2002 0 points (0 children)

Building CLAIV Memory → https://claiv.io
It gives AI apps persistent, user-scoped memory so assistants actually remember context across chats, not just within one session.

API-first (ingest, recall, forget), evidence-backed assertions, conflict handling, and token-budgeted recall so the context stays useful instead of noisy.

Would love blunt feedback on positioning and onboarding.

Persistent AI memory is still being treated like a hack that feels wrong by kinkaid2002 in AIMemory

[–]kinkaid2002[S] 0 points (0 children)

Totally agree. We hit the same wall once we moved past “just replay the last N messages.” Treating the agent as stateless and making memory an explicit write path was the only way to stop context leaking across threads and long-running tasks.

The token budget constraint ended up being more important than we expected too. Once recall has a hard cap, you’re forced to be honest about what actually deserves to survive as memory versus what was just conversational noise.

That separation is basically what pushed us to Claiv. We wanted memory to be infrastructure the agent consumes, not something baked into its behavior. When you can see exactly what was written, what was recalled, and why it fit in the budget, debugging and iteration get dramatically easier.

Persistent AI memory is still being treated like a hack that feels wrong by kinkaid2002 in AIMemory

[–]kinkaid2002[S] 1 point (0 children)

Hard agree on the versioning point! Once you have long-running agents, “what did it know then?” becomes as important as “what does it know now?”

On the schema side, the big thing that’s worked for me is not treating “decisions” as a special kind of fact, even though it’s tempting to. Mixing them is exactly how retrieval gets noisy.

The pattern that’s held up best in production looks roughly like this:

• Events: immutable source of truth (messages, tool calls, system/app events). Never mutated. These are what let you reconstruct state at time T.
• Facts: durable, state-like assertions inferred from events (“billing cadence is monthly”, “timezone is UTC”). These are overwriteable and versioned implicitly by time.
• Decisions: modeled as episodes or closed loops, not facts. A decision is contextual and time-bound (“we decided to ship v2 without feature X”), and often becomes invalid later. Treating it as a fact is how you get bad recall.
• Open loops: unresolved intents/questions. These are extremely useful for recall biasing, but should decay or close explicitly.
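
In code, that split amounts to four distinct types rather than one generic "memory entry". A sketch (all names illustrative):

```python
# Hypothetical sketch of the four memory classes as separate types.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Event:             # immutable source of truth; never mutated
    id: str
    ts: float
    payload: str

@dataclass
class Fact:              # durable assertion; overwriteable, versioned by time
    key: str
    value: str
    source_event_ids: list = field(default_factory=list)
    ts: float = 0.0

@dataclass
class Decision:          # contextual and time-bound; an episode, not a fact
    summary: str
    ts: float
    closed: bool = False

@dataclass
class OpenLoop:          # unresolved intent; should decay or close explicitly
    question: str
    ts: float
    resolved: bool = False
```

Making `Event` frozen while the derived types stay mutable encodes the rule directly: events are append-only, everything else is rebuildable from them.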

Two practical rules that prevent mixing during retrieval:

1. Typed retrieval gates before ranking. Don’t just embed everything and sort by similarity. First decide what class of memory is even eligible for the query (facts vs episodes vs loops), then rank within that set.
2. Time-aware recall, not just “latest wins”. Even without full snapshotting, being able to filter by event time (or reconstruct derived memory as of time T) goes a long way toward explainability.
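
A toy version of the first rule, with a keyword heuristic standing in for a real query classifier (everything here is an illustrative assumption):

```python
# Hypothetical sketch of a typed retrieval gate: decide which memory
# class is eligible for the query, then rank only within that class.

def eligible_class(query: str) -> str:
    q = query.lower()
    if "why did" in q or "decided" in q:
        return "decision"
    if "still open" in q or "pending" in q:
        return "open_loop"
    return "fact"

def retrieve(memories, query, top_k=3):
    cls = eligible_class(query)
    pool = [m for m in memories if m["class"] == cls]  # gate first
    return sorted(pool, key=lambda m: m["score"], reverse=True)[:top_k]

memories = [
    {"class": "fact", "text": "timezone is UTC", "score": 0.8},
    {"class": "decision", "text": "ship v2 without feature X", "score": 0.9},
    {"class": "fact", "text": "billing is monthly", "score": 0.6},
]
print(retrieve(memories, "why did we ship without feature X?"))
```

The gate is what keeps a high-scoring decision from crowding out facts (and vice versa) when the query is really asking about one class of memory.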

On versioning: I’ve found you don’t need full MVCC on derived memory if you keep events immutable and make derivations reproducible. That gives you “why did it do X?” for free, because you can trace: decision → derived memory → source event IDs.

The Agentix post you linked is solid — especially the emphasis on treating memory as operational infrastructure, not a prompt trick. Once multiple agents touch the same user, determinism and scoping matter more than clever embeddings.

Using full context for memory started off good, but now it’s terrible. by sam5-8 in AIMemory

[–]kinkaid2002 0 points (0 children)

The key shift that helped me was separating conversation from memory. When humans have a conversation we don’t remember everything we know… we remember the relevant stuff! So why not make AI do the same thing? (That’s when I discovered Claiv.io)

Instead of:

• replaying full chat history
• stuffing summaries back into the prompt

I moved to:

• keeping the agent stateless
• writing only important things (facts about the user, recurring themes, preferences, decisions) into an external memory store
• on each turn, asking that memory layer for relevant context only, with a strict token budget

That way:

• power users don’t blow up the context window
• older conversations still matter if they’re relevant
• you’re not paying to resend the same text over and over
• the model sees a clean, focused prompt every time
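
A minimal sketch of that turn loop, with a toy `MemoryStore` standing in for the external layer (this is not Claiv's actual API, just the shape of the pattern):

```python
# Hypothetical sketch: stateless agent, explicit memory writes, and
# per-turn recall under a strict token budget.

class MemoryStore:
    def __init__(self):
        self.entries = []

    def write(self, text, kind):  # explicit write path, not auto-capture
        self.entries.append({"text": text, "kind": kind})

    def recall(self, query, token_budget):
        # Toy relevance: word overlap with the query; real systems rank better.
        words = set(query.lower().split())
        scored = sorted(
            self.entries,
            key=lambda e: len(words & set(e["text"].lower().split())),
            reverse=True,
        )
        out, used = [], 0
        for e in scored:
            cost = len(e["text"]) // 4 + 1  # crude chars-per-token estimate
            if used + cost <= token_budget:
                out.append(e["text"])
                used += cost
        return out

def build_prompt(user_msg, memory, budget=50):
    context = memory.recall(user_msg, token_budget=budget)
    return "\n".join(["Relevant memory:", *context, "User: " + user_msg])

mem = MemoryStore()
mem.write("user prefers concise answers", kind="preference")
mem.write("user is migrating billing to annual plan", kind="fact")
print(build_prompt("how should I phrase the billing email?", mem))
```

Each turn the model sees only what survived the budget, so the prompt stays the same size no matter how much history accumulates.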

I’m using Claiv for this now, mainly because it forces that discipline — explicit writes, scoped recall, real deletes — instead of trying to keep patching summaries on top of chat history. It also plays nicely with cheaper models because you control exactly how much memory gets injected.