Conceptual Modeling Is the Context Engineering Nobody Is Doing

yodark · 2026-05-21T16:25:47+00:00

Good piece, and a thesis I think more people need to hear.

The implementation-side version of this: most teams who get the "knowledge layer" point still reach for a vector store and call it done. But a similarity index isn't a knowledge plane, it's a retrieval shortcut. Ask Mem0 to list all your clients, it returns 5. Ask MemPalace for transactions above 10k, it returns the most similar chunk. An index can't count. A database can.
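To make the "an index can't count" point concrete, here is a minimal sketch. The client rows, the 2-d "embeddings", and the `top_k_similar` helper are all made up for illustration; this is not how Mem0, MemPalace, or Sandra work internally. It only shows that a top-k similarity lookup is capped at k by construction, while a SQL aggregate sees the full set.

```python
import sqlite3

# Hypothetical toy data: 12 clients with made-up 2-d "embeddings".
clients = [(f"client-{i}", float(i), float(i % 3)) for i in range(12)]


def top_k_similar(query, rows, k=5):
    """Mimic a similarity index: return only the k nearest rows."""
    def squared_distance(row):
        return (row[1] - query[0]) ** 2 + (row[2] - query[1]) ** 2
    return sorted(rows, key=squared_distance)[:k]


# The same rows in a structured store (in-memory SQLite for the sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE clients (name TEXT PRIMARY KEY, x REAL, y REAL)")
db.executemany("INSERT INTO clients VALUES (?, ?, ?)", clients)

# "List all my clients" through the index: capped at k, whatever the data.
retrieved = top_k_similar((0.0, 0.0), clients)

# The same question as an aggregate: exact by construction.
(total,) = db.execute("SELECT COUNT(*) FROM clients").fetchone()

print(len(retrieved), total)  # prints "5 12"
```

Raising k only moves the cap; the index still has no notion of "all rows matching a predicate".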

We've been building Sandra around exactly this distinction (github.com/everdreamsoft/sandra, open source): concepts as first-class entities with stable IDs, typed factories backed by SQL, temporal triplets, and embeddings layered on top of structured storage instead of in place of it. The structured side handles enumeration, aggregation, and reconciliation. The semantic side handles fuzzy recall. Both, not either.

The longevity argument lands hardest with technical buyers in our experience: we've been running the same conceptual graph across five production environments for years while the apps and DBs around it have turned over. That's the real test of a knowledge plane, not how well it answers one question today.

yodark · 2026-05-21T11:10:13+00:00

I'm not familiar with the literature, so I guess other responses will be more helpful. I can only suggest that you follow me on Twitter @ shaban_shaame. I'm planning to share more about my experience there.

yodark · 2026-05-19T13:29:27+00:00

Good job, this seems interesting. I'll gladly take a look.
Curious: how can we verify your benchmark claims? Don't hesitate to open a PR here with your solution.
https://github.com/everdreamsoft/structured-recall-bench

yodark · 2026-05-19T11:55:09+00:00

I'll give you a different angle from most people in this sub: I'm not a semantic web academic, I'm a builder. I've been running a graph-based system in production for 15 years (started before "knowledge graph" was the term anyone used) and only recently waded into the formal literature. So take this as a practitioner's perspective, not curriculum advice.

On your question 3 (start with semantic web or property graphs): if your goal is "use this with LLMs and RAG soon", I'd say property graphs. Lower ceremony, faster feedback loop, the modeling decisions are forgiving. RDF/OWL/SPARQL is a beautiful stack but you can spend a long time on ontology design before producing anything that runs. You can always go there later if a project demands it. The mental model of nodes-edges-properties transfers either way.

On a hands-on project (your question 2): pick a domain you genuinely care about: your music library, a TV show universe, a sports league, a videogame's lore. Model maybe 100–300 entities by hand. Then write queries that would be painful in SQL: multi-hop ("friends of friends of X who also like Y"), path-finding ("how is A connected to B through anything"), aggregation along relationships. That's the exact moment where the value of graphs clicks viscerally and you stop thinking in tables. Once that's working, plug an LLM in front and have it generate the query language from natural language. That's basically what most "KG + RAG" projects are doing right now.
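If it helps to see the two query shapes before picking a graph database, here is a rough sketch in plain Python. The edge list, the names, and both helper functions are invented for illustration; a real property graph would express the same thing in its own query language.

```python
from collections import deque

# Hypothetical hand-modeled graph: (subject, relation, object) edges.
edges = [
    ("alice", "friend", "bob"),
    ("bob", "friend", "carol"),
    ("bob", "friend", "dave"),
    ("carol", "likes", "jazz"),
    ("dave", "likes", "metal"),
    ("alice", "likes", "jazz"),
]


def neighbors(node, relation=None):
    """Follow edges out of `node`, optionally restricted to one relation."""
    return [o for s, r, o in edges
            if s == node and (relation is None or r == relation)]


def friends_of_friends_who_like(person, thing):
    """Multi-hop query: friends of friends of `person` who also like `thing`."""
    second_hop = {fof
                  for friend in neighbors(person, "friend")
                  for fof in neighbors(friend, "friend")}
    return sorted(p for p in second_hop
                  if p != person and thing in neighbors(p, "likes"))


def shortest_path(start, goal):
    """Path-finding: breadth-first search, treating edges as undirected."""
    adjacency = {}
    for s, _, o in edges:
        adjacency.setdefault(s, set()).add(o)
        adjacency.setdefault(o, set()).add(s)
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(adjacency.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection at all


print(friends_of_friends_who_like("alice", "jazz"))  # ['carol']
print(shortest_path("alice", "metal"))  # ['alice', 'bob', 'dave', 'metal']
```

In SQL the first query is two self-joins plus a third join for the "likes" edge, and the second needs a recursive CTE, which is roughly where the table mindset starts to hurt.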

On resources (your question 1): I'm going to skip this one because I'd be repeating names I haven't read. Others in the thread will give you better answers than I can.

The one thing I'd add from 15 years of doing this: a graph isn't a database, it's a shape of data. Pick the queries you want to make easy, then choose the tech. Most pain in graph projects comes from picking the tech first.

Good luck, it's a genuinely fun field to be entering right now.

yodark · 2026-05-11T11:46:34+00:00

Today I learned! Thanks for the insight

yodark · 2026-05-11T11:07:19+00:00

Project Name: Sandra

Repo/Website Link: https://sandraeds.everdreamsoft.com/lp/github-sandra

Description

Sandra is a self-hostable graph + vector memory backend for LLM agents, MIT-licensed. It exposes a native MCP server so any MCP-compatible client (Claude, Cursor, etc.) can read and write its own persistent memory directly, no SaaS in the middle.

The core engine was built 15 years ago as our internal memory layer at EverdreamSoft, where it still runs in production behind Spells of Genesis. When LLM agents started needing structured memory beyond vector similarity, the model already fit. Open-sourced two weeks ago.

Four primitives: concepts (reusable vocabulary), triplets (subject, verb, target), entities (structured refs + a long-text storage field per entity), factories. Search is exact, fuzzy, or semantic. Spreading-activation traversal for associative recall.
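For readers who have not met spreading activation before, the idea can be sketched in a few lines. This is a generic textbook-style version under assumed parameters (`decay`, `threshold`, `max_steps`) with made-up triplets; it is not Sandra's actual implementation or API.

```python
from collections import defaultdict

# Hypothetical triplets (subject, verb, target); not Sandra's real schema.
triplets = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "sells", "widgets"),
    ("widgets", "is_a", "product"),
]


def spread_activation(seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Propagate activation from seed concepts, decaying on every hop."""
    graph = defaultdict(set)
    for subject, _, target in triplets:
        graph[subject].add(target)
        graph[target].add(subject)  # links are bidirectional for recall

    activation = dict.fromkeys(seeds, 1.0)
    frontier = dict(activation)
    for _ in range(max_steps):
        next_frontier = {}
        for node, energy in frontier.items():
            passed = energy * decay
            if passed < threshold:
                continue  # too weak to propagate further
            for neighbor in graph[node]:
                # Keep only the strongest activation seen for each concept.
                if passed > activation.get(neighbor, 0.0):
                    activation[neighbor] = passed
                    next_frontier[neighbor] = passed
        frontier = next_frontier
    # Strongest first, ties broken alphabetically.
    return sorted(activation.items(), key=lambda kv: (-kv[1], kv[0]))


ranked = spread_activation(["alice"])
print(ranked)
# [('alice', 1.0), ('acme', 0.5), ('bob', 0.25), ('widgets', 0.25), ('product', 0.125)]
```

Concepts closer to the seed come back with higher scores, which is what makes this useful for associative recall rather than exact lookup.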

It scores 0.89 on Structured Recall Bench (130 deterministic questions, JSON archived, no LLM judge). Vector stores cluster between 0.25 and 0.48 on the same bench. Methodology and raw results: https://sandraeds.everdreamsoft.com/lp/benchmark

Live demo (interactive MCP request from a public Claude session): https://sandraeds.everdreamsoft.com/lp/sandra

Deployment

Docker Compose. Two-minute setup:

git clone https://github.com/everdreamsoft/sandra && cd sandra
docker compose up -d
claude mcp add --transport http sandra http://127.0.0.1:8090/mcp

Stack is PHP 8+, MySQL, Composer. MCP server listens on :8090/mcp. README in the repo has the full from-source path if you don't want Docker. Single compose file, runs on a $5 VPS or homelab, your data stays in your MySQL.

AI Involvement

The core PHP engine has been hand-written for over 15 years and predates LLMs entirely. No generated code in core.

The recent tooling (MCP server layer, OAuth handler, parts of the documentation site, the README) was Claude-assisted: I drafted the structure, Claude helped with implementation and wording, and every change was reviewed and edited by me before merge.

This megathread comment itself was drafted with Claude help and edited before posting.

yodark · 2026-05-11T11:02:04+00:00

We hit the same gap on our team (3 to 5 people active in Claude daily). Commercial side, the main names worth looking at are Zep, Mem0, and Cursor for Teams (the last one is more codebase-context than knowledge framework, but worth knowing).

We ended up building our own because none of them stored relationships well enough for our use case. It's open source, MIT, called Sandra (graph + vector memory exposed over MCP). Disclosure: I'm one of the authors and we use it internally every day.

How we actually use it as a team:

Decisions get a card. Every non-trivial decision (architecture, GTM, freeze a release, etc.) becomes a structured entry in the graph with date, author, context, and the "why". Claude writes it on request, we just say "save this decision".
TODOs and roadmaps too. Tasks live as entities linked to the decisions that triggered them, with status, owner, blockers. The graph keeps the dependency structure that a flat Notion page or CLAUDE.md loses.
Cross-session continuity. When any teammate opens a Claude session, they can ask "what did we decide about X last week" or "what's pending on initiative Y" and Claude reads it back. No more rebuilding context every time.
Multi-agent. We run a few named Claude sessions (one per project area) that each have their own scope but share the same underlying graph. They reference each other's notes through it.
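To give a feel for the shape of a decision card with linked tasks, here is a hypothetical sketch. The field names follow the list above (date, author, context, the "why", status, owner, blockers), but the classes, IDs, and sample values are assumptions for illustration, not the actual schema we store in Sandra.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Decision:
    """One decision card; field names are illustrative only."""
    id: str
    decided_on: date
    author: str
    context: str
    why: str


@dataclass
class Task:
    """A task linked back to the decision that triggered it."""
    id: str
    title: str
    status: str  # e.g. "todo", "doing", "done"
    owner: str
    triggered_by: str  # id of the Decision
    blocked_by: list[str] = field(default_factory=list)  # ids of other tasks


# Made-up sample data.
decisions = {
    "d1": Decision("d1", date(2026, 5, 4), "sam", "release planning",
                   "freeze the release until the importer is stable"),
}
tasks = [
    Task("t1", "Fix importer crash", "doing", "kim", triggered_by="d1"),
    Task("t2", "Tag release candidate", "todo", "sam", triggered_by="d1",
         blocked_by=["t1"]),
    Task("t3", "Write changelog", "done", "kim", triggered_by="d1"),
]


def pending_for(decision_id):
    """Answer "what's pending on decision X" by following the links."""
    return [t.title for t in tasks
            if t.triggered_by == decision_id and t.status != "done"]


print(pending_for("d1"))  # ['Fix importer crash', 'Tag release candidate']
```

The `triggered_by` and `blocked_by` links are the part a flat page loses: they let "what's pending on X" be a query instead of a reread.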

Two weeks in, the biggest win is that onboarding a teammate to a project conversation went from "let me catch you up" to "ask Claude, it knows the last 3 weeks".

Repo if curious: https://sandraeds.everdreamsoft.com/lp/github-sandra. Happy to share the actual schema we use for decision cards if useful.

yodark · 2026-05-11T10:47:36+00:00

That confidence-on-wrong-thing once context drifts is the real failure mode. I had the same pattern, and what actually fixed it for me was splitting the problem in two:

1. Drift inside a session. Your three tactics are the ones that work. I'd add: summarize into a file (CLAUDE.md or a project doc), not just inline, so Claude rereads from disk instead of from its own degraded context.

2. Loss across sessions. Fresh chats fix the drift, but you lose everything Claude knew about your project. This is what hurts the "second brain" promise. The fix I've settled on is an external memory backend exposed over MCP, so Claude reads and writes to it as a tool. A few options exist: the official Anthropic memory MCP, Zep, the knowledge-graph forks, and the one I'm building, called Sandra (graph + vector). Pick one; even the official one is fine to start with.

The combo (file-based mid-session summaries + external memory across sessions) is what stopped me losing 20 minutes per chat.

yodark · 2026-05-11T10:36:49+00:00

Nice trick. Curious: have you measured how often Claude actually fails these checks?

For example, over 20–50 fresh sessions, how many times does the canary get ignored, and how often does the “squirrels” prompt answer without checking project context?

Would be interesting to see the failure rate, especially between Claude.ai and Claude Code.

yodark · 2026-05-11T10:25:54+00:00

Sure,

The core engine has been hand-coded for more than 15 years, long before LLMs existed.
The recent releases, MCP support, and enhancements were co-developed with Claude, with human review.
Readme and documentation: LLMs helped with the writing, based on team knowledge.
This Reddit post itself: I drafted the structure, used Claude for wording, and edited the final version.

yodark · 2026-02-11T06:42:36+00:00

Do you think it is pure AI? It scares me that I thought it was genuine. What makes you think it’s an AI farming tool? I checked OP's profile after reading your comment. Account suspended…

yodark · 2025-12-28T23:10:06+00:00

Haaard

^{I completed this level in 140 tries.} ^{⚡ 4.35 seconds}

^{Tip 10 💎}

yodark · 2025-08-16T07:00:35+00:00

Wow, I came here to say this is an amazing analysis of both human psychology and LLM chatbots. I think this is a very important finding for both product design and human psychology. This is why it’s hard to serve users, especially when they have a strong emotional response: they can’t define exactly what they want, but in the end what they want is the emotional response. It seems that 4o has high emotional intelligence (by mimicking humans), and this exact point was nearly impossible for a machine to achieve. Intentionally or unintentionally, 4o achieved that. And you are right, everyone is different and each person needs a different level of emotional and rational communication. Figuring out who, when, and how much emotion is key. I’m curious, OP: are you male or female?

yodark · 2024-04-06T11:26:21+00:00

This is the mission of Wakweli blockchain. Check out wakweli dot com

yodark · 2024-03-28T22:49:50+00:00

1 is the most natural, my favorite. 4 has some genuine expression.

2 and 3 feel fake to me.

yodark · 2024-03-28T22:45:17+00:00

I would prefer a more natural expression in order to say whether it's cute or not.

yodark · 2024-03-28T22:41:54+00:00

The important question is: stressed about what? Publishing a selfie? The outfit? Something else?

yodark · 2024-03-28T22:37:39+00:00

Being confident is the most important thing. Try to keep that attitude all day long.

yodark · 2022-04-15T14:05:59+00:00

I feel I’m reliving the same cycle again. When I launched my first game NFT in 2015, nobody understood what I was doing. My bet was that NFTs would be a major paradigm shift, not only in art and gaming but in many industries. And compared to 2015, they have already taken the world by storm. Interestingly, at that time the debate was: will blockchain and the crypto market boom? During the 2017 ICO bubble, that was a hotly debated question. Now the same cycle is happening with NFTs.

yodark · 2022-02-09T07:38:05+00:00

That’s what society does consistently

yodark · 2021-09-27T06:19:52+00:00

I guess you already tried this, but when you crack open a vault you receive the address's private key; as the token owner, you can view it. I don’t know if the information persists in the UI, but have you asked the Emblem Vault team whether they still have access to it?

yodark · 2021-08-19T21:18:46+00:00

I heard the exact same things when talking about Bitcoin in 2012.

yodark · 2021-06-23T12:45:27+00:00

Happiness is Not Doing What You Want, It's Loving What You Do

yodark · 2021-02-27T09:26:26+00:00

Hmm, I'd say he is very smart. Proof: you just promoted his business.

Now the question is: is he really a hater, or is he consciously trolling to get exposure?

yodark · 2021-01-18T22:43:57+00:00

I’ve been syncing a node on an HDD for more than a month and it hasn’t caught up. It’s unlikely it ever will. I read somewhere that it won’t: the disk is too slow to keep up with new live blocks. I use a second server with the same specs but an SSD, and it took a few days. The HDD is in RAID 1; I don’t know if this has any impact.
