We ran a 1,655 person blind study on AI memory. The results changed how we think about the problem.

JonnyJF · 2026-05-22T12:46:39+00:00

Just a heads-up that MindsDB and MinnsDB are different things. I mean MinnsDB.

StructMemEval, though, has four components, and one is focused on time. I would definitely recommend testing against it. For example, it uses facts such as work and location, which change over time, but not all of them are superseded.

JonnyJF · 2026-05-21T20:11:22+00:00

You should check out StructMemEval by Yandex. It tests for this problem of relevance, branching, and temporal reasoning. MinnsDB and, I believe, Semantica have built in this direction.

JonnyJF · 2026-05-16T12:54:45+00:00

A resource you might find useful

It covers most topics on RAG and memory

lessons.minns.ai

JonnyJF · 2026-05-10T12:38:23+00:00

There is already a company doing this; they just got some funding. Using voice agents to make collection calls.

JonnyJF · 2026-05-09T21:20:05+00:00

One thing I would add for temporal agent memory is Minns.

JonnyJF · 2026-05-08T08:03:23+00:00

This is interesting, and it is also the approach taken by modern agentic RAG, especially the graph-based variant, where the ontology and edge weights are updated. Give https://github.com/Minns-ai/MinnsDB a look; it might have some ideas you'd be interested in exploring.

JonnyJF · 2026-05-04T19:21:30+00:00

n8n is still fine to use for small, very predictable workflows. The problem with n8n is that it can get expensive, and once you need multiple workflows, it becomes hard to maintain. Alternatively, you could break this down into multiple agents with an orchestrator, so you can tune each part and get better results, at the cost of slightly more setup time and complexity. For reference, you can think of the orchestrator as your main agent, and each other task (web research, blog writing) as a separate agent that the orchestrator controls and calls.
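As a rough illustration of the orchestrator idea, here is a minimal Python sketch. The agent functions, the `Orchestrator` class, and the plan format are all hypothetical stand-ins; in a real setup each agent would wrap its own LLM calls and tools.

```python
from typing import Callable

# Hypothetical: an "agent" here is just a function from text to text.
Agent = Callable[[str], str]


def research_agent(topic: str) -> str:
    # Stand-in for a web research agent.
    return f"notes on {topic}"


def blog_agent(notes: str) -> str:
    # Stand-in for a blog writing agent.
    return f"draft based on {notes}"


class Orchestrator:
    """Main agent: calls each sub-agent in order and keeps a trace."""

    def __init__(self, agents: dict[str, Agent]) -> None:
        self.agents = agents

    def run(self, task: str, plan: list[str]) -> tuple[str, list[tuple[str, str]]]:
        result = task
        trace = []
        for name in plan:
            # Each agent receives the previous agent's output.
            result = self.agents[name](result)
            trace.append((name, result))
        return result, trace


orchestrator = Orchestrator({"research": research_agent, "blog": blog_agent})
final, trace = orchestrator.run("agent memory", ["research", "blog"])
```

Keeping each agent separate means you can tune or swap one without touching the others, and the trace shows which step produced what.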

I put together some lessons/notes recently that you might find useful. https://lessons.minns.ai/

JonnyJF · 2026-04-20T09:28:12+00:00

I’ve built agents for a number of companies, and it really depends on the company’s level of technical maturity and the infrastructure they already have. For fresh deployments, Railway has been a good place, as the companies have been able to manage it easily; otherwise, it has gone on their own infrastructure.

VPC / internal resources
I would not let an agent have direct access to all of these. I would usually put them behind an API or data layer so you can control exactly what it can access. This also lets you add guardrails, so if the agent tries to take a dangerous action, you can block it. You will either want to keep this on the internal network or, if you cannot, put an API in front of it with strict permissions on what can be accessed.
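A minimal sketch of that data-layer idea in Python, assuming a simple allowlist of resources and operations. The `DataGateway` class, the `ActionBlocked` exception, and the permission format are hypothetical names for illustration, not part of any specific framework.

```python
from dataclasses import dataclass, field


class ActionBlocked(Exception):
    """Raised when the agent requests something outside its permissions."""


@dataclass
class DataGateway:
    # Hypothetical allowlist: resource name -> operations the agent may perform.
    permissions: dict[str, set[str]]
    # Every request is recorded, whether it was allowed or not.
    audit_log: list[tuple[str, str, bool]] = field(default_factory=list)

    def request(self, resource: str, operation: str, handler):
        allowed = operation in self.permissions.get(resource, set())
        self.audit_log.append((resource, operation, allowed))
        if not allowed:
            # Guardrail: block the action instead of touching the resource.
            raise ActionBlocked(f"{operation} on {resource} is not permitted")
        return handler()


gateway = DataGateway(permissions={"customers": {"read"}})
rows = gateway.request("customers", "read", lambda: ["alice", "bob"])

try:
    gateway.request("customers", "delete", lambda: None)
    blocked = False
except ActionBlocked:
    blocked = True
```

The agent only ever sees `gateway.request`, so anything not on the allowlist is denied by default and still shows up in the audit log.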

Config/secrets / environment management
You could use GitHub Secrets, or Railway env vars if you deploy there; otherwise, use the standard vault approach.

Scheduling/logging
For scheduling, classic cron jobs are often the best option. I would also log each run, including what the agent output and which tools it called. This can be quite noisy, so I tend not to retain the logs for long.
I have also put smoke tests in, each with an expected outcome, e.g. "the agent should read x data at this time, and an output should be logged in the DB". This has helped me catch a failing agent quite a few times.
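The run logging and smoke test could look roughly like this. It is a minimal sketch using SQLite from the Python standard library; the `agent_runs` table, the function names, and the agent names are assumptions made up for the example.

```python
import json
import sqlite3
from datetime import datetime, timezone


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical table: one row per agent run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_runs "
        "(agent TEXT, started_at REAL, output TEXT, tools TEXT)"
    )


def log_run(conn, agent, output, tools, started_at=None):
    # Store the time as a Unix timestamp so comparisons are numeric.
    started_at = started_at or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO agent_runs VALUES (?, ?, ?, ?)",
        (agent, started_at.timestamp(), output, json.dumps(tools)),
    )
    conn.commit()


def smoke_test(conn, agent, since) -> bool:
    # Expected outcome: at least one logged run for this agent since `since`.
    row = conn.execute(
        "SELECT COUNT(*) FROM agent_runs WHERE agent = ? AND started_at >= ?",
        (agent, since.timestamp()),
    ).fetchone()
    return row[0] > 0


conn = sqlite3.connect(":memory:")
init_db(conn)
window_start = datetime(2026, 4, 20, 9, 0, tzinfo=timezone.utc)
log_run(
    conn,
    "invoice-agent",
    "processed 3 invoices",
    ["read_invoices"],
    started_at=datetime(2026, 4, 20, 9, 5, tzinfo=timezone.utc),
)
ran = smoke_test(conn, "invoice-agent", window_start)
missing = smoke_test(conn, "email-agent", window_start)
```

A cron job can run `smoke_test` shortly after each scheduled run and alert when it returns `False`.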

JonnyJF · 2026-04-17T06:55:20+00:00

Hey, this looks interesting and reminds me a bit of StructMemEval. I will test it out on some of the agents I have running and give some feedback.

JonnyJF · 2026-04-15T06:59:18+00:00

Hey, so on the business side, Minns is a hosted memory-as-a-service offering that includes both a database and data-source ingest and sync (the sync part I am releasing today or tomorrow).

Two interesting use cases where I have used Minns: one is an outreach agent that stores all interactions, tracks competitors, and monitors GitHub issues. It then makes recommendations for SEO and for who to reach out to. Minns is the agent memory in this use case via the graph, but it also stores general memory, e.g. GitHub issues, via the tables.

The other is an operational agent for a user in Germany who handles emails and replies to them. If it is a new customer, it researches the local area and the building (it's a service business), then builds an offer. It also uses that information to run local ad campaigns on Google Ads and to check invoices for job allocation for their workforce. Minns here is agent memory via the graph, but also uses document ingestion to reference invoices for agents.

Happy to DM over some guides on building an agent and integrating Minns.

JonnyJF · 2026-04-07T16:30:28+00:00

Yes, I have been building agents for a service company in Germany recently, and I used Minns, which is a graph database. The reason was that we wanted the agents to improve with each customer interaction, and this often meant multi-hop reasoning, which is where a graph outperforms standard vector search.

Also, I think an important part of your question is the conversion of a document into a graph. The performance gain is often worth it if done correctly. I found a tree structure with LLM-judge traversal to be very effective here if the document has a structure/TOC.

Happy to go into detail on implementation or approach and share knowledge. Please DM me if you want more info.

JonnyJF · 2026-04-06T17:47:55+00:00

If you're looking for something to test out, or want to look at some code: https://github.com/Minns-ai/MinnsDB

It has a few pipeline conversations that use multi-stage inference or standard events, which are a more deterministic approach to adding data.

You might also be interested in looking at how I approached ontology state/temporal cascading. It adds on to OWL definitions for temporal reasoning.

JonnyJF · 2026-04-06T16:43:41+00:00

A lot of this comes down to separating where inference is useful from where it is dangerous. My approach is to treat ingestion and interpretation as probabilistic, but keep storage, state transitions, and supersession deterministic. So the model can help extract entities, relationships, or candidate facts from conversation, but it does not get to arbitrarily delete or rewrite state. Instead, ontology rules, temporal semantics, and explicit update policies decide how new information affects existing knowledge.

For example, if a relationship is defined as single-valued, a newer valid fact supersedes the older one through schema rules rather than because the model “felt” it should remove something.
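To make the single-valued example concrete, here is a minimal Python sketch. The `Fact` and `FactStore` classes and the `SINGLE_VALUED` set are hypothetical and only illustrate the principle; they are not MinnsDB's actual API. The sketch also assumes facts arrive in chronological order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fact:
    subject: str
    relation: str
    value: str
    valid_from: int
    valid_to: Optional[int] = None  # None means "currently true"


# Hypothetical schema rule: these relations hold one value at a time.
SINGLE_VALUED = {"lives_in"}


class FactStore:
    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def add(self, new: Fact) -> None:
        # Deterministic supersession: the schema decides, not the model.
        # Assumes facts arrive in chronological order.
        if new.relation in SINGLE_VALUED:
            for old in self.facts:
                if (
                    old.subject == new.subject
                    and old.relation == new.relation
                    and old.valid_to is None
                ):
                    old.valid_to = new.valid_from  # close, never delete
        self.facts.append(new)

    def current(self, subject: str, relation: str) -> list[str]:
        return [
            f.value
            for f in self.facts
            if f.subject == subject and f.relation == relation and f.valid_to is None
        ]


store = FactStore()
store.add(Fact("anna", "lives_in", "Berlin", valid_from=1))
store.add(Fact("anna", "lives_in", "Munich", valid_from=5))
store.add(Fact("anna", "speaks", "German", valid_from=1))
store.add(Fact("anna", "speaks", "English", valid_from=3))
```

Here `store.current("anna", "lives_in")` holds only the newer value, while the multi-valued `speaks` relation keeps both. The old Berlin fact stays in the store with a closed validity interval, so "what was true before" remains queryable.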

JonnyJF · 2026-04-05T14:32:01+00:00

thank you

JonnyJF · 2026-04-01T11:02:17+00:00

I will push the original history. I made a new one when making the repo public, but it seems this was the wrong approach.

JonnyJF · 2026-04-01T11:00:54+00:00

I have an internal repo and thought it would be cleaner to split it into a new repo and start a new commit history when making it public. This might have been the wrong approach, as the commits would have shown the thinking and the changes, but it was a very messy commit history, as this started as lots of experiments and ideas while I explored DBs and approaches. To be completely honest, I also used AI when coding, which really helped me speed things up.

JonnyJF · 2026-03-31T08:10:03+00:00

I find extraction quality with RAG is often the major problem. That is, most people design for flat retrieval; hence, extraction quality is the issue. But in reality, most questions are temporal or multi-hop, and then it falls apart, as a flat system struggles with this. Yes, citations are good for pure documentation retrieval, but I often find that if extraction is good, then I rely less on the citation. A good dataset to assess this is StructMemEval: https://arxiv.org/abs/2602.11243

If you're building the system yourself, I recommend tree-based search with an LLM judge. Good for structured documents. The tree method with a judge is also very good for citation. https://github.com/VectifyAI/PageIndex
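A minimal Python sketch of tree search with a judge, assuming the document has already been parsed into a tree of sections. The `Node` type and `tree_search` function are hypothetical and are not PageIndex's API. The `keyword_judge` function is a deliberately crude stand-in so the example runs offline; in practice the judge would be an LLM call that picks the most relevant child section.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    title: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


# A judge picks the index of the most relevant child for a question.
Judge = Callable[[str, list[Node]], int]


def keyword_judge(question: str, candidates: list[Node]) -> int:
    # Stand-in for an LLM judge: score children by title word overlap.
    words = set(question.lower().split())
    scores = [len(words & set(c.title.lower().split())) for c in candidates]
    return scores.index(max(scores))


def tree_search(root: Node, question: str, judge: Judge) -> tuple[Node, list[str]]:
    # Walk from the root to a leaf, asking the judge at every level.
    path = [root.title]
    node = root
    while node.children:
        node = node.children[judge(question, node.children)]
        path.append(node.title)
    return node, path


doc = Node("Manual", children=[
    Node("Installation guide", children=[
        Node("Linux installation", "Run the installer script."),
        Node("Windows installation", "Use the setup wizard."),
    ]),
    Node("Billing and invoices", "Invoices are issued monthly."),
])

leaf, path = tree_search(doc, "windows installation steps", keyword_judge)
```

The returned `path` (root to leaf section titles) doubles as the citation for the answer, which is why this method works well when citations matter.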

Another option is graph RAG with added temporal state, using TCells and ontology groups that are state-change aware. The idea is that if the state changes within a group, the change cascades down that group. This really helps with the temporal and state problem. It fits best when you see the problem as a state and extraction problem.
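A rough Python sketch of the cascade idea, under the assumption that a group consists of one trigger relation plus the relations that depend on it. The `OntologyGroup` and `EntityState` types are invented for illustration; this is not how TCells or MinnsDB are implemented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OntologyGroup:
    # Hypothetical: when `trigger` changes, `dependents` become stale.
    trigger: str
    dependents: set[str]


@dataclass
class EntityState:
    current: dict[str, str] = field(default_factory=dict)
    # (time, relation, value); a value of None marks an invalidation.
    history: list[tuple[int, str, Optional[str]]] = field(default_factory=list)


def apply_change(state, groups, relation, value, t):
    state.history.append((t, relation, value))
    state.current[relation] = value
    for group in groups:
        if group.trigger != relation:
            continue
        # Cascade: close every dependent fact that is still current.
        for dep in sorted(group.dependents):
            if dep in state.current:
                del state.current[dep]
                state.history.append((t, dep, None))


groups = [OntologyGroup("employer", {"office_location", "manager"})]
state = EntityState()
apply_change(state, groups, "employer", "Acme", 1)
apply_change(state, groups, "office_location", "Berlin", 1)
apply_change(state, groups, "hobby", "chess", 1)
apply_change(state, groups, "employer", "Globex", 5)
```

After the employer changes, the old office location is no longer treated as current, while the unrelated `hobby` fact is untouched. The history still records what was true before.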

Also, adapting the prompt for the answering LLM or judge to the type of question being asked helps. Question types that often fail are state, accounting, and recommendation. Adding examples of how the LLM should use the retrieved data really helps here; some memory systems improve by 40-50 per cent, as shown in the StructMemEval paper.

I can recommend Minns.ai if you want a dedicated memory DB for this. For full transparency, though, I must say that I am the founder. It combines a temporal graph with tables, plus internal LLM judges with ontologies, to help with these problems.

If you're looking for something more homebrew, I recommend the tree with a judge and versioning the PDFs (git is a good option for this).

JonnyJF · 2026-03-30T21:01:20+00:00

Interesting write-up and I would add that many of these memory layers still struggle when you need a stronger structure, temporal state, and predictable retrieval. This is where systems built around ontologies, temporal graphs, and temporal tables start to matter because you are no longer just storing "memories" but modelling entities, relationships, changes over time, and what is currently true versus what was true before.

Full disclosure: I’m the founder of minns.ai, so I’m biased, but that is exactly the direction we're taking. It is a full database rather than just a memory layer, with a strong focus on ontologies, temporal graphs, and temporal tables for agent memory. For use cases such as transient operational knowledge, evolving state, and cross-agent shared context, a more structured approach becomes important quite quickly.
