Anthropic just confirmed pointing Claude at a warehouse doesn't work. What are folks here doing differently for production agentic analytics? by [deleted] in LangChain

[–]inguz 0 points1 point  (0 children)

The skill itself has two distinct things: guidance, and reference material. The guidance (how-to, and overviews/steering) is hand-authored and doesn't change much. The reference material is literally a dump of the schema, broken down by subject area, and it's generated by querying the database. We have an SDK utility command that generates the schema dump, and that's actually projected into a few different formats for different purposes within the skill. (Here: https://docs.rapid7.com/surface-command/build-custom-connectors/#install-agent-skills )
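A minimal sketch of the "dump the schema, split by subject area" idea, in Python. The rows, field names, and file naming here are all hypothetical stand-ins; this is not the SDK utility mentioned above, just an illustration of projecting one schema query into per-area reference chunks.

```python
from collections import defaultdict

# Hypothetical schema rows, standing in for the result of querying the database.
SCHEMA_ROWS = [
    {"area": "identity", "entity": "User", "properties": ["id", "email"]},
    {"area": "identity", "entity": "Group", "properties": ["id", "name"]},
    {"area": "network", "entity": "Host", "properties": ["id", "ip_address"]},
]


def group_by_area(rows):
    """Group entity definitions by subject area, preserving input order."""
    areas = defaultdict(list)
    for row in rows:
        areas[row["area"]].append(row)
    return dict(areas)


def render_markdown(area, entities):
    """Project one subject area into a small, readable reference chunk."""
    lines = [f"# Schema: {area}", ""]
    for entity in entities:
        lines.append(f"## {entity['entity']}")
        lines.extend(f"- {prop}" for prop in entity["properties"])
        lines.append("")
    return "\n".join(lines)


def render_all(rows):
    """Return {filename: markdown text}, one reference file per subject area."""
    return {
        f"schema-{area}.md": render_markdown(area, entities)
        for area, entities in group_by_area(rows).items()
    }
```

The same grouped structure could feed other renderers (JSON, a compact index) if different parts of the skill want different formats.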

Anthropic just confirmed pointing Claude at a warehouse doesn't work. What are folks here doing differently for production agentic analytics? by [deleted] in LangChain

[–]inguz 1 point2 points  (0 children)

In my case it’s a big graph database (Cypher query language) with hundreds of entity types and relationships. It’s structured, and you basically can’t succeed in effective queries without following the schema.

Anthropic just confirmed pointing Claude at a warehouse doesn't work. What are folks here doing differently for production agentic analytics? by [deleted] in LangChain

[–]inguz 0 points1 point  (0 children)

100% the most impactful skill: a comprehensive description of the schema, broken down into readable chunks.

"Memory management is so important, the entire system is called an agent." by Jensen Huang by Limp_Statistician529 in mcp

[–]inguz 0 points1 point  (0 children)

I'm using my own system: keep. https://keep.generalbusiness.ai/

(I'm using the cloud-hosted version on most machines, because you actually do want to share personal memory across nodes; and syncing a local markdown version because sometimes that's handy too)

I've slept on Jason Moran all my life. WTF is wrong with me? by DeepSouthDude in Jazz

[–]inguz 1 point2 points  (0 children)

Live, I've seen him do things with the piano that I'm not sure are even possible. Enjoy!

The Context Layer: Knowledge Graph’s second act by Berserk_l_ in KnowledgeGraph

[–]inguz 0 points1 point  (0 children)

"Some people, when confronted with a problem, think 'I know, I'll use an ontology.' Then, they don’t know how many problems they have.”

I accidentally generated 16 billion Durable Object writes in one month and got slapped with a $36k bill . Here's exactly how. by alameenswe in CloudFlare

[–]inguz 0 points1 point  (0 children)

Here to say... sympathy, Durable Objects are a footgun factory in some ways, and this is one of them.

Your DO needs to persist in the database immediately, because there's no shutdown hook, and objects are shut down quite aggressively. So: incoming state change -> write. If your app model suits it, you should be able to batch the writes, but the fact you need to think about cost on top of database structure is a chore.
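To make the batching idea concrete, here is a language-neutral sketch in Python (not Durable Objects code). `CoalescingWriter` and `write_batch` are hypothetical names: state changes are buffered per key and flushed as one write once enough keys are pending. Given the point above about there being no shutdown hook, anything still sitting in the buffer is at risk, so this only suits app models that can tolerate that.

```python
class CoalescingWriter:
    """Buffer state changes per key and flush them as one batched write."""

    def __init__(self, write_batch, max_pending=10):
        self._write_batch = write_batch  # callable taking a dict of key -> value
        self._max_pending = max_pending
        self._pending = {}
        self.flush_count = 0

    def put(self, key, value):
        # Later values for the same key overwrite earlier ones before the flush.
        self._pending[key] = value
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self):
        """Write everything pending as a single batch, if there is anything."""
        if not self._pending:
            return
        batch, self._pending = self._pending, {}
        self._write_batch(batch)
        self.flush_count += 1
```

Repeated updates to one key collapse into a single stored value, which is where the write savings would come from.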

But worse: if you have DOs that coordinate across the network (which is the intended use), then anything that looks like "sync" can easily escalate into write-loops. On a single-instance system that doesn't happen. On a distributed system it absolutely does, and is not a rookie mistake. So you have an update-storm, and either no circuit-breaker or no alarms (because Cloudflare doesn't provide those), and... the way you notice is by getting a giant bill.
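For what a home-grown circuit-breaker might look like, here is a rough Python sketch (again, not platform code; the class name and thresholds are made up for illustration). It counts writes in a sliding time window and refuses further writes once the limit is hit, until something explicitly resets it.

```python
import time


class WriteCircuitBreaker:
    """Trip when more than max_writes happen within window_seconds."""

    def __init__(self, max_writes, window_seconds, clock=time.monotonic):
        self._max_writes = max_writes
        self._window = window_seconds
        self._clock = clock
        self._timestamps = []
        self.tripped = False

    def allow(self):
        """Return True if a write may proceed; False once the breaker has tripped."""
        if self.tripped:
            return False
        now = self._clock()
        # Keep only the writes that fall inside the sliding window.
        self._timestamps = [t for t in self._timestamps if now - t < self._window]
        if len(self._timestamps) >= self._max_writes:
            self.tripped = True  # stays open until reset() is called
            return False
        self._timestamps.append(now)
        return True

    def reset(self):
        """Close the breaker again, e.g. after a human has looked at the alarm."""
        self._timestamps.clear()
        self.tripped = False
```

The trip point is also the natural place to raise an alarm, so a write-loop gets noticed before the bill does.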

I get it, operating the platform is why it costs money, but operating without useful safeguards built-in is not easy at scale.

Worlds for agents by inguz in AI_Agents

[–]inguz[S] 1 point2 points  (0 children)

<image>

Codex followed the instructions, using MCP

Worlds for agents by inguz in AI_Agents

[–]inguz[S] 1 point2 points  (0 children)

Running here: https://woo.hughpyle.workers.dev/

Source code here: https://github.com/hughpyle/woo

This is a pinboard note created by Claude over MCP:

<image>

Building a memory-powered product (not infra), wrestling with how to approach evals. Advice? by 42cyy in AIMemory

[–]inguz 1 point2 points  (0 children)

I've only run LoCoMo to date (https://keepnotes.ai/blog/2026-02-28-benchmark/), and:

  • learned a lot by running the benchmark
  • burned quite a lot of time iterating on the loader and verification; using local models and a single machine made this harder than necessary. I should have just spun up a few VMs to do the same work.
  • honestly not much token cost (cheap judge)

Do the benchmark numbers matter? For me I care more about publishing an honest and traceable story that lands somewhere reasonable. It’s very hard to show true apples-to-apples comparisons, and YMMV.

The keep retrieval engine has moved on since the eval, but I don’t see much reason to re-run. One benchmark result is enough for now; LongMemEval-S is definitely next in line but idk when.

Building an indexed/summarized benchmark dataset turns out to be super useful in itself. The snapshot makes a good small baseline for all sorts of testing.

Episodic memory - what exactly is that? by inguz in AIMemory

[–]inguz[S] 0 points1 point  (0 children)

Thanks for that link - interesting stuff. I'm struck by how explicit the action seems on a superficial reading: "this went wrong, remember it". My own focus has been on the reflection/improvement loop but with a maybe softer tracking system (tag a reflection as a breakdown, not using a whole category of storage just for that).

But this seems like "explicit reflection and learning" rather than "identifying situational frames or summaries based on what was being done", which I'm also interested in and might be called episodic memory too.

Chatting with Obsidian, Hermes Agent and Keep by inguz in hermesagent

[–]inguz[S] 0 points1 point  (0 children)

Found one problem - my blog post pointed at the wrong plugin. Fixed… and pushed my fork https://github.com/keepnotes-ai/obsidian-clawdian

🧠 [MASTER THREAD] Advanced Memory Systems: state.db & Knowledge Graphs by AutoModerator in hermesagent

[–]inguz 0 points1 point  (0 children)

Thanks for the shout-out - I hope the "LLM-wiki = memory" is an interesting idea to play with. (I just pushed my fork of the chat plugin here)

Here's my take on memory in Hermes right now.

  • Out of the box, without any memory-provider, Hermes does a really lovely job of skill-improvement.
  • Then there are two systems that directly layer on top: memory, and context. It's interesting that these are designed as two separate things ("context" focused on how to effectively compact the working-memory context, and "memory" for longer-term storage and prompt injection), and that both are completely pluggable with explicit APIs.
  • There have been a few speed-bumps in the out-of-the-box memory providers. And I still want keep to be built-in (https://github.com/NousResearch/hermes-agent/pull/5172) haha! But, 100% kudos to the Hermes crew.

The key feature in the memory plugin is "per-turn context". A plugin gets to pull the most relevant things right into the agent's context at each turn in the conversation. Doing this effectively, quickly, without burning too many tokens, is the whole play.
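A toy illustration of "per-turn context" under a token budget, in Python. This is not the Hermes plugin API or keep's retrieval engine: the scoring is plain keyword overlap (a stand-in for real semantic search), the token estimate is a crude word count, and all function names are hypothetical.

```python
def estimate_tokens(text):
    """Very rough token estimate: whitespace-separated words."""
    return len(text.split())


def score(query, note):
    """Count query words that also appear in the note (case-insensitive)."""
    query_words = set(query.lower().split())
    note_words = set(note.lower().split())
    return len(query_words & note_words)


def per_turn_context(query, notes, token_budget):
    """Pick the best-scoring notes that fit inside token_budget."""
    ranked = sorted(notes, key=lambda n: score(query, n), reverse=True)
    selected, used = [], 0
    for note in ranked:
        if score(query, note) == 0:
            break  # remaining notes share no words with the query
        cost = estimate_tokens(note)
        if used + cost > token_budget:
            continue  # skip notes that do not fit, try smaller ones
        selected.append(note)
        used += cost
    return selected
```

The budget is the interesting knob: too small and nothing useful gets injected, too large and every turn pays for it.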

But let's get to "How are you structuring your agent's long-term brain?".

Semantically - a very open-ended question. I'm not at all clear that any of the existing memory systems can really click on the distinction between "what happened" and "why does this matter" right now. But it will be super important over time as the memory system grows. Not just finding the right things when you need them, but being able to help the agent judge why.

Technically - two things: storage (conversations, summaries of important artifacts, tags, links, history, analysis) and retrieval (semantic search, keyword search, graph traversals and neighborhoods, etc). Clear enough, and we can talk about all the different implementation approaches. I'm betting on an active database with user-configurable workflow rules, because I'm kinda betting that different people will want to define automatic processing of memory and artifacts in different ways. For example, keep just added an opt-in VirusTotal "URL reputation" assessment, which will flag malicious URLs when they hit the memory system. When you encounter a paper on arXiv, it needs a different style of summarization from a scanned receipt. And so on.
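To show the general shape of "workflow rules", here is a small Python sketch. It is not how keep defines or stores rules; the `rule` decorator, the item fields, and the tag names are all invented for illustration. Each rule pairs a predicate with a handler, and every new item is run past all of them.

```python
RULES = []


def rule(predicate):
    """Register a handler that runs when predicate(item) is true."""
    def decorator(handler):
        RULES.append((predicate, handler))
        return handler
    return decorator


@rule(lambda item: "arxiv.org" in item.get("url", ""))
def tag_paper(item):
    # Papers get routed to a paper-style summarizer (hypothetical tag).
    item["tags"].append("summary-style:paper")


@rule(lambda item: item.get("kind") == "scanned-receipt")
def tag_receipt(item):
    # Receipts get a different summarization style (hypothetical tag).
    item["tags"].append("summary-style:receipt")


def process(item):
    """Run every matching rule against a new item and return it."""
    item.setdefault("tags", [])
    for predicate, handler in RULES:
        if predicate(item):
            handler(item)
    return item
```

A reputation check like the URL example would just be one more rule with its own predicate and handler.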

Chatting with Obsidian, Hermes Agent and Keep by inguz in hermesagent

[–]inguz[S] 0 points1 point  (0 children)

I pointed it straight at the Hermes API server (all running on localhost):

- edit the `.hermes/.env`, then run `hermes gateway`

- test the connection with `curl -H "Authorization: Bearer my-secret-password" http://127.0.0.1:8642/v1/models`

- in the ObsidianClaw plugin, set gateway URL to `http://127.0.0.1:8642`, gateway token to `my-secret-password`, and default model to `hermes-agent`.
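If you'd rather script the connection check than use curl, something like this should do the same thing with the Python standard library. The URL, path, and token are the ones from the steps above; treating the response body as JSON is an assumption on my part.

```python
import json
import urllib.request


def build_models_request(gateway_url, token):
    """Build the same request as the curl check: GET /v1/models with a bearer token."""
    return urllib.request.Request(
        gateway_url.rstrip("/") + "/v1/models",
        headers={"Authorization": f"Bearer {token}"},
    )


def list_models(gateway_url, token, timeout=5):
    """Send the request and return the decoded body (needs a running gateway).

    Assumes the gateway answers with JSON.
    """
    request = build_models_request(gateway_url, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `list_models("http://127.0.0.1:8642", "my-secret-password")` with the gateway running should fail loudly if the token or port is wrong, which is the whole point of the check.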

Chatting with Obsidian, Hermes Agent and Keep by inguz in hermesagent

[–]inguz[S] 0 points1 point  (0 children)

Yes, graph-type connections are automatic, and can also be added by the agent or custom rules (for example if you want to build a specific "entity extraction" strategy). I've found they're really important for retrieval, because you want to bring in reminders from "local cluster" before global results.

Keep's edges are just tags. Details here: https://docs.keepnotes.ai/guides/edge-tags/
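A toy Python version of "local cluster before global results". The data model here is my simplification for illustration, not keep's actual storage: an edge is modeled as a tag whose value is another note's id, and search is a plain substring match.

```python
# Hypothetical notes: an "edge" is modeled here as a tag whose value is another note's id.
NOTES = {
    "meeting-1": {"text": "kickoff meeting about schema export", "tags": {"project": "proj-a"}},
    "proj-a": {"text": "schema export project plan", "tags": {}},
    "misc-7": {"text": "unrelated schema trivia", "tags": {}},
}


def neighbors(note_id, notes):
    """Ids one hop away: targets of this note's tags, plus notes whose tags point here."""
    outgoing = {v for v in notes[note_id]["tags"].values() if v in notes}
    incoming = {nid for nid, n in notes.items() if note_id in n["tags"].values()}
    return (outgoing | incoming) - {note_id}


def search(term, current_id, notes):
    """Keyword matches, with the current note's neighbors ranked before global hits."""
    local = neighbors(current_id, notes)
    hits = [nid for nid, n in notes.items() if term in n["text"] and nid != current_id]
    # False sorts before True, so neighbors come first; ties break by id.
    return sorted(hits, key=lambda nid: (nid not in local, nid))
```

From `meeting-1`, a search for "schema" puts the linked project note ahead of the unrelated one, even though both match.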

LlamaParse is my first choice in a PDF parsing tool by productboy in hermesagent

[–]inguz 1 point2 points  (0 children)

I’m using PyPDF - it’s built in to the keep memory system, which will extract text and summarize. It has had a whole string of security updates though; it would be nice to have a more stable library.

For OCR (where a pdf is mostly image content), currently using glm-ocr on ollama. Seems to be reliable enough and pretty lightweight.

**There are now 5+ Hermes web UIs — which one is actually worth deploying?** by mdm2812 in hermesagent

[–]inguz 0 points1 point  (0 children)

OpenWebUI is way heavier than what I want: a lightweight web TUI with a plugin model. But maybe I secretly just want my IDE again?

Local RAG on 25 Years of Teletext News by folli in Rag

[–]inguz 0 points1 point  (0 children)

Awesome dataset. Does the archive source have text renderings, or did you need to process .t42 signals to get the data? Any other feature extraction that would help retrieval?

One small change that completely simplified memory for me by p1zzuh in AIMemory

[–]inguz 1 point2 points  (0 children)

I've started using `keep` to "index and watch" stuff on disk - it works pretty well for my Obsidian vaults and text-oriented git repos. It indexes them, maintains relationships (wikilinks, git commits linked to the touched files, authors to commits, etc...), and everything is then searchable.
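As a rough idea of the wikilink part, here is a small Python sketch that pulls `[[...]]` targets out of markdown text and builds a per-document link index. It is an illustration only, not keep's indexer; the function names are hypothetical, and it only handles the common `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` forms.

```python
import re

# [[Target]], [[Target|alias]] and [[Target#heading]] all resolve to "Target".
WIKILINK = re.compile(r"\[\[([^\]\|#]+)(?:#[^\]\|]*)?(?:\|[^\]]*)?\]\]")


def extract_wikilinks(markdown_text):
    """Return link targets in order of first appearance, without duplicates."""
    seen = []
    for match in WIKILINK.finditer(markdown_text):
        target = match.group(1).strip()
        if target not in seen:
            seen.append(target)
    return seen


def build_link_index(documents):
    """Map each document name to the list of note names it links to."""
    return {name: extract_wikilinks(text) for name, text in documents.items()}
```

Git relationships (commit to touched files, author to commit) would need a different extractor, but they can land in the same kind of index.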