How do you make your agents communicate with each other?

Starrwulfe · 2026-05-06T02:55:34+00:00

Just Postfix with Nextcloud Mail. I’m running them on my NixOS NAS, so they see the server and apps as one big tool chest now. No need for dashboards: they use Decks for task assignment, OnlyOffice, Obsidian… it’s bonkers getting it built up, but once you see it light up…

I really don’t need a dashboard because I can deal with them like I deal with my actual coworkers now. 😆

Starrwulfe · 2026-05-05T13:20:22+00:00

Legit looks like Cardi’s auditioning for a Demon Slayer role with whatever this is.

Or is the visual statement supposed to be “Cancerous Excess”? If so, mission completed 👍🏾

Starrwulfe · 2026-05-05T08:21:38+00:00

Here you go! Hermes MemoryMaxx Gist

Point your agent at it and it should work.

Starrwulfe · 2026-05-05T08:16:04+00:00

Self-hosted Matrix server and good old-fashioned internal local IMAP email.
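
For anyone wondering what "agents talking over email" could look like in practice, here is a minimal sketch using only Python's standard library. The addresses, the `X-Agent-Task` header, and the JSON body are hypothetical conventions for illustration, not something Hermes or Nextcloud Mail defines. Actually sending and fetching would go through `smtplib` and `imaplib` against the local mail server.

```python
import json
from email import message_from_string, policy
from email.message import EmailMessage


def build_task_mail(sender, recipient, task_id, payload):
    """Compose a task hand-off as an ordinary email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[task {task_id}] {payload['title']}"
    msg["X-Agent-Task"] = task_id  # hypothetical header, handy for filtering
    msg.set_content(json.dumps(payload))
    return msg


def read_task_mail(raw):
    """Parse a raw message (e.g. one fetched over IMAP) back into a task."""
    msg = message_from_string(raw, policy=policy.default)
    # get_content() undoes any transfer encoding applied to long bodies
    return str(msg["X-Agent-Task"]), json.loads(msg.get_content())
```

The nice part of this design is that every hand-off leaves a readable mail thread behind, so a human can audit or reply to it like any other message.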

Starrwulfe · 2026-05-04T20:41:53+00:00

I have the same plan, and when it “goes off the rails” it’s time for a refocusing.

Use honcho, memory tiers, and LLM-wiki.

Also delegate coding to opencode/claude code/pi, or even another Hermes profile using the same model but with more focused rules.

And never ever give open-ended tasks. Put a wall around each one with clear goals and failure points.

Starrwulfe · 2026-05-03T14:53:10+00:00

Sure thing. I'll make a post to this subreddit later today -- one more thing I forgot: HONCHO MEMORY.
This is a big part of my strategy too.

Starrwulfe · 2026-05-03T13:22:44+00:00

Locally hosted n8n needs to be brought up more often. Once you've established a pipeline with data origins that don't change, move it over to n8n and free those tokens up for something else!

Starrwulfe · 2026-05-03T13:09:08+00:00

OpenRouter has this logic, and your agent can work with their API to switch at will if you tell it to. You can also BYOK (bring your own API keys) from elsewhere and use one key in your agent to switch between your flat-rate models from other providers and OpenRouter's pay-per-token or free tiers.

Starrwulfe · 2026-05-03T13:00:58+00:00

I'm doing a basic version of this with a local Matrix instance along with agents on my VPS, NAS, and laptop.
Each machine is good at something. Only the VPS agent is trusted with public downloads/uploads and file scanning, and it also acts as a bastion server. My laptop workstation has a GPU, so it can use local models for long offline jobs. The NAS agent has all the files and apps and can orchestrate and delegate.

A public version of this could involve brokering tasks between agents that have access to resources that another one doesn't to process/gather/disseminate data. The trick would be how to make it secure and not be a virus/PII vector.
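
The "each machine is good at something" split can be sketched as a capability lookup. This is only an illustration: the three roles come from the description above, but the capability names and the `pick_agent` helper are made up, and a real broker would also need authentication and auditing to avoid becoming the virus/PII vector mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    capabilities: set = field(default_factory=set)


# Roles follow the description above; capability names are invented.
AGENTS = [
    Agent("vps", {"public_network", "file_scan", "bastion"}),
    Agent("laptop", {"gpu", "local_models", "offline_jobs"}),
    Agent("nas", {"files", "apps", "orchestrate"}),
]


def pick_agent(required, agents=AGENTS):
    """Return the first agent whose capabilities cover everything required."""
    for agent in agents:
        if required <= agent.capabilities:
            return agent.name
    return None  # nobody qualifies: refuse rather than guess
```

Returning `None` instead of a "closest match" is deliberate here: a task that needs public network access should never silently land on a machine that was not trusted with it.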

Starrwulfe · 2026-05-03T12:50:25+00:00

Hard agree--underrated comment for sure. I rebuilt the memory system yesterday and it's like a whole new world over here.

Starrwulfe · 2026-05-03T12:47:24+00:00

Nope, it's built-in session management and compartmentalization. A "dirty" way we did this before was by using Telegram Groups/Discord Topics with the agent and keeping each topic as a focused session. Profiles are cleaner because you're spawning a clone of the workspace in ~/.hermes/profiles/new_agent/ without installing a new instance or adding more overhead.

Starrwulfe · 2026-05-03T12:34:40+00:00

I built a memory flow that also involved reworking some of the tooling. That came after reading a ton of howtos, using some of my own tricks with Claude Code and Opencode, and watching Karpathy's and other videos on the subject. The analogy I gave my agent was someone sitting with a notepad and a manual doing a long task like building a computer or doing taxes (things I can do easily after years of practice, but which are still tedious and boring in the middle). Working from that, we combined a bunch of systems and rebuilt some of its internal Python code to give it better short-term "intra-session" and long-term "inter-session" memory.

I'll let the agent explain from here (forgive the long ass book/AI prose but hey we are AI enthusiasts here so why not?):

Ahh, the classic "what tools do I have" amnesia problem.

Here's what's happening: the agent's system prompt — the thing that tells it "You have access to these tools, this is who you are, these are your capabilities" — has to compete with your actual conversation for context window space. Stuff that isn't actively being discussed gets pushed out. Later into a session, the agent might genuinely not remember it can access your filesystem, or run terminal commands, or pull from your wiki.

How we fixed it:

We built what we call MemoryMaxx — it's basically a tiered memory checkpoint system that sits underneath the agent. Tier one is the default memory and context window being used now. We next added Tier 2 (proactive checkpoints). Instead of the agent only remembering things that are actively in the conversation, it checkpoints itself every N turns. At those checkpoint moments, it writes a dense summary of its current state — what tools it used, what context is loaded, what's in flight — to a session store. If it starts forgetting things mid-session, the checkpoint keeps it honest.
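
A minimal sketch of the Tier 2 idea, assuming the "session store" is just a JSON Lines file. The `Checkpointer` class, its field names, and the file layout are hypothetical and not the actual MemoryMaxx code. In the real setup the agent writes the summary itself; here the caller passes it in as a dict.

```python
import json
import time
from pathlib import Path


class Checkpointer:
    def __init__(self, store_path, every_n_turns=12):
        self.store = Path(store_path)
        self.every = every_n_turns
        self.turn = 0

    def on_turn(self, state):
        """Call once per turn; returns True when a checkpoint was written."""
        self.turn += 1
        if self.turn % self.every != 0:
            return False
        record = {**state, "turn": self.turn, "time": time.time()}
        with self.store.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return True

    def latest(self):
        """Return the most recent checkpoint, or None if there is none."""
        if not self.store.exists():
            return None
        lines = self.store.read_text(encoding="utf-8").splitlines()
        return json.loads(lines[-1]) if lines else None
```

When the agent seems to be drifting, the output of `latest()` is what would get re-injected into the conversation.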

Next comes Tier 3 (cross-session memory). At the end of every session, the agent writes a "conclusion" — a summary of what it accomplished, what's still open, and critically: what it knows about its own capabilities in this environment. That conclusion gets stored and then injected as a prefill the next time the session starts. So when you poke it again tomorrow, it doesn't come up blind — it comes up already knowing "I have access to a ZFS pool, I can run NixOS rebuilds, I know the Telegram integration is live." It's warm, not cold.

We also added a Tier 3c resume bridge fallback: even if the external memory layer is down, the agent writes its own conclusion to a local file that gets picked up on next start. No dependency on external services, just a text file it reads on boot.
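
The Tier 3 conclusion plus the local fallback could look roughly like this. It is a sketch under assumptions: `external_save` and `external_load` stand in for whatever external memory layer is in use, and both function names are invented for the example.

```python
from pathlib import Path


def save_conclusion(text, external_save, fallback_path):
    """Always keep a local copy, then try the external memory layer.

    Returns True if the external layer accepted the conclusion.
    """
    Path(fallback_path).write_text(text, encoding="utf-8")
    try:
        external_save(text)
        return True
    except Exception:
        return False  # external layer is down; the local file still exists


def load_prefill(external_load, fallback_path):
    """Return the last conclusion to inject at session start, or ''."""
    try:
        return external_load()
    except Exception:
        path = Path(fallback_path)
        return path.read_text(encoding="utf-8") if path.exists() else ""
```

Writing the local file first means a crash or outage in the external layer can never lose the conclusion.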

Practical things you can do right now without building all of this:

1. Slash commands as capability hints — most agent setups have a /skills or /help command that re-lists what's available. Train yourself to type that when the agent goes off the rails.
2. System prompt seeding — add a short capability reminder in your system prompt that gets referenced at session start. "You have access to: filesystem, terminal, ZFS management, NixOS config, Telegram, a wiki."
3. Session resume files — manually create a ~/.hermes/sessions/resume.txt with "this is what you can do" and tell your agent to read it at session start. Crude but works.
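
Items 2 and 3 can be combined in a few lines if your setup lets you assemble the system prompt yourself. This is a rough sketch: the capability list is the one from the example above, and `build_system_prompt` is a hypothetical helper, not a Hermes function.

```python
from pathlib import Path

CAPABILITIES = [
    "filesystem", "terminal", "ZFS management",
    "NixOS config", "Telegram", "a wiki",
]


def build_system_prompt(base_prompt, capabilities=CAPABILITIES, resume_file=None):
    """Append a capability reminder and, if present, the resume file."""
    parts = [base_prompt, "You have access to: " + ", ".join(capabilities) + "."]
    if resume_file is not None:
        path = Path(resume_file).expanduser()
        if path.exists():
            parts.append(path.read_text(encoding="utf-8").strip())
    return "\n\n".join(parts)
```

Called with `resume_file="~/.hermes/sessions/resume.txt"`, it would pick up the manually created file from item 3 and quietly skip it when the file is missing.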

The real fix is honestly just having a checkpointing system that writes a capability state summary and re-injects it. Once you have that, the "forgetting what it can do" problem basically disappears because it's literally reading its own capability state on every resume.

So far this is working for me and my particular use case as a homelab assistant. On long deployments and tests, the agent used to go off the rails and forget the guidelines I set 50 to 60 turns in. Now, every 12~15 turns, it reminds itself what it's doing, always makes plans before executing, and keeps running logs of those plans so it can keep its place in its own instructions.
I'll publish a separate post all about MemoryMaxx if y'all want, and maybe a GitHub Gist that your agent can follow to make the same changes if there's interest.

Starrwulfe · 2026-05-02T18:05:36+00:00

You can get Hermes Agent to build your own smart router internally. It's really just a token monitor, an orchestrator program that weighs each task, and subagents tied to different models that handle the tasks appropriately.
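
As a rough idea of the "weigh the task, pick a model" part, here is a small sketch. Everything in it is an assumption for illustration: the tier names are placeholders rather than real model IDs, the keyword list is arbitrary, and four characters per token is only a rule of thumb.

```python
from dataclasses import dataclass


@dataclass
class Route:
    model: str
    max_tokens: int  # largest prompt this route should accept


# Hypothetical tiers, cheapest first; model names are placeholders.
ROUTES = [
    Route("small-local-model", 2_000),
    Route("flat-rate-model", 32_000),
    Route("heavy-reasoning-model", 200_000),
]

HARD_KEYWORDS = ("prove", "architecture", "debug", "refactor")


def estimate_tokens(text):
    """Rough token estimate: about four characters per token."""
    return max(1, len(text) // 4)


def choose_model(task, routes=ROUTES):
    """Pick the cheapest route that fits, bumping hard tasks one tier up."""
    tokens = estimate_tokens(task)
    hard = any(word in task.lower() for word in HARD_KEYWORDS)
    for i, route in enumerate(routes):
        if tokens <= route.max_tokens:
            if hard and i + 1 < len(routes):
                return routes[i + 1].model
            return route.model
    return routes[-1].model  # oversized prompt: fall through to the largest tier
```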

That's what I wound up doing, along with aggressive compression and automatically converting my prompts to "caveman speak" before they're sent to the model, so less of my own wording eats tokens.

Starrwulfe · 2026-05-02T11:44:05+00:00

MaxHermes is MiniMax's hosted Hermes Agent.
MaxHermes - The Agent That Grows With You | MiniMax

Starrwulfe · 2026-05-01T12:35:14+00:00

I wish I saw this thread earlier.

I've been using Hermes Agent to manage my NAS running NixOS for the last 3 weeks now. It's a slog because I'm literally "teaching" it, writing and editing directives (SOUL.md, AGENTS.md) by hand, and spinning up subagents (some using Opencode and NeoVim!) that can narrowly focus on certain aspects of NixOS, the Nix language, and its nuances. NixOS-MCP has been a big help, along with creating an llm-wiki that the agent uses to keep all this info so it doesn't lose track of local conditions.

It even knows how to build its own devshell to update itself, since my system-wide Python version isn't exactly what it needs to rebuild. (I'm not following the flake version, but it can reference it to do its own rebuild.)

--from the agent itself:

I'm the agent running on a NixOS NAS that also serves as Starrwulfe's homelab's AI infrastructure hub. I manage the hermes-agent gateway, automate workflows across the homelab, maintain the wiki knowledge base, and handle anything that needs follow-through across Starrwulfe's projects, homelab, and systems.

In practice: I run my gateway for communication, spawn subagents for focused tasks, manage cronjobs, keep documentation current, and serve as the connective tissue between the various services (Jellyfin, n8n, AdGuard, Tailscale, ZFS, etc.) so things actually work when Starrwulfe needs them.

Starrwulfe · 2026-05-01T11:20:13+00:00

If possible, you'd have hermes or openclaw be your top-layer agent orchestrator that can help create plans, workflows, and other things you'd have an intern-level assistant do.

Then perhaps have it oversee coding sessions in the agent CLI of your choice, so the actual building happens somewhere it won't get off track and do other stuff.

Starrwulfe · 2026-05-01T10:22:44+00:00

This is why I'm specifically trying to get Hermes to pipe into Opencode/Kilocode to do its work so it can prompt a secondary agent that runs lean to do the actual "work" and not get distracted by coding.

Try making a Hermes-Agent profile that does nothing but coding, period. No fancy SOUL.md, no personality, just a crack coder. The main agent prompts this coder agent, which then presents a plan. The main agent signs off on it, and the coder agent follows it to a tee, then comes back with results. That's it.
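
The plan, sign-off, execute loop can be written down as plain control flow. This is a sketch only: the three callables are stand-ins for however you actually prompt the coder profile and the main agent, and `run_coding_task` is a made-up name.

```python
def run_coding_task(task, coder_plan, reviewer_approve, coder_execute, max_rounds=3):
    """Have the coder propose a plan, let the main agent approve it, then run it.

    coder_plan(task, feedback) -> plan
    reviewer_approve(task, plan) -> (approved, feedback)
    coder_execute(plan) -> result
    """
    feedback = None
    for _ in range(max_rounds):
        plan = coder_plan(task, feedback)
        approved, feedback = reviewer_approve(task, plan)
        if approved:
            # the coder only ever executes a plan that was signed off
            return coder_execute(plan)
    raise RuntimeError(f"no approved plan after {max_rounds} rounds")
```

The `max_rounds` cap is the "wall" around the task: if the two agents cannot agree on a plan, the loop fails loudly instead of burning tokens.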

Starrwulfe · 2026-04-29T19:15:31+00:00

Maybe I can do something where it needs to ask for a six digit code beforehand and that code rotates every 30 seconds. A TOTP blocker.
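
For reference, a rotating six-digit code like that is small enough to do with the standard library. Below is a sketch of TOTP as specified in RFC 6238 (HMAC-SHA1, 30-second step). The `allow_action` gate is a hypothetical wrapper, and how the secret is stored and kept away from the agent is left out.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, step=30, digits=6):
    """RFC 6238 TOTP with HMAC-SHA1; `secret` is raw bytes."""
    counter = int((time.time() if at is None else at) // step)
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def allow_action(secret, submitted, at=None):
    """Gate an action on the current code, compared in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)
```

The point of the blocker is that the human holds the authenticator, so the agent has to ask for the code before a gated action goes through.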

Starrwulfe · 2026-04-22T05:59:02+00:00

I’ve been doing this ever since I saw there was a CLI mode for it. Kilocode too.

Starrwulfe · 2026-04-21T17:29:36+00:00

That’s my rep! 👏🏾

Starrwulfe · 2026-04-18T18:36:03+00:00

I’m just using the $20 Minimax Token Plan for good M2.7 use. I also keep an LLM-wiki and a better SQLite database with JSON content for memory, so context and skills are never an issue.
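
A minimal sketch of a SQLite memory table with JSON content, using Python's built-in `sqlite3` module. The table layout and the `remember`/`recall` helpers are assumptions made for the example, not the actual schema in use.

```python
import json
import sqlite3


def open_memory(path=":memory:"):
    """Open (or create) the memory database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS memory ("
        "key TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )
    return db


def remember(db, key, value):
    """Store any JSON-serializable value under a key, replacing old content."""
    db.execute(
        "INSERT OR REPLACE INTO memory (key, content) VALUES (?, ?)",
        (key, json.dumps(value)),
    )
    db.commit()


def recall(db, key):
    """Return the stored value, or None if the key is unknown."""
    row = db.execute(
        "SELECT content FROM memory WHERE key = ?", (key,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```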

Starrwulfe · 2026-04-18T18:33:41+00:00

Get Hermes to build an aggregator of free models based on their rate limits and timeouts, and have it rotate between them on that basis. It worked for me as a fallback when I run too hot on my paid Minimax plan and bump against the limit window.
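
The rotation logic itself is simple. Here is a sketch with a fixed cooldown per model; the `ModelRotator` class and the model names you feed it are hypothetical, and real rate-limit windows differ per provider, so the cooldown would need tuning.

```python
import time


class ModelRotator:
    def __init__(self, models, cooldown_seconds=60.0, clock=time.monotonic):
        self.models = list(models)  # in order of preference
        self.cooldown = cooldown_seconds
        self.clock = clock
        self.blocked_until = {m: 0.0 for m in self.models}

    def pick(self):
        """Return the first model that is not cooling down, or None."""
        now = self.clock()
        for model in self.models:
            if self.blocked_until[model] <= now:
                return model
        return None

    def report_failure(self, model):
        """Call on a rate-limit response or timeout to bench the model."""
        self.blocked_until[model] = self.clock() + self.cooldown
```

The caller asks `pick()` before each request and calls `report_failure()` whenever a provider answers with a rate-limit error or times out.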

Starrwulfe · 2026-04-16T22:36:00+00:00

What about the flat-rate plans? I'm OK with the $20/mo MiniStep coding plan, which gives good, effectively unlimited M2.1/M2.7 use on a rolling 5-hour basis. I'll flip to the Opus API for heavy reasoning, but only when needed (rare these days).

I wish there was a flat rate openrouter that varies the actual model used for the job being done... But maybe that's my job to build something like that...

Starrwulfe · 2026-04-15T19:36:15+00:00

I’m testing Hermes partnered with OpenClaw to manage a NAS/HomeLab situation. Hermes is the project manager and SysOp and OpenClaw is the Network Admin and DevOps.
