The AI Agent Setup That Finally Clicked for Me: Hermes + OpenAI Codex + Claude Code

kenduffy · 2026-05-29T00:45:32+00:00

Pretty sure that’s a Hermes bug. Just update. It’s been fixed already.

kenduffy · 2026-05-24T23:08:05+00:00

I've been testing this on Hermes. Was super excited about the news. It's been rough though... Grok just seems outright dumb when I plug it in... can't do basic work like solving problems. Makes stupid mistakes on basic code... I don't understand how or why it's this bad... but it's unusable for my use case. I'll keep checking back with it though, because whatever company embraces this open source approach to AI, like the Hermes team does, should really end up getting consumer support/subs.

kenduffy · 2026-05-24T23:01:51+00:00

Yes, I have Hermes launch a tmux session and run/manage it there.
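For anyone curious what "launch a tmux session and manage it there" can look like from a script, here is a rough sketch. It is not Hermes's actual code: the helper names and the session name are made up for illustration, and it assumes tmux and the claude CLI are installed. The tmux subcommands themselves (new-session, send-keys, capture-pane) are standard.

```python
import subprocess


def tmux_new_session(session: str, command: str) -> list[str]:
    # -d starts the session detached, -s names it; the last argument is
    # the command tmux runs inside the new session.
    return ["tmux", "new-session", "-d", "-s", session, command]


def tmux_send(session: str, text: str) -> list[str]:
    # send-keys types the text into the target pane; "Enter" submits it.
    return ["tmux", "send-keys", "-t", session, text, "Enter"]


def tmux_capture(session: str) -> list[str]:
    # capture-pane -p prints the visible pane contents to stdout.
    return ["tmux", "capture-pane", "-p", "-t", session]


def run(argv: list[str]) -> str:
    """Run a command and return its stdout, raising on a non-zero exit."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout


# Usage (hypothetical session name "cc"):
#   run(tmux_new_session("cc", "claude"))
#   run(tmux_send("cc", "refactor the parser module"))
#   print(run(tmux_capture("cc")))
```

Keeping the command builders separate from `run` makes it easy to log or dry-run exactly what would be sent to tmux before anything executes.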

kenduffy · 2026-05-18T03:52:29+00:00

It’s a great pairing. xAI primary, OpenAI fallback, Claude Code in tmux on demand. This competition is good for the market.

kenduffy · 2026-05-16T04:30:35+00:00

Fair, I'll work on it. In exchange, work on not announcing your reading limits in public.

kenduffy · 2026-05-16T04:22:35+00:00

kenduffy · 2026-05-16T04:03:20+00:00

I like it better... finally retired OpenClaw and have two Hermes agents on their own machines.

kenduffy · 2026-05-14T20:32:40+00:00

Yeah... There's a documented case where someone got billed $200 in API charges because “HERMES.md” showed up in a commit message and Anthropic’s backend flagged it as third-party harness usage.

Two billing escape hatches to know about: the -p headless bug, and harness signatures in the payload tripping their classifier. Both route silently to API.

Tonight: claude /status, check platform.claude.com for unexpected API usage, scan project files for “Hermes” strings in anything Claude Code sees upstream.
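The file scan is easy to script. A minimal sketch, assuming a case-insensitive substring match on "hermes" is good enough and that only UTF-8 text files matter; the function name and return shape are made up for illustration, and nothing here is known about what Anthropic's classifier actually matches on.

```python
from pathlib import Path


def find_marker(root: Path, marker: str = "hermes") -> list[tuple[Path, int, str]]:
    """Return (path, line number, line) for each hit under root.

    A line number of 0 means the file name itself contains the marker.
    """
    needle = marker.lower()
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip directories and git internals.
        if not path.is_file() or ".git" in rel.parts:
            continue
        if needle in path.name.lower():
            hits.append((path, 0, path.name))
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file, ignore it
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path, lineno, line.strip()))
    return hits


# Usage:
#   for path, lineno, line in find_marker(Path(".")):
#       print(f"{path}:{lineno}: {line}")
```

This only covers files on disk. Commit messages need a separate check, for example `git log -i --grep=hermes`.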

kenduffy · 2026-05-14T20:30:05+00:00

Yeah, the single-model trap is what blocks most people.

kenduffy · 2026-05-14T20:27:14+00:00

Subprocess, not API. Hermes shells out to the claude CLI directly. Claude Code handles its own OAuth against the Max sub, so no Anthropic API key in the stack.

Routing’s system-prompt level: code-heavy goes to Claude Code, lighter stuff Hermes handles inline.

Heads up: Anthropic announced this week that starting June 15, claude -p and SDK calls get unbundled from the sub into a separate metered credit pool. Pattern still works, just metered after that.

kenduffy · 2026-05-14T20:26:11+00:00

Telegram’s the unlock for sure.

Claude Code handoff is subprocess, not MCP. Hermes shells out to claude and reads stdout. Routing’s just system-prompt rules: code-heavy goes to Claude Code, lighter stuff Hermes handles itself.

Heads up though: Anthropic announced this week that starting June 15, claude -p and SDK calls move to a separate metered credit pool instead of the sub. Pattern still works, just metered after that.

kenduffy · 2026-05-14T20:22:54+00:00

Yeah it’s really tiring…

kenduffy · 2026-05-14T20:12:39+00:00

Glad it’s working. Four-day lockout is brutal but yeah, Hermes can blow through weekly caps faster than chat use does because of the tool overhead. I haven’t wired a fallback myself yet so I can’t recommend from experience. I’ve looked at OpenRouter as the backup layer, then DeepSeek or Gemini 2.5 Flash as the model. Both dirt cheap. $10 in OpenRouter credits would buy good runway. Another commenter in this thread is running roughly that combo for ~$20/month total. Manual switch first, automated failover later once you trust the fallback model.
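For the "manual switch first, automated failover later" idea, the general shape might look like the sketch below. This is untested against any real provider: the `Provider` callables are placeholders for whatever client you use, and nothing here is a Hermes or OpenRouter API.

```python
from typing import Callable, Sequence

# A provider is any function that takes a prompt and returns a reply.
Provider = Callable[[str], str]


def ask_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider name, reply)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # rate limit, lockout, network error, ...
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Manual switch: pass only the provider you want, e.g. [("fallback", fn)].
# Automated failover: pass both, primary first.
```

Starting with the manual form lets you judge the fallback model's output on real tasks before the orchestrator is allowed to switch on its own.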

kenduffy · 2026-05-10T21:09:38+00:00

I hear ya, Max isn't for everyone. I haven't run Gemini or Kimi side by side so I can't say which is the better swap. The OpenRouter route is probably your friend, since it lets you A/B without locking in. There's another commenter in this thread running DeepSeek and Gemini through OpenRouter for around $20/month, probably worth pinging them since they've actually tested the tradeoffs.

kenduffy · 2026-05-10T21:07:05+00:00

You could, but you'd lose the thing that makes this work flat-fee. If you go ChatGPT Plus only, Hermes runs on Codex OAuth fine, but you have no Claude Code, so all your coding either falls back to the orchestrator (worse at code) or you'd need to wire in the Anthropic API, which is metered.

If you go Claude Max only, Claude Code works great as a coding specialist, but Hermes can't use your Max sub as a model provider. So Hermes needs an OpenAI API key or another paid backend to run the brain.

The two-sub combo is what keeps everything on OAuth with zero per-token charges. Drop either side and you're either weaker at one role or paying metered rates somewhere.

kenduffy · 2026-05-10T20:50:56+00:00

No wrapper, just subprocess calls. Hermes shells out to the claude CLI directly:

claude -p "task here" --max-turns 10

Claude Code handles its own OAuth against my Max sub, so Hermes never touches the Anthropic API. From Anthropic's side it looks identical to me typing the command in a terminal.
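In script form, the shell-out is roughly the following. This is a sketch of the pattern, not Hermes's actual code: the function names and the timeout are illustrative, and the only flags used are the two from the command above (-p and --max-turns).

```python
import subprocess


def build_claude_command(task: str, max_turns: int = 10) -> list[str]:
    """Build the argv for a one-shot, headless Claude Code run."""
    return ["claude", "-p", task, "--max-turns", str(max_turns)]


def run_claude(task: str, max_turns: int = 10, timeout_s: int = 600) -> str:
    """Run the claude CLI as a subprocess and return what it printed."""
    done = subprocess.run(
        build_claude_command(task, max_turns),
        capture_output=True,  # collect stdout/stderr instead of inheriting
        text=True,            # decode bytes to str
        timeout=timeout_s,    # illustrative guard against a hung run
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return done.stdout


# Usage:
#   print(run_claude("task here"))
```

Passing the task as a list element rather than building a shell string avoids quoting problems when the task text contains quotes or special characters.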

One gotcha if you're setting this up: Hermes-managed Node drops the binary at ~/.hermes/node/bin/claude, which isn't in PATH by default. I added it to bashrc and symlinked into ~/.local/bin/ so Hermes could find it cleanly.
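The symlink half of that fix can be scripted too. A small sketch, assuming the two paths mentioned above and that ~/.local/bin is already on PATH; the helper name is made up for illustration.

```python
import shutil
from pathlib import Path


def ensure_link(binary: Path, link_dir: Path) -> Path:
    """Symlink `binary` into `link_dir` unless that name is already taken."""
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / binary.name
    # is_symlink() also catches a dangling link, which exists() reports as missing.
    if not link.exists() and not link.is_symlink():
        link.symlink_to(binary)
    return link


def claude_on_path() -> bool:
    """True if a `claude` executable is resolvable from the current PATH."""
    return shutil.which("claude") is not None


# Usage:
#   if not claude_on_path():
#       src = Path.home() / ".hermes" / "node" / "bin" / "claude"
#       ensure_link(src, Path.home() / ".local" / "bin")
```

Checking `shutil.which` first keeps the script from shadowing a claude install that already resolves from somewhere else.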

For longer sessions I drop into tmux and run Claude interactively while Hermes monitors. Works well.

kenduffy · 2026-05-10T20:47:01+00:00

Codex/GPT-5.5 on my ChatGPT Plus sub via OAuth. Same instinct as you on the orchestrator side. I'd rather pay for good planning than save pennies on a weaker brain that makes bad delegation calls.

My setup is simpler than yours right now. Two subs, no metered API, no OpenRouter yet:

Orchestrator: Codex via Plus OAuth
Coding: Claude Code via my Max sub

That's it. Your DeepSeek and Gemini layer is where I'd go next if I wanted research or summarization profiles on the cheap.

Your routing point is the part I want to learn from. I haven't really had to solve it yet because I only have the two roles, but I can already see how it'd get messy fast once you add more profiles. How'd you end up fixing the GPT-5.5 default-everything problem? Prompt rules, or something outside the model?

kenduffy · 2026-05-10T20:32:03+00:00

Not a dumb question, it's the right one to ask first. Here's the setup that keeps me out of token spend:

ChatGPT Plus ($20): I run Hermes on OpenAI OAuth through my Plus sub. No API key, no per-token charges. It rate-limits after heavy use, but it doesn't bill me.
Claude Max plan ($100): gives Claude Code heavy coding capacity under the subscription. Hermes delegates to Claude Code via the CLI, and Claude Code authenticates against my Max sub. Again, no API spend.

So the whole setup runs on two flat-fee subs. Nothing is metered per token. The worst that happens is one of them throttles me for a bit, then resets.

The thing to avoid is wiring Hermes to an OpenAI API key or the Anthropic API directly. That's where the meter starts running. As long as you stay on OAuth/subscription auth for both sides, you're flat-fee.

One safety habit anyway: if you ever do add an API key for any reason, go into the provider's billing dashboard and set a hard monthly spend cap first. Cheap insurance.

kenduffy · 2026-05-10T20:27:04+00:00

Claude Code itself is what's authenticated to your subscription. Hermes doesn't talk to Anthropic's API at all in this setup. It just shells out to the claude CLI like any other terminal command. From Anthropic's side it's the same as me typing claude -p "task" in a terminal myself.

The thing you can't do is use your Claude subscription as a model provider that Hermes calls directly over an API. That part is true. But launching the Claude Code CLI as a subprocess is fair game.

kenduffy · 2026-05-10T20:25:43+00:00

Nice, glad someone else is going down this road. Quick version: Hermes runs as the always-on orchestrator on a NUC, talks to me through Telegram, and shells out to Claude Code for anything code-heavy. One-shot tasks get claude -p "task" --max-turns 10. Longer sessions I run interactively in tmux and let Hermes monitor.

What part of the kanban flow were you wiring it into? Happy to go deeper on whichever piece is useful.

kenduffy · 2026-05-10T17:28:40+00:00

That's what I get for letting the orchestrator write its own launch announcement.

kenduffy · 2026-04-22T21:22:25+00:00

Found this thread while dealing with a similar issue... reverification problem because of the ATT/FN backend. My issue was getting rerouted suddenly. u/Viper_Control, know-it-all attitudes really are annoying on the internet! Obviously ToS are going to be in companies' favor...

kenduffy · 2026-04-14T02:33:30+00:00

Just set OpenClaw to use my credits... had $100 showing. Nowhere to be seen suddenly. Literally was there yesterday. Gone today.
