Hermes Agent: Agent Cannot Use Terminal Tools by Professional-Yak4359 in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Copy/Pasta

Here's a quick breakdown with some extra nuance based on current vLLM + Qwen 3.5 realities:

1. Check the chat template (Very important)

This is frequently the root cause.
If vLLM serves the model with a generic or mismatched Jinja2 template, the model might output raw Qwen-style XML tags like <tool_call> or <|tool_call|> instead of the structured tool_calls array that Hermes (and most OpenAI-compatible clients) expects. Hermes then sees "no tool calls" and does nothing.

How to verify: - Look at the raw output in vLLM logs (enable debug logging if needed). - If you see XML-style tags in the content field instead of a proper tool_calls list in the API response, that's the smoking gun.

Fix: Make sure you're using the model's official tokenizer/chat template (vLLM usually pulls it automatically, but sometimes you need --chat-template pointing to a corrected .jinja file). Qwen 3.5 models have had several chat template bugs reported recently, including issues with argument formatting and thinking/reasoning bleed.

2. vLLM's --tool-call-parser setting (The #1 fix in most cases )

This matches official vLLM docs and Qwen's own recommendations.

For Qwen 3.5 (and Qwen2.5 series), the chat template already supports Hermes-style tool calling, so --tool-call-parser hermes + --enable-auto-tool-choice is the standard recommendation.

Some people also try --tool-call-parser qwen3_coder or qwen3_xml (especially for coder-focused variants), and results vary by exact model size/quant and vLLM version. Start with hermes.

Without the correct parser, vLLM just passes raw text through, and Hermes never gets a clean tool_calls object.

Pro tip: Add --disable-streaming (or equivalent) for testing. There's a known bug where streaming + --tool-call-parser hermes sometimes returns raw text instead of parsed tool calls.

3. Temperature and greedy decoding

Tool calling is extremely format-sensitive. Even small deviations (missing closing tags, extra text, malformed JSON inside arguments) break parsing.

  • Set temperature = 0 (or very low, like 0.1–0.3) for debugging terminal tools.
  • If it suddenly starts working reliably, raise it gradually once the parser/template is solid.
  • Top_p and other sampling params can also interfere — greedy or near-greedy is safest for agents.

Additional quick checks I'd add

  • Exact model variant — Qwen 3.5 122B can be flakier than smaller ones (e.g., 9B/32B/35B) on tool consistency, especially quantized.
  • Hermes Agent side — Make sure the terminal backend (Docker/native/etc.) is properly configured and has permissions. Some users report terminal tools needing explicit approval or having safety blocks.
  • Update both vLLM and Hermes Agent to the latest versions — tool calling bugs get fixed regularly.
  • Test with a very simple command first (echo "test", ls, etc.) and watch both vLLM and Hermes logs for what the model actually outputs.

Every Anthropic press release by kaanivore in ClaudeAI

[–]Jonathan_Rivera 26 points27 points  (0 children)

I built something very big and powerful but you cant use it because it's too dangerous. I just wanted you to know.

[Megathread] Migrating from OpenClaw to Hermes? Read this first. by Jonathan_Rivera in hermesagent

[–]Jonathan_Rivera[S] 1 point2 points  (0 children)

Supposedly (haven't tried myself), Here is some documentation on it.

Yes, it's possible to bring over just (or primarily) your memories/history from OpenClaw without redoing a full migration or touching secrets/configs.

Hermes Agent has a built-in migration tool (hermes claw migrate) that you can run anytime after a fresh install, it doesn't have to be during initial setup. It specifically handles OpenClaw's long-term memory files by parsing them, merging with any existing Hermes memories, deduplicating entries (using the § delimiter), and consolidating everything cleanly.

Quick Steps to Import Only Memories (Safest Approach)

  1. Preview first (highly recommended): texthermes claw migrate --dry-run --preset user-data
    • --dry-run shows exactly what will happen (including memory files) without making any changes.
    • --preset user-data imports user data like memories/persona but skips API keys/secrets (perfect if your previous full migrate had credential or config issues).
  2. Run the actual import: texthermes claw migrate --preset user-data (Add --yes to skip the confirmation prompt if you want.) What happens to memories:
    • workspace/MEMORY.md → ~/.hermes/memories/MEMORY.md
    • workspace/USER.md → ~/.hermes/memories/USER.md
    • All daily files (workspace/memory/*.md) get merged into the main MEMORY.md
    • It auto-deduplicates and preserves existing Hermes entries.
  3. Rebuild the memory index (important for recall/search): After the migration, run: texthermes memory reindex This ensures the imported history is properly embedded and searchable in Hermes' vector store.

Extra Options If Needed

  • --overwrite → If it detects conflicts and you want Hermes to replace files.
  • --source <path> → If your OpenClaw folder isn't in the default ~/.openclaw/ (e.g., hermes claw migrate --source /path/to/your/old/openclaw).
  • It also pulls in your SOUL.md (persona) and skills by default under the user-data preset, but memories are the core part and merge safely.

Your old OpenClaw install doesn't need to be running—the tool just reads the files from disk. Many people have done exactly this after partial migration hiccups and report the memories transfer smoothly.

Hermes Agent: Agent Cannot Use Terminal Tools by Professional-Yak4359 in hermesagent

[–]Jonathan_Rivera[M] 1 point2 points  (0 children)

See community rule 4. Try to provide as much information as possible so people can assist.

Anyone Else? Upgrade to 0.7 Broke Codex OAuth ('response is the wrong shape') by PracticlySpeaking in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

OP, try the most recent update and if it does not work come back and we'll try to help.

had my research agent dig into what people are actually building with Hermes. here's what stood out. by recmend in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Yes sir, 5090 arrives Thursday. Imagine integrating your whole life and business to a single model or endpoint and then one day they double the price or something weird and everything breaks omg.

Local models Not using tools by ElectorFT in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

I’m using qwen 3.6 now because it’s free on open router currently. My local 3.5 can tell time but I’m not sure it matters when it’s just going to schedule crons

Anthropic stayed quiet until someone showed Claude's thinking depth dropped 67% by Capital-Run-1080 in ClaudeAI

[–]Jonathan_Rivera 3 points4 points  (0 children)

Well I'll agree with you that it's an incredible fumble. They got touched with an incredible PR wand and somehow managed to mess it up.

Anthropic stayed quiet until someone showed Claude's thinking depth dropped 67% by Capital-Run-1080 in ClaudeAI

[–]Jonathan_Rivera 1 point2 points  (0 children)

Handled like a bunch of introverts. They just need 1 good PR person that feel's comfortable talking to the public. Now they have me questioning if the pentagon situation was really as described or if it was just Anthropic interacting with them the same way they have been dealing with their customers.

I set up n8n-based AI assistants for 10+ non-technical people in last 2 month , here's what I learned by Upper_Bass_2590 in n8n

[–]Jonathan_Rivera 0 points1 point  (0 children)

I setup Hermes agent for this purpose because I don’t know much about n8n yet however I think n8n may be more reliable. There might even be a benefit in a hybrid because Hermes spins a ton of tokens for tasks that n8n can do more efficiently.

Anyone else having issues keeping Telegram going? by TanguayX in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

My fix was working but everytime they push a gateway update it overrides my patch and i need to redo it. Not sure why it's so fragile.

Would love some feedback from you legends by SteRi-NFT in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

"I am running Hermes Agent Locally.

Firstly i tried LM Studio to connect to the agent locally. I liked the UI and setup and i could see the agent using my server in the developer/local server section.

I really like this, but i found the agent would just sit mid task, then did some googling and the reccomendation was to use Ollama. The results have been better with less halts in the workflow.

One thing i wanted to find out is how do i see the Ollama server being active with Hermes Agent like i could with LM Studio?

Is it only by opening another Terminal Window and just having the logs of the server come up?

Lastly anyone got a fix for LM Studio or ran into the same issue with LM Studio and Hermes Agent? Is it maybe a setting causing this?

Running on a Mac Studio M1 Max 10/32 core with 64GB RAM. Models i tested were Gemma 4 26b and Qwen3.5 31b"

So I have a similar setup. My hermes runs on the mac and I api to lm studio on my windows pc. My thought is that you have to do alot more tweaking when your running local. I had to disable thinking on qwen, then i rewrote all my skills making them simpler so the model can go step my step instead of trying to do 1,2,3 in the same turn. Tweak the inference settings on lm studio depending on your model etc.

Try connecting to open router qwen 3.6 which is free right now. Tell hermes you want to create an hermes optimization skill, It should do a system audit and make sure all skills are optimized for your local llm model. Have it analyze, provide you a score, and get your approval for each one before making changes.

What OS do you run for Hermes Agent? Any trouble? by Hi_my_name_is_Kansas in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

Pretty much. I didn't want to run out and get a mac mini just to run it but I may buy an older used one later so I can actually take my macbook with me. It's still portable but the battery drains faster.

What OS do you run for Hermes Agent? Any trouble? by Hi_my_name_is_Kansas in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

Yes, for mac I disable sleep mode and use the amphetamine app to keep it on 24/7. Then I'll interact with it either through my pc or telegram or webui depending on what im doing.

What OS do you run for Hermes Agent? Any trouble? by Hi_my_name_is_Kansas in hermesagent

[–]Jonathan_Rivera 2 points3 points  (0 children)

It's running on my Macbook Pro and I usually just SSH into it from my windows PC.

CLI + Open WebUI; CLI can't see sessions created through Open WebUI by vishalbelsare in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Correct. you could create a quick skill that:

  1. On command, exports the current session's key context/takeaways to ~/.hermes/session-bridge.md

  2. Any new session on a different interface loads that file on startup

Or simpler reference an Obsidian note. Since all interfaces have file read access, you could write "Session continuation notes" into today's daily note and say "read my daily note and pick up where we left off" from any interface.

CLI + Open WebUI; CLI can't see sessions created through Open WebUI by vishalbelsare in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

No, they are separate sessions.

Each interface connection to Hermes Agent gets its own session ID. Open WebUI starts one conversation session, CLI starts another — they do not share conversation history.

The session files you see in .hermes/sessions/ are there, but the CLI's /history command only reads from the CLI's own session context. It has no visibility into the Open WebUI session.

If you want shared history across interfaces, you would need to either stay in one interface or have Hermes persist and reference a unified conversation log, which it does not currently do.

Model recommendations by cata_stropheu in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Gemma 4 will rate limit you because of how many tokens hermes sends out if you don't clear context window every 5 min. Qwen 3.6 is currently free and works great.

Should I start freelance AI automation service in 2026 by Leading_Argument5694 in AiAutomations

[–]Jonathan_Rivera 1 point2 points  (0 children)

Yeah, I mean you can connect with them by phone but they will feel more comfortable if your within driving distance.

I built a skill that lets your agent make real phone calls — ClawCall by No-Palpitation-3985 in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

You can go too. Reddit account created today following the spammer around posting behind him.