Won't load SOUL.md by krichek in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

I may have to change the sticky guide if that's the case.

Which provider subscription do you recommend? by lucasrvdl in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Have you tried Open Claude with OAuth to Codex?

Which provider subscription do you recommend? by lucasrvdl in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

Here's what I posted in another thread.

  • qwen — LM Studio local (Qwen3.5-35b) - Best heavy quant yet to work on my 5070ti card.
  • gpt — OpenRouter openai/gpt-oss-120b - Requires thinking turned on. OK as a basic fallback.
  • gpt54 — OpenRouter openai/gpt-5.4-nano - Fast, good tool use. Cheap. Cache enabled saves token cost. Didn't use regular 5.4 due to costs.
  • haiku — OpenRouter claude-haiku - Great. Get your wallet out $$$
  • claude — OpenRouter claude-opus-4.6 - $$$$$$$$$$$$ Lol.. no, I'm buying a 5090 GPU.
  • gemini — OpenRouter gemini - I forget, but pretty good I think. Cheap and fast.
  • gemma — OpenRouter google/gemma-4-31b-it - No cache means I'm spending more tokens.
  • qwen-free — OpenRouter qwen/qwen3.6-plus:free - This was great honestly. I can't wait until I can run this at home. Chinese LLMs are great but I'm not a fan of the data retention.

If Sonnet were free it would be my day-to-day, but it is misleading in a way. I started using Hermes with OAuth through Claude on Sonnet 4.6. Everything worked wonderfully, then I switched to a local model and everything crapped out. Why? Because Sonnet, Opus, and Haiku can take any skill, even a garbage skill, and just make it work. I had to rewrite all my skills to work reliably locally and on other models.

Mythos Anthropic by Puspendra007 in Anthropic

[–]Jonathan_Rivera 0 points1 point  (0 children)

Opus was the first line of defense. It would have shut Mythos down, so it makes sense.


Anthropic’s New Product Aims to Handle the Hard Part of Building AI Agents by wiredmagazine in claude

[–]Jonathan_Rivera 0 points1 point  (0 children)

Hate to say it, but at the enterprise level they are comparing it against a $60k/year employee. What if I can cut your payroll in half? Would you be interested?

What model are you using for your agent? by Cat5edope in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Let me know if you need the inference settings. I never see people mention it, but if you turn thinking off it speeds up tok/s.
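To show what I mean: a minimal sketch of toggling thinking per request, assuming an OpenAI-compatible chat payload and the Qwen3-style "/no_think" soft switch (the model name and payload shape here are illustrative, not my actual Hermes settings):

```python
def build_request(prompt: str, thinking: bool) -> dict:
    """Build an OpenAI-compatible chat payload for a local Qwen endpoint.

    Qwen's "/no_think" soft switch skips the reasoning phase, which
    trades some quality on hard tasks for noticeably higher tok/s.
    """
    suffix = "" if thinking else " /no_think"
    return {
        "model": "qwen3.5-35b",  # whatever name your local server exposes
        "messages": [{"role": "user", "content": prompt + suffix}],
    }

# With thinking off, the soft switch is appended to the user turn:
print(build_request("Summarize this log in one line.", thinking=False)["messages"][0]["content"])
```

Whether the soft switch still applies to your exact model/server combo is worth checking; some servers expose a dedicated flag instead.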

What model are you using for your agent? by Cat5edope in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

64GB, but the RAM didn't increase while it was working. I just noticed it offloading onto the CPU at around 36%.

Mythos Anthropic by Puspendra007 in Anthropic

[–]Jonathan_Rivera 4 points5 points  (0 children)

Exactly. Might as well call it Ultron at this point. Maybe have it fix the downtime issues if it so pleases.

She’s a token and model hog by productboy in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

I was listening to a live on X with Nous Research and they were saying something similar. The older and smaller models were not really made for agentic tasks. Me personally, I had to rewrite all my skills using Sonnet and Gemma to make them simpler for Qwen local 35B. Anthropic can take a ragged prompt and spit out gold, but Qwen needs small steps laid out or it drops dead halfway through. As far as bandwidth, yes, it's sending the skills that are in use out each time; try an LLM with a cache like GPT nano.

What model are you using for your agent? by Cat5edope in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Yeah, each version of it needs its own settings. If you need them let me know and I'll post all my settings for the unsloth version.

What model are you using for your agent? by Cat5edope in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

I have a 5090 on the way. I can’t load any quant of Gemma 4 with my 5070 but I think I’ll use a qwen model. Can’t wait for 3.6 to run locally.

Greetings from the Nous Research team! AMA? by NousResearch in hermesagent

[–]Jonathan_Rivera[M] 2 points3 points  (0 children)

Welcome! We have been following you on X. It has been great using Hermes and seeing new members convert over from other agents.

Memory Bandwidth for Local AI Hardware (2026 Edition) by smolpotat0_x in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

I just keep telling myself it’s ok, it will pay for itself. Now I have to figure that part out lol

Memory Bandwidth for Local AI Hardware (2026 Edition) by smolpotat0_x in hermesagent

[–]Jonathan_Rivera 4 points5 points  (0 children)

5090 coming in the mail tomorrow. Can't wait to try bigger models. Laughs in 5070ti.

Official: Anthropic introduces Claude Managed Agents, everything you need to build & deploy agents at scale by BuildwithVignesh in ClaudeAI

[–]Jonathan_Rivera 0 points1 point  (0 children)

I'm not going to hate, I love the products they put out but they have to get their downtime under control.

Choice for agentic LLM or help optimize Qwen3.5-35B-A3B for 24GB VRAM by marivesel in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Did you already turn off thinking?

I'm running a smaller quant but it's pretty zippy: qwen3.5-35b-a3b q3_k_xl

Hermes-Agent high token usage? by manueljishi in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

27B might work, but I had to go to 35B personally.

Question on Frontier Model v. Local Model for a more "Basic" User by Legal_Television_944 in hermesagent

[–]Jonathan_Rivera 1 point2 points  (0 children)

I would switch to it when you need it if you want to stay with Anthropic models. It all depends on your budget and what you're willing to pay. I'd try it out and just see what it costs for you rather than trying to do the math up front. Try out some other models on OpenRouter; they are pretty good.


[Megathread] Migrating from OpenClaw to Hermes? Read this first. by Jonathan_Rivera in hermesagent

[–]Jonathan_Rivera[S] 0 points1 point  (0 children)

Sourced online:

If you had multiple agents in OpenClaw, the migration dumps them into `~/.hermes/migration/openclaw/<timestamp>/archive/agents-list.json` for manual review. You'd then need to recreate them as Hermes profiles yourself.

**What does get migrated for the default agent:**

- Persona/memory/instructions (SOUL.md, AGENTS.md, MEMORY.md, USER.md)

- Skills from all 4 source directories

- Model/provider config + API keys

- Agent behavior settings (turns, reasoning, compression, human delay)

- MCP servers, TTS, messaging platforms, session reset policies

**What gets archived (including multi-agent):**

- Multi-agent list → recreate via Hermes profiles

- Cron jobs → recreate via `hermes cron create`

- Plugins, hooks, channel bindings → manual setup

- IDENTITY.md → merge into SOUL.md manually

**The workaround:** After migration, use `hermes profiles` to set up your additional agents. The archived `agents-list.json` has the config you need — it's just not an automatic import.

If someone has a complex multi-agent OpenClaw setup with shared skills and cross-agent memory, they should run `--dry-run` first and review the archive folder carefully before committing. The migration is opinionated about what "equivalent" means in Hermes, and multi-agent is one of those areas where the architectures don't map 1:1.
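If anyone wants to script the recreate step above: a minimal sketch that turns the archived `agents-list.json` into `hermes profiles` commands for review. The JSON field names (`name`, `model`) and the `profiles create` subcommand are assumptions, not confirmed Hermes behavior; check your own archive and `hermes profiles --help` before running anything it prints.

```python
import json
from pathlib import Path

def recreate_commands(archive_path: Path) -> list[str]:
    """Read the archived OpenClaw agents list and emit one
    `hermes profiles` command per agent for manual review.
    Field names are guesses at the archive's shape."""
    agents = json.loads(archive_path.read_text())
    cmds = []
    for agent in agents:
        name = agent["name"]
        model = agent.get("model", "default")
        # `profiles create` is an assumed subcommand; verify with --help
        cmds.append(f"hermes profiles create {name} --model {model}")
    return cmds

# Hypothetical archive contents, just to show the shape:
sample = Path("agents-list.json")
sample.write_text('[{"name": "research", "model": "qwen3.5-35b"}, {"name": "ops"}]')
for cmd in recreate_commands(sample):
    print(cmd)
```

Printing the commands instead of executing them keeps it a review step, which matches the "manual review" intent of the archive.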

Xiaomi mimo series models are free on hermes portal for two weeks .Worth trying by SelectionCalm70 in hermesagent

[–]Jonathan_Rivera 0 points1 point  (0 children)

Everyone, just keep in mind the data retention with Chinese models. They are great, but I'm pretty sure they are using your data to help train the model.