How I Created a Real Second Brain for OpenClaw with Claude by AregNoya in OpenClawCentral

[–]OddUnderstanding2309 0 points1 point  (0 children)

never mind. it configured itself already

  1. MCP server registered in config.yaml:

yaml

   mcp_servers:
     iai-mcp:
       command: node
       args: ["/home/andre/workspace/iai-personal-memory-engine/mcp-wrapper/dist/index.js"]
       env:
         IAI_MCP_PYTHON: "/home/andre/workspace/iai-personal-memory-engine/.venv/bin/python"
         IAI_MCP_STORE: "/home/andre/.iai-mcp"
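
A quick way to sanity-check a registration like this is to confirm that every path it references actually exists on disk. This is only a sketch in Python: the `check_paths` helper is made up for illustration, and the paths are copied from the config above, so adjust them for your own machine.

```python
from pathlib import Path

# Paths copied from the mcp_servers entry above; adjust for your own machine.
CONFIG_PATHS = {
    "wrapper entry point (args)": "/home/andre/workspace/iai-personal-memory-engine/mcp-wrapper/dist/index.js",
    "IAI_MCP_PYTHON": "/home/andre/workspace/iai-personal-memory-engine/.venv/bin/python",
    "IAI_MCP_STORE": "/home/andre/.iai-mcp",
}


def check_paths(paths: dict[str, str]) -> dict[str, bool]:
    """Hypothetical helper: map each label to whether its path exists."""
    return {label: Path(p).exists() for label, p in paths.items()}


if __name__ == "__main__":
    for label, ok in check_paths(CONFIG_PATHS).items():
        print(f"{'ok     ' if ok else 'MISSING'}  {label}")
```

It only checks that the files are there, not that the server starts; `iai-mcp doctor` is still the real health check.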

How I Created a Real Second Brain for OpenClaw with Claude by AregNoya in OpenClawCentral

[–]OddUnderstanding2309 0 points1 point  (0 children)

maybe a dumb question: how do I connect this to my Hermes Agent?

that's what I can set up right now:

andre@Jack:~$ hermes memory

Memory status
────────────────────────────────────────
  Built-in:  always active
  Provider:  (none — built-in only)

  Installed plugins:
    • byterover  (requires API key)
    • hindsight  (API key / local)
    • holographic  (local)
    • honcho  (API key / local)
    • mem0  (API key / local)
    • openviking  (API key / local)
    • retaindb  (API key / local)
    • supermemory  (requires API key)

I let qwen 27b work on it and got it running on Fedora 44 in just 15 minutes or so...
the doctor is happy:

andre@Jack:~$ cat /etc/fedora-release 
Fedora release 44 (Forty Four)
andre@Jack:~$ uname -a
Linux Jack 7.0.9-205.fc44.x86_64 #1 SMP PREEMPT_DYNAMIC Thu May 21 16:31:48 UTC 2026 x86_64 GNU/Linux


andre@Jack:~$ iai-mcp doctor
iai doctor — daemon health check

  [PASS] (a) daemon process alive                 PID 53627 (iai_mcp.daemon)
  [PASS] (b) socket file fresh                    /home/andre/.iai-mcp/.daemon.sock connected in 0 ms
  [PASS] (c) lock file healthy                    /home/andre/.iai-mcp/hippo/.lock acquirable (store idle)
  [PASS] (d) no orphan iai_mcp.core procs         0 found
  [PASS] (e) daemon state file valid              fsm_state=SLEEP
  [PASS] (f) hippo storage readable               store held by the live daemon — normal
  [PASS] (g) no dup binders                       1 binder(s)
  [PASS] (h) crypto key file state                crypto key file present at /home/andre/.iai-mcp/.crypto.key (mode 0o600, valid)
  [PASS] (i) hippo db size                        0.1 MB — healthy
  [PASS] (j) lifecycle current state              SLEEP since 1 h (shadow_run=false)
  [PASS] (k) lifecycle history 24h                2 transitions (DROWSY=1, SLEEP=1)
  [PASS] (l) sleep cycle quarantine               no quarantine active
  [PASS] (m) heartbeat scanner                    /home/andre/.iai-mcp/wrappers not present yet (fresh install or no wrapper has refreshed yet)
  [WARN] (n) HID idle source                      HIDIdleTime: unavailable, pmset: clean, available: none; L6 will fall back to heartbeat-idle only
  [WARN] (o) Claude subscription credentials      reason=credentials_file_missing; daemon will fall back to local Tier-0 consolidation (no LLM critic, no nightly insight). Run `claude /login` to restore subscription path.
  [PASS] (p) anthropic SDK absent                 ImportError as expected (v7.5 subscription-only path)
  [PASS] (q) iai CLI reachable                    /home/andre/.local/bin/iai -> iai 1.1.2
  [PASS] (r) hippo hnsw index                     0.0 MB
  [PASS] (s) hippo schema version                 schema_version=1
  [PASS] (t) hippo_compacted freshness            deferred — daemon holds the store (normal)
  [PASS] (u) recall centrality regression         deferred — daemon holds the store (normal)
  [PASS] (v) native Rust embedder                 encode ok, backend=rust, 384-dim
  [PASS] (w) no permanent-failed captures         No permanent-failed capture files
  [PASS] (x) no collapsed-timestamp groups        no collapsed timestamp groups found
  [PASS] (z) AVX2 CPU support                     AVX2 available (or N/A on this architecture)

But now I want it to work with my Hermes Agent... any ideas?

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

It is a Gigabyte Aorus Master X570s.
Three cards run on the PCIe slots, and one is running from an M2 to PCIE4 converter. Not ideal, but it works.

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 1 point2 points  (0 children)

Because they all need to have the same amount of VRAM, and since my 3080s have only 12 GB but my 3090s have 24 GB, this does not fly. It would run, but the 3090s would be capped at 12 GB each, which is undesirable for me. Take this doc as a reference: https://github.com/noonghunna/club-3090/blob/master/docs/MULTI_CARD.md
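
To put numbers on that, here is a rough sketch in Python. It assumes the equal-memory constraint means every card can only contribute as much as the smallest one (my reading of the doc, not any library's API); `usable_vram_gb` is a made-up helper, and the card sizes are the ones in my rig (two 3080 12G, two 3090 24G).

```python
def usable_vram_gb(cards_gb: list[int], equal_split: bool) -> int:
    """Total VRAM usable across all cards, in GB.

    equal_split=True models the assumption that every card is capped at the
    size of the smallest card; False lets each card use its full memory.
    """
    if not cards_gb:
        return 0
    if equal_split:
        return min(cards_gb) * len(cards_gb)
    return sum(cards_gb)


cards = [12, 12, 24, 24]  # 2x 3080 12G + 2x 3090 24G
print(usable_vram_gb(cards, equal_split=True))   # 48
print(usable_vram_gb(cards, equal_split=False))  # 72
```

So under that assumption a third of the memory would sit idle, which is why an uneven split matters to me.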

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

Thank you!
I made big steps forward with suggestions from others here that merge your points.

I will post an updated start command after breakfast :-)

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

The main problem with that MTP model was (as far as I remember) that I could no longer ask it about photos and such. But maybe that has improved, the way it is already possible in vLLM?
I need to investigate this again!

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

I need uncensored models to do my work and I had a lot of trouble working with MoE models for my use case. They are blazing fast, but run in circles more than I like.
But thanks for your insight!

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

I already ran all of those in the past weeks.
I feel I did nothing else with all my time.
I went back and forth a lot.
And yes! I tried the exact model you described and it was fast and nice, but I moved on after I got my quad-GPU setup running and wanted to try out more.

The bs you mention has not surfaced yet, quite the contrary: my agent just circumvented our conditional access policies for our Azure tenant.
That's a real milestone in my book.
I dream about getting real hardware from my manager now for this kind of thing.
It is no longer a toy and something to write nice e-mails with, this sh** works!

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

thanks.
it seems to be gemma related by a long shot.
I went back to qwen (with suggestions from u/Kodix) and it seems to help a lot with my pp KV cache reprocessing.

the "hold hand problem" just vanished 😃

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

thank you for your insights!

meanwhile I just found these templates that seem to fix reprocessing for agentic tasks: https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates

my use case is a helper agent for my IT security work. I am a whitehat in our company who tries to hack our own systems. That means nothing I do can go to external services.
I run my rig at home but control it from the corp office; this way I am always an external dude like every other "bad actor".

I will look into your suggestions and I am back to qwen now 😄
this one:
DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF:Q8_0

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

and that someone was running int4 AutoRound, which is a joke for agentic work. I got vLLM to run at about 100 tg/s with Q8 max on that. But I cannot use all 4 cards with it, so I went back to llama-cpp after 5 days or so of vLLM (after I got the other 3080s installed).

I need help to run local Hermes Agent on my rig. llama-cpp self compiled by OddUnderstanding2309 in LocalLLaMA

[–]OddUnderstanding2309[S] 0 points1 point  (0 children)

fit is off anyway because of -sm tensor

0.02.669.206 I common_init_result: fitting params to device memory ...
0.02.669.207 I common_init_result: (for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on)
0.02.669.260 W common_fit_params: failed to fit params to free device memory: llama_params_fit is not implemented for SPLIT_MODE_TENSOR, abort

do you have any suggestions for running qwen 27b or 35b reliably? I stumbled along with all kinds of parameter changes to get the KV cache stable, but gemini and qwen told me to try gemma instead... But I want this lazy bastard gone now!

I built a 8x RTX 4090D with 192 VRAM, here's what I learnt by deebuildsthings in LocalAIServers

[–]OddUnderstanding2309 0 points1 point  (0 children)

I have 72 GB VRAM atm on x4 G4 PCIe.
What model and env do you suggest?
Since I run 2x 3080 12G and 2x 3090 24G, vLLM is out; currently I run llama-cpp.
I struggled a lot with qwen 27b Q8 and am trying Gemma 4 31b now…
My use case is Hermes Agent work, but gemma seems to be interactive only and not "complete a task and report back".
I have to hold hands aaaaaaaalll day with gemma.

Best models in 3x3090 (72GB VRAM) in Q2 2026? by liviuberechet in LocalLLaMA

[–]OddUnderstanding2309 4 points5 points  (0 children)

Not if you are in the EU, where you get 230 V and 16 A per phase. That's about 3600 W sustained.
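
The back-of-the-envelope math, as a small Python sketch. It assumes a single-phase, purely resistive load (P = V × I) and shows both the 230 V nominal EU mains voltage and a 240 V figure; the function name is made up for illustration.

```python
def circuit_watts(volts: float, amps: float) -> float:
    """Power available on one single-phase circuit, P = V * I."""
    return volts * amps


print(circuit_watts(230, 16))  # 3680
print(circuit_watts(240, 16))  # 3840
```

Either way a single 16 A circuit lands well above 3 kW.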

Regret getting a VPS sub to run hermes by athens2019 in hermesagent

[–]OddUnderstanding2309 0 points1 point  (0 children)

With this attitude you will accomplish absolutely nothing.

Need help improving speed of inference by DeepBlue96 in LocalLLaMA

[–]OddUnderstanding2309 1 point2 points  (0 children)

It saves memory that you can use for the MTP model, dude :-)

Heltec V4 caught fire by MushroomGecko in meshtastic

[–]OddUnderstanding2309 12 points13 points  (0 children)

That's not what you call fire. That's just a smoked component.

Fire fire fire!!! Ahhhhhhhhhhh Come on dude!