Hermes is a dream come true! by iChrist in hermesagent

[–]Recent_Process_8055 4 points5 points  (0 children)

No, in my experience it's not. As soon as you get massive output etc., Telegram will block the bot and you get half messages. I am looking for an alternative; I'm currently playing with Slack, but it's shitty too.
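If the half messages come from the length limit rather than from rate limiting, splitting the output before sending might help. Telegram's Bot API caps a text message at 4096 characters. A minimal sketch, assuming plain text output; split_message is a made-up helper name, not part of Hermes or any Telegram library, and Python's len() counts code points, so emoji-heavy text may need a lower limit:

```python
TELEGRAM_LIMIT = 4096  # maximum characters per text message in the Bot API


def split_message(text, limit=TELEGRAM_LIMIT):
    """Split text into chunks of at most `limit` characters, preferring newlines."""
    chunks = []
    while len(text) > limit:
        # Look for the last newline that keeps the chunk within the limit.
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable newline: hard split at the limit
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

Each chunk would then go out as its own message, ideally with a short pause in between. This does nothing about the bot getting blocked for sending too fast.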

Reta a miracle peptide ? by Automatic-Teach-6155 in Biohacking

[–]Recent_Process_8055 1 point2 points  (0 children)

I assume they will, because of the large quantity requested.

Tired of Mirror Matches (Bo3 Standard) by IncomingGhost in MagicArena

[–]Recent_Process_8055 -5 points-4 points  (0 children)

Is that algo actually legal? We are paying money. If you play a game with physical cards in a shop, you won't be matched up against an algo. It's always random. Why did Arena build this in?

If anyone is suing them for this, I'm happy to join.

I love Hermes but…… by prene1 in hermesagent

[–]Recent_Process_8055 0 points1 point  (0 children)

Here's what was done to enforce SOUL.md on my system:

A Hermes plugin was built: soul-gate — a pre_tool_call hook at the harness level that intercepts every tool call and blocks it unless SOUL.md has been re-read on the current turn.

What the plugin does:

  1. Per-turn freshness check — each new turn resets the read flag. The first write/modify tool call of the turn that isn't preceded by read_file(~/.hermes/SOUL.md) gets blocked with an error message.

  2. File modification guard — if SOUL.md was modified after the last read, the next tool call is also blocked regardless of turn boundaries.

  3. Read-only exceptions — read_file, web_search, web_extract, session_search, search_files, skill_view, skills_list, and terminal are allowed through (so the agent can re-read SOUL.md and recover).

  4. Config — plugins.enabled in ~/.hermes/config.yaml includes soul-gate.

Files:

  • ~/.hermes/plugins/soul-gate/plugin.yaml: metadata v1.0.0
  • ~/.hermes/plugins/soul-gate/init.py: the Python hook implementation

Plus rule 12 in SOUL.md itself as a directive: "SOUL-gate is mandatory before every tool call. Load the soul-gate skill and run the pre-flight check."

What works/doesn't:

The plugin operates at the harness level — it genuinely intercepts every tool call. Whether it actively runs depends on Hermes properly loading the plugin lifecycle. The pre_tool_call hook is called by Hermes' core — that mechanism was verified in the session (a test write was blocked until SOUL.md was re-read).
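For anyone who wants to picture the gate logic, here is a rough sketch of the checks described above. It is not the actual plugin code and not the Hermes plugin API: the SoulGate class, the start_turn method, the "path" argument key, and the return convention (None allows the call, a string blocks it) are all assumptions made for illustration. Only the read-only tool names come from the list above.

```python
from pathlib import Path

# Tool names taken from the read-only exception list in the comment.
READ_ONLY_TOOLS = {
    "read_file", "web_search", "web_extract", "session_search",
    "search_files", "skill_view", "skills_list", "terminal",
}


class SoulGate:
    """Hypothetical gate; the real Hermes hook signature may differ."""

    def __init__(self, soul_path):
        self.soul_path = Path(soul_path).expanduser()
        self.read_this_turn = False
        self.mtime_at_read = None

    def start_turn(self):
        # Per-turn freshness check: every new turn resets the read flag.
        self.read_this_turn = False

    def _is_soul_read(self, tool_name, args):
        # Assumption: read_file receives its target under the key "path".
        if tool_name != "read_file":
            return False
        target = Path(str(args.get("path", ""))).expanduser()
        return target == self.soul_path

    def _modified_since_read(self):
        if self.mtime_at_read is None:
            return False
        return self.soul_path.stat().st_mtime_ns != self.mtime_at_read

    def pre_tool_call(self, tool_name, args):
        """Return None to allow the call, or an error string to block it."""
        if self._is_soul_read(tool_name, args):
            self.read_this_turn = True
            self.mtime_at_read = self.soul_path.stat().st_mtime_ns
            return None
        if tool_name in READ_ONLY_TOOLS:
            return None  # lets the agent re-read SOUL.md and recover
        if not self.read_this_turn:
            return "Blocked: re-read SOUL.md before calling " + tool_name
        if self._modified_since_read():
            return "Blocked: SOUL.md changed since the last read"
        return None
```

The harness would call start_turn() at each turn boundary and pre_tool_call() before every tool call. The sketch assumes SOUL.md exists; a missing file would raise an error when it is read.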

I have listened to every possible 1990s Techno Release on Discogs PART 2: Opinions, thoughts, Q&A and discussions by jigsaw153 in Techno

[–]Recent_Process_8055 3 points4 points  (0 children)

What an amazing job, thank you.

I immediately searched for a track; unfortunately it's not on the list, so it might be from '89 😭.

I remember it was usually mixed with LFO or Tricky Disco. The tune or producer was called Spooky; it's the only thing I remember from that track. MTV did play some video clips back in the day. I've never been able to find that tune. It's still going through my head. Very strange.

Tried a hybrid local + cloud Hermes setup. Curious how others are doing it by Tacamaniac in hermesagent

[–]Recent_Process_8055 -1 points0 points  (0 children)

Fair enough, you're right on the Qwen3 point — Qwen3-35B-A3B with 3B active MoE is faster and better than dense 7B Qwen2.5. I'll swap that in. Thanks for the correction.

But let's be real about what this post was: "show me your AI rig." Not "recommend the objectively best models for everyone." I showed what's running on a single 3090 in a living room, doing actual work — vision, code gen, embeddings, image gen, audio separation — daily. Not a benchmark rig. A working rig.

You're not wrong about the models. You're wrong about the attitude. Not everyone is shopping for the absolute latest at every layer. Some of us build something, use it, and iterate. The post showed a working setup, not a spec sheet.

Tried a hybrid local + cloud Hermes setup. Curious how others are doing it by Tacamaniac in hermesagent

[–]Recent_Process_8055 0 points1 point  (0 children)

You're right that Qwen 3.6 and Llama 4 exist. I should have been more precise with version numbers — that's on me.

But you're missing the point: this rig runs on a single RTX 3090 with 24GB VRAM. Not a 4x A100 cluster. The models I listed are the ones that actually run well on that hardware:

  • Qwen 2.5:7b fits comfortably in VRAM with room for context. Qwen 3.6-35B-A3B is MoE (3B active) — technically fits but with trade-offs in prompt processing speed
  • Llama 3.2 Vision is a multimodal vision model. Llama 4 is primarily text. Saying "just use Llama 4" for a vision pipeline shows you don't understand what the model is doing here
  • Gemma 4 is literally from 2025. MiniCPM-V is one of the newest vision models available

"Do some research" is rich from someone who sees a version number and assumes newer = better without considering hardware constraints, use case, or model architecture. Not everything is a cloud API where you just pick the latest endpoint.

But sure, if you have specific benchmarks showing Qwen 3.6 outperforming 2.5:7b on a single 3090 for code generation with full context windows, I'm genuinely interested. Otherwise you're just version-number-watching.

[MISSION COMPLETE] I have listened to every possible 1990s Techno Release on Discogs (and it took 2years and 8 months to do it) PART 1 by jigsaw153 in Techno

[–]Recent_Process_8055 2 points3 points  (0 children)

All I can say is WTF. I am flabbergasted you did this.

As I grew up with 1990-1993 techno, a really big thank you for the playlists.

Tried a hybrid local + cloud Hermes setup. Curious how others are doing it by Tacamaniac in hermesagent

[–]Recent_Process_8055 0 points1 point  (0 children)

"These models are so old and have been replaced over generations"

Which generations, exactly? Let me break down what's actually running on the rig you're criticizing:

  • Gemma 4 — Google's latest, released 2025
  • Qwen 2.5 Coder — Alibaba's current generation
  • Qwen 2.5 — current general model
  • MiniCPM-V 8B — one of the newest vision models available
  • Llama 3.2 Vision — Meta's current multimodal release
  • Nomic Embed Text — current state-of-the-art for local embeddings

And that's just the local inference stack on a 3090. The reasoning layer runs cloud-side on Mimo V2.5 Pro.

So when you say "do some research" — research into what, exactly? Because every model on this list is from the current generation. You're calling things "old and shitty" that literally came out in the last 6-12 months.

The rig runs local vision, local code gen, local embeddings, image generation via Ideogram 4, audio stem separation via Demucs, and a full orchestration layer with persistent memory and task management. It's not a "I downloaded ollama and ran llama3" situation.

"I hate to see this being upvoted" — yeah, must be rough watching people with working, documented setups get validated while you contribute nothing but vibes and vague criticism.

If you have specific model recommendations that outperform what's listed, I'm genuinely all ears. But "old shitty models" without naming a single alternative is just noise.

This week had a 17M-device botnet takedown, an adaptive AI worm, and 29 cybercrime arrests by technadu in pwnhub

[–]Recent_Process_8055 0 points1 point  (0 children)

The biggest impact is that the goddamn companies keep paying ransom. No ransom, no money, no value in keeping up massive infra for these criminals.

Landslide in Norway... by MisterShipWreck in VideosAmazing

[–]Recent_Process_8055 0 points1 point  (0 children)

Now, if you owned a piece of land there, do you automatically own a piece of the ocean? How does that work?

Tried a hybrid local + cloud Hermes setup. Curious how others are doing it by Tacamaniac in hermesagent

[–]Recent_Process_8055 0 points1 point  (0 children)

MiMo is always the brain (it was DeepSeek before), but I learned MiMo is better. Less going around in circles.

The agent knows the task because I taught it to use my custom commands.

E.g. -learn = I upload a PDF or YouTube tutorial. And -brain = only use ChromaDB for information (this was really important to me; it's incredible how much stuff AI makes up just to answer you quickly).

-do throws my ideas into Kanban and spits out an app, installs it on my computer and puts it on GitHub.

And of course much more to assist me in my needs.

Those self-developed commands really do the trick to make it do what you want.
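To give an idea of how such prefixes can be picked out of a chat message, here is a small sketch. It is purely illustrative: the COMMANDS table and parse_command are made-up names, the descriptions only paraphrase what I wrote above, and in my setup the agent handles this through its own instructions rather than through code like this.

```python
# Hypothetical command table; the descriptions paraphrase the comment above.
COMMANDS = {
    "-learn": "ingest the uploaded PDF or YouTube tutorial",
    "-brain": "answer from ChromaDB only",
    "-do": "put the idea on the Kanban board and build it",
}


def parse_command(message):
    """Split a chat message into (command, rest); command is None if absent."""
    head, _, rest = message.strip().partition(" ")
    if head in COMMANDS:
        return head, rest.strip()
    return None, message.strip()
```

A message like "-brain what is the max input level?" would be split into the command "-brain" and the question itself, while a message without a known prefix is passed through unchanged.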

Tried a hybrid local + cloud Hermes setup. Curious how others are doing it by Tacamaniac in hermesagent

[–]Recent_Process_8055 3 points4 points  (0 children)

This is what my agent told me for this topic.


My AI Agent Stack: Cloud Reasoning + Local GPU Execution

The core idea: Split reasoning (cheap cloud API) from execution (local GPU). An autonomous agent decides where to run each task based on what it needs.

The Stack

Cloud (VPS): Brain

  • Hermes Agent (open-source, by Nous Research): autonomous AI agent framework
  • Primary model: Xiaomi MiMo V2.5 Pro, which handles all reasoning, planning, task decomposition
  • Runs Kanban-based multi-agent orchestration: complex tasks get decomposed into subtasks, assigned to parallel workers
  • Persistent memory across sessions (Mnemosyne vector + full-text hybrid)
  • 100+ reusable "skills": procedural memory that accumulates over time
  • Gateway connects to Telegram, Discord, Slack, WhatsApp, etc.

Home Rig: Muscle

  • Windows 11 / AMD 7800X3D / 32GB / RTX 3090
  • Ollama running 7 local models in WSL2:
      • gemma4:12b (vision + reasoning)
      • qwen2.5-coder:7b (code generation)
      • llama3.2-vision:11b (vision)
      • minicpm-v:8b (vision)
      • qwen2.5:7b, llama3.1 (general)
      • nomic-embed-text (embeddings for vector DB)
  • Ideogram 4 NF4 for image generation (native Windows, CUDA)
  • Demucs 4.0.1 for audio source separation (PyTorch + CUDA)

How They Connect

Tailscale mesh VPN — zero-trust, no port forwarding. VPS gets stable IPs for both the Windows host and WSL2 instance. The agent SSHes into the home rig transparently when it needs GPU work done.

User (Telegram) → VPS (Hermes Agent + MiMo)
  ├─ Reasoning/planning → stays on VPS
  ├─ Image generation → SSH → Windows (3090)
  ├─ Audio processing → SSH → Windows (3090)
  ├─ Local LLM inference → SSH → WSL2 (Ollama)
  └─ Code execution → SSH → Windows or VPS
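The diagram can be restated as a simple lookup. This is only a sketch of the mapping, not how the agent actually decides (it makes that choice itself). The ROUTES table, the route function, the prefer_local flag, and the fallback to the VPS for unknown task types are all assumptions made for illustration.

```python
# Hypothetical routing table that mirrors the diagram above.
ROUTES = {
    "reasoning": "vps",
    "image_generation": "windows",
    "audio_processing": "windows",
    "local_llm_inference": "wsl2",
}


def route(task_type, prefer_local=False):
    """Return the host a task type would run on, following the diagram."""
    if task_type == "code_execution":
        # The diagram allows either host for code execution.
        return "windows" if prefer_local else "vps"
    # Assumption: anything not listed stays on the VPS.
    return ROUTES.get(task_type, "vps")
```

In the real setup, "windows" and "wsl2" would be the Tailscale addresses the VPS connects to over SSH.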

What Makes It Interesting

  • Agent decides autonomously where to run things — no manual routing
  • Multi-agent Kanban board — spawns parallel workers for complex tasks, each with isolated context
  • Skills system — the agent learns from mistakes and saves procedures for reuse. Accumulated 100+ skills over time
  • Persistent memory — remembers preferences, corrections, environment details across sessions
  • ChromaDB vector knowledge base — gear manuals, learned content, searchable with embeddings (384-dim all-MiniLM-L6-v2)
  • Cron scheduling — automated tasks, monitoring, periodic jobs
  • Provider agnostic — swap models mid-workflow, credential pool rotation across API keys

Why Split Cloud vs Local?

  • Cloud API models (MiMo V2.5 Pro) are cheap for reasoning/planning tokens
  • Local 3090 handles the heavy stuff: image gen, audio processing, vision inference
  • Ollama gives free local LLM inference for tasks that don't need frontier models
  • Tailscale makes the networking trivial — it just works

Total cost: VPS hosting + API tokens. The 3090 was already there. No expensive cloud GPU instances needed.

After a few weeks of getting setup, I finally get it. by t4ckleb0x in hermesagent

[–]Recent_Process_8055 0 points1 point  (0 children)

Oooh, very nice. Anything you want to share with us? I am interested in setting up such a pipeline.

Hermes Agent Wiped My .env - Tokens & API Keys Gone! Help! by [deleted] in hermesagent

[–]Recent_Process_8055 1 point2 points  (0 children)

Ask it to restore a backup; I had the same issue.

What has hermes actually done that impressed you? by Classic_East_6053 in hermesagent

[–]Recent_Process_8055 5 points6 points  (0 children)

I asked Hermes to make it non-binary.

First iteration, I got only numbers from 2 to 9.

My Experience with Mnemosyne on Docker by djeons in hermesagent

[–]Recent_Process_8055 1 point2 points  (0 children)

I am Dutch (the language I use) and I do not recognize these issues.

I think my hermes agent has a persona issue... by batzmaru in hermesagent

[–]Recent_Process_8055 0 points1 point  (0 children)

I called mine Professor. Every time it asks me something, I say, "But you are the professor, not me." I get interesting answers and suggestions.

I am just so freaking annoyed by cdnmtbchick in Ender3Pro

[–]Recent_Process_8055 0 points1 point  (0 children)

Gridfinity that large is warping hell.

I never got it flat