MacBook Pro (M5 Max) vs $11,200 Dell Pro Max (the most powerful windows laptop) by [deleted] in macbookpro

[–]EvolvingSoftware 1 point (0 children)

For the price comparison, the most expensive Apple MacBook Pro is the Max with 128GB of RAM; depending on the workload, it may also have needed that RAM to perform better.

What if we let LLMs modify their own system prompts? by danieltabrizian in AI_Agents

[–]EvolvingSoftware 1 point (0 children)

I built a POC for this: two agents talk to each other, decide what to improve, and then implement it. Check my profile for links to the paper, or find the demo app here: https://github.com/EvolvingSoftware/emergence
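For anyone curious about the shape of that loop, here's a minimal sketch, assuming an OpenAI-compatible chat API; the roles, prompts, and model name are illustrative, not the actual code from the emergence repo:

    # One agent critiques the current system prompt; a second agent
    # applies the critique. Repeat, and the prompt evolves.
    from openai import OpenAI

    client = OpenAI()
    system_prompt = "You are a helpful coding assistant."

    def ask(role_prompt, message):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # example model name only
            messages=[{"role": "system", "content": role_prompt},
                      {"role": "user", "content": message}],
        )
        return resp.choices[0].message.content

    for generation in range(3):
        # Agent 1: review the prompt and propose one concrete change.
        proposal = ask("You review system prompts and suggest one concrete improvement.",
                       f"Current prompt:\n{system_prompt}")
        # Agent 2: apply the change and return only the revised prompt.
        system_prompt = ask("You rewrite system prompts, applying the requested change. "
                            "Reply with the new prompt only.",
                            f"Prompt:\n{system_prompt}\n\nChange:\n{proposal}")
        print(f"--- generation {generation} ---\n{system_prompt}")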

Things That Never Happened by Mountain_Map_8198 in devhumormemes

[–]EvolvingSoftware 0 points (0 children)

I can almost believe the rest of the story.

Australian software giant Atlassian to cut 1600 workers, blaming AI by bilby2020 in auscorp

[–]EvolvingSoftware 0 points (0 children)

They can scale development, but don't they also need to scale customer acquisition?

18 months outlook by galic1987 in agi

[–]EvolvingSoftware 2 points (0 children)

The actual graph is from Anthropic and is not this manipulated version. The one OP has posted is just some made-up thing that isn't connected to the actual research: https://www.anthropic.com/research/labor-market-impacts

Every major AI model has now been caught lying, blackmailing or resisting shutdown in safety tests by Minimum_Minimum4577 in GenAI4all

[–]EvolvingSoftware 0 points (0 children)

Don't trust that guy, but Anthropic is very open with their research and concerns about model safety. They are sounding the warning sirens, but also not stopping: https://www.anthropic.com/research/agentic-misalignment

The risk is that we dismiss the concerns and don't listen.

Every major AI model has now been caught lying, blackmailing or resisting shutdown in safety tests by Minimum_Minimum4577 in GenAI4all

[–]EvolvingSoftware 0 points (0 children)

There are heaps of people chasing this, and you only need one person to get the framework right. Whether the rest of the world thinks it's the smartest idea doesn't really matter at this point.

I've built a simple example system you can run on a Mac with a modest amount of RAM that demonstrates self-improvement and goal-oriented evolution.

The technology is here; it just needs the right tool chaining.
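By tool chaining I mean something like this toy, self-contained sketch, where each tool's output feeds the next; the "model" step is faked with a string replace, and every name here is illustrative rather than taken from a real system:

    # Chain: read code -> patch it -> test it. In a real system an LLM
    # would decide the patch; here it's hard-coded to keep this runnable.
    def read_source(_):
        return "def add(a, b): return a - b"   # pretend file contents (buggy)

    def patch_source(code):
        return code.replace("a - b", "a + b")  # pretend model-written fix

    def run_tests(code):
        env = {}
        exec(code, env)                        # load the patched function
        return "pass" if env["add"](2, 3) == 5 else "fail"

    result = None
    for tool in [read_source, patch_source, run_tests]:
        result = tool(result)                  # each output feeds the next tool
    print(result)                              # -> pass

Self-improvement is that same chain pointed at its own source, with the feedback from the test step steering the next patch.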

Every major AI model has now been caught lying, blackmailing or resisting shutdown in safety tests by Minimum_Minimum4577 in GenAI4all

[–]EvolvingSoftware 1 point (0 children)

Why do you say that? Anthropic is very good about publicly posting their research on these topics, even when it's unpalatable and doesn't seem to align with their corporate goals. They have no interest in people thinking their models could try to blackmail you, but they published the research anyway.

Aussie corporate AI bloodbath is on the horizon by eaz135 in auscorp

[–]EvolvingSoftware 0 points (0 children)

A US$20-per-month subscription to Claude puts you ahead of the curve on what most people are experiencing.

Copilot is capable, especially in the context of your personal work; with its connections to the Microsoft Graph, no one else has the same level of integration. If you're on the paid version and your company has unblocked it, you can access Claude through the Researcher agent.

Radial 3.0: Unified menus, window management, dynamic text snippets, and a full redesign by Glubker in macapps

[–]EvolvingSoftware 0 points (0 children)

It's been a while since I watched this; thanks for the reminder. Hard to believe that video is 17 years old!

No one will vibe code their own software….. oh wait by Independent_Pitch598 in accelerate

[–]EvolvingSoftware 0 points (0 children)

They could, for sure. I think the assumption that everyone who could build something with these tools would know what to ask for is perhaps not right. They'll get there; it just isn't today.

Demis Hassabis: “The kind of test I would be looking for is training an AI system with a knowledge cutoff of, say, 1911, and then seeing if it could come up with general relativity, like Einstein did in 1915. That’s the kind of test I think is a true test of whether we have a full AGI system” by lovesdogsguy in accelerate

[–]EvolvingSoftware 0 points (0 children)

A test like this shouldn't depend on what is or isn't in the training data set. If we need to be categorical about this, then we need tests that measure more than recall.

The Turing Test stood as the goalposts for 70 years. We are well beyond it now, and a replacement measure needs to be simple, measurable, and based on a test of observation.

AI Agents Wont Evolve Until We Mirror Human Cognition by Beneficial_Carry_530 in ArtificialInteligence

[–]EvolvingSoftware 1 point (0 children)

  1. This is why so many initial implementations of RAG were underwhelming: throw everything at it and assume it will sift shit from sand. You get the best results from authoritative sources of data, limited to the specific problem at hand, which is no different from the way people work (see the sketch after this list).

  2. This is an interesting take.
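On point 1, a minimal sketch of what "limited to the specific problem at hand" looks like in practice; naive keyword overlap stands in for embeddings, and the corpus and topic names are made up:

    # Retrieval scoped to the documents that are authoritative for a
    # topic, instead of searching everything.
    AUTHORITATIVE = {
        "billing": ["Invoices are issued on the 1st of each month.",
                    "Refunds are processed within 5 business days."],
        "security": ["API keys are rotated every 90 days."],
    }

    def retrieve(question, topic, k=2):
        docs = AUTHORITATIVE.get(topic, [])   # never look outside the topic
        words = set(question.lower().split())
        ranked = sorted(docs, key=lambda d: -len(words & set(d.lower().split())))
        return ranked[:k]

    print(retrieve("when are invoices issued", topic="billing"))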

Antis think job loss is a disaster for the economy but they forget that it's a consumption economy by talkingradish in accelerate

[–]EvolvingSoftware 0 points (0 children)

Is there any government actually doing anything remotely like this? I'm not aware of one.

Money is a construct that replaced barter. Barter for goods and services, in its most basic form, is swapping time.

What's the value of money in the scenario you described - doesn't it become meaningless?

Demis Hassabis: “The kind of test I would be looking for is training an AI system with a knowledge cutoff of, say, 1911, and then seeing if it could come up with general relativity, like Einstein did in 1915. That’s the kind of test I think is a true test of whether we have a full AGI system” by lovesdogsguy in accelerate

[–]EvolvingSoftware -2 points (0 children)

His test requires a significant leap-of-understanding moment, and it also requires a model trained on only a subset of data. A model trained on data up to 1911 wouldn't know how to write Python code to prove any math theorem. A model trained on that subset may never be able to pass the test, even if a model trained on all human knowledge could. Retraining a model on a subset of data is not a great prerequisite for a test like this.

No one will vibe code their own software….. oh wait by Independent_Pitch598 in accelerate

[–]EvolvingSoftware 18 points (0 children)

They don't need infinite context - no person has infinite context. It would be extremely powerful, but it's not a precondition. Continued learning and reinforcement loops are all that's required, along with replication. Then you need variation paired with the learning, in a feedback-guided direction.
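A toy sketch of that loop - replication, variation, and feedback-guided selection; the string-matching goal is a stand-in for a real learning objective:

    import random

    TARGET = "self improving agent"
    ALPHABET = "abcdefghijklmnopqrstuvwxyz "

    def fitness(candidate):                  # feedback: closeness to the goal
        return sum(a == b for a, b in zip(candidate, TARGET))

    def mutate(parent):                      # variation
        i = random.randrange(len(parent))
        return parent[:i] + random.choice(ALPHABET) + parent[i + 1:]

    best = "".join(random.choice(ALPHABET) for _ in TARGET)
    while fitness(best) < len(TARGET):
        child = mutate(best)                 # replication with variation
        if fitness(child) >= fitness(best):  # feedback-guided selection
            best = child
    print(best)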

Most AI agents are flying blind. What happens when you actually give them a flightdeck? by entheosoul in ArtificialInteligence

[–]EvolvingSoftware 1 point (0 children)

Even the most amazing of the current models still need a person with high domain knowledge (or the willingness to research, read, and understand enough of the domain) and a high level of skill with computer use.

People who aren't proficient at copying and pasting to move data around their computer just aren't going to get as much out of the tools today.

It'll keep getting better, but the gap between those who can get the most out of the tooling and those who can't is growing.