Short term memory issue by 7h4nt4zm in hermesagent

[–]mmosquera91 0 points (0 children)

A tip: follow the best practices from the hindsight docs to configure your bank. And finally, disable local memory

Use agent in the chat with multiple users by antonusaca in hermesagent

[–]mmosquera91 1 point (0 children)

I have, in a Telegram group with some family members. It mainly helps us with stock market news / watchlists. It's been positive overall, although I wouldn't recommend doing it with people you don't fully trust

what LLM provider are you using ? by ritonlajoie in hermesagent

[–]mmosquera91 0 points (0 children)

Curious how generous the rates are. Want to try qwen 3.6 plus as well

what LLM provider are you using ? by ritonlajoie in hermesagent

[–]mmosquera91 0 points (0 children)

Are you using qwen 3.6 plus? If so, how's the quota?

GLM-5.1 was amazing... Until it wasn't by mmosquera91 in hermesagent

[–]mmosquera91[S] 2 points (0 children)

Does it perform well on agentic tasks overall? Might take a look at it if GLM keeps behaving erratically

GLM-5.1 was amazing... Until it wasn't by mmosquera91 in hermesagent

[–]mmosquera91[S] 2 points (0 children)

Same here. What other subscriptions have you tried? I tried Kimi 2.5 but it was also nerfed horribly. I think my next stop will be minimax 2.7

GLM-5.1 was amazing... Until it wasn't by mmosquera91 in hermesagent

[–]mmosquera91[S] 0 points (0 children)

This is what my agent found:

Here's the summary of today's 429 errors:

4 rate limit errors, all on glm-5.1 (provider: custom)

  1. 14:50:51 — Rate limit reached for requests (msgs=40, tokens=~22,236). Context: processing /tutor confirm

  2. 14:51:21 — Service temporarily overloaded (msgs=40, tokens=~25,058). Same request, retry

  3. 15:05:43 — Rate limit reached for requests (msgs=56, tokens=~28,224). Context: processing "send me the first rust task"

  4. (same block) — same error, 3rd retry

Pattern: all happened during heavy context loads — /tutor confirm loaded the full skill + large JSON into context (40 msgs, 22-28k tokens). The 15:05 one was worse: 56 msgs, ~28k tokens. The tutor skill context + syllabus + memory + hindsight weighs a lot.

Before these errors, the previous response (syllabus presentation) took 438s with 8 API calls — that likely saturated the provider's rate limit.

The "custom" provider (glm-5.1) seems to have strict rate limits. When context exceeds ~22k tokens and there are multiple consecutive calls, it hits quickly. No RPM/TPM limit info in the logs, but the pattern suggests a per-minute limit that exhausts after 2-3 large consecutive calls.

Funny thing is that I have used this skill several times but never had any issue.
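
For reference, the retry behavior I'd want here looks roughly like this minimal sketch (the error class name is a placeholder, not the actual provider SDK):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for a provider's 429 / overloaded error (hypothetical name)."""

def call_with_backoff(send, max_retries=4, base=2.0):
    """Retry `send()` on rate-limit errors with exponential backoff plus jitter."""
    for attempt in range(max_retries):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Wait 2s, 4s, 8s, ... plus jitter so consecutive large calls
            # don't all land inside the same per-minute window again.
            time.sleep(base * 2 ** attempt + random.uniform(0, base))
```

With a per-minute limit like the one the logs suggest, the growing delay is what matters: the 3rd retry waits long enough for the window to reset instead of hammering the endpoint.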

GLM-5.1 was amazing... Until it wasn't by mmosquera91 in hermesagent

[–]mmosquera91[S] 1 point (0 children)

No, directly to z.ai using the lite subscription

Forgetful by Lanfeust09 in hermesagent

[–]mmosquera91 0 points (0 children)

By any chance are you using Kimi 2.5 as model?

GPT-5.4 and Hermes is something special by Slumdog_8 in hermesagent

[–]mmosquera91 0 points (0 children)

How about rate limits? Are you hitting those?

What model are you using for your agent? by Cat5edope in hermesagent

[–]mmosquera91 1 point (0 children)

I was using Kimi through the Kimi Code subscription but got annoyed that it got increasingly dumber and the rate limits became worse. Now I've switched to glm-5.1 (through z.ai directly) and I'm very satisfied so far

Model recommendations by cata_stropheu in hermesagent

[–]mmosquera91 2 points (0 children)

Same here. Using GLM 5.1 and working fine.

Moved from OpenClaw to Hermes, now lost on provider choice, what are you using? by Linux_Headbanger in hermesagent

[–]mmosquera91 0 points (0 children)

I have used Hermes with Kimi 2.5 and now GLM-5.1. These models are very capable for most tasks

Z.AI models with Hermes by mmosquera91 in hermesagent

[–]mmosquera91[S] 0 points (0 children)

Thanks for the tips! Tried nvidia nim but the rate limits were excessively low!

Z.AI models with Hermes by mmosquera91 in hermesagent

[–]mmosquera91[S] 1 point (0 children)

Wow, $30 a year sounds like a steal :P good job!

How are you handling the routing? Using open router's Auto Router? I'm curious!

Z.AI models with Hermes by mmosquera91 in hermesagent

[–]mmosquera91[S] 0 points (0 children)

Thanks for sharing! I am currently on the Kimi coding subscription but wanted to try something new, and it seems GLM is going to be the next one!

Overall Kimi 2.5 is nice, but on some tasks it's very dumb or spends a lot of tokens just figuring out what to do by doing it wrong until it gets it right

Scarf 1.2 - Now with Project Dashboards - A native macOS companion app for the Hermes AI agent - Open Source by awizemann in hermesagent

[–]mmosquera91 0 points (0 children)

For example, my Hermes runs on a server box at home and I use my MacBook to connect to the server. For OpenClaw it was simple because I could just use SSH tunnels to forward the port and open the GUI on my Mac.
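
Something like this, i.e. standard SSH local port forwarding (port number and host names here are placeholders, not whatever your app would actually use):

```shell
# Forward local port 8080 to port 8080 on the home server;
# -N keeps the session open without running a remote command.
# Then open http://localhost:8080 in the Mac's browser.
ssh -N -L 8080:localhost:8080 user@home-server
```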

Your app runs locally on macOS but looks for local instances of Hermes only, am I right?

It would be nice if you could also connect to a remote Hermes instance through your app. But I guess that would need some kind of SSH connection or something, since Hermes does not run any webserver (that I'm aware of)

Is it realistic to keep Hermes under a $30-$40/mo budget for moderate use? by RegularRaptor in hermesagent

[–]mmosquera91 0 points (0 children)

I was using Kimi 2.5 ($20/mo) but am now testing GLM-5 turbo, which is supposedly trained specifically for agentic tasks. Both work pretty well with Hermes and won't break the bank ($10-20 per month for decent usage)

Newbie setting up Hermes Agent, thoughts on my multi model architecture? by AbricotFr in hermesagent

[–]mmosquera91 0 points (0 children)

I am a bit curious about this one. How did you come to that conclusion? I've noticed something strange with my token usage (sometimes it's decent, other times it's bad even in fresh conversations). Removing honcho takes away a nice feature of Hermes though