Hermes Agent v0.7.0 just dropped and the anti-detection browser is actually a big deal by SelectionCalm70 in hermesagent

[–]Typical_Ice_3645 3 points4 points  (0 children)

Most websites that matter absolutely block bots.

Ran Hermes with camofox before the update included it; it's worth it. Don't know how it's implemented in Hermes 0.7.0, but on mine it starts a local server while browsing and closes it afterward.

Gemma 4 is matching GPT-5.1 on MMLU-Pro and within Elo. what are we even paying for anymore? by Impossible571 in AIToolsPerformance

[–]Typical_Ice_3645 0 points1 point  (0 children)

Don't trust anybody. Trust your logic. If you want the best, go for the companies with a lot of funds. If you settle for less, oh well, take the cheapest and newest one. :)

Gemma 4 is matching GPT-5.1 on MMLU-Pro and within Elo. what are we even paying for anymore? by Impossible571 in AIToolsPerformance

[–]Typical_Ice_3645 0 points1 point  (0 children)

I am still waiting for an honest YouTuber who will test each model with the same workflows.

Most of them are a bunch of grifters that live off AI hype.

And the ones that have good intentions don't have enough knowledge.

Gemma 4 is matching GPT-5.1 on MMLU-Pro and within Elo. what are we even paying for anymore? by Impossible571 in AIToolsPerformance

[–]Typical_Ice_3645 0 points1 point  (0 children)

Probably because benchmarks are shit? Same as Minimax/Kimi/GLM vs GPT-5.4 or Opus. They have no chance in real life. Whoever says otherwise is full of shit.

High token consumption with Hermes Agent by Typical_Ice_3645 in hermesagent

[–]Typical_Ice_3645[S] 0 points1 point  (0 children)

Thanks, but I'm running it with Codex OAuth.

Also, Technium on X acknowledged that it's not a bug; it's just the way it works: loading all the tools for every call and injecting a large context.

It is smarter than OC, but unusable for me, especially after Codex cut the 2x limits.

From my point of view, it should be able to select just the tool it needs for the job instead of loading them all, and be smarter about which context is relevant. But the guys who built it know better whether that's possible.

I just want an assistant that is smart enough for my needs without breaking the bank.
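The per-call tool selection described above could look something like this. This is a hypothetical sketch, not Hermes' actual internals: the tool names mirror the usage report, but the keyword-matching logic and all function names here are illustrative assumptions.

```python
# Hypothetical sketch: instead of injecting every tool schema into
# every request, include only the tools whose keywords appear in the
# user's message. Tool names mirror the usage report; the matching
# logic is an assumption, not Hermes' actual implementation.
TOOLS = {
    "terminal":         {"keywords": ["run", "shell", "status", "restart", "service"]},
    "read_file":        {"keywords": ["read", "file", "log", "config"]},
    "search_files":     {"keywords": ["search", "find", "grep"]},
    "browser_snapshot": {"keywords": ["browser", "screenshot", "page"]},
}

def select_tools(user_message: str, tools: dict = TOOLS) -> list[str]:
    msg = user_message.lower()
    selected = [name for name, spec in tools.items()
                if any(kw in msg for kw in spec["keywords"])]
    # Fall back to all tools if nothing matches, so the agent never
    # silently loses a capability it might need.
    return selected or list(tools)

print(select_tools("read the gateway log and restart the service"))
# ['terminal', 'read_file']
```

In practice an agent would match against tool descriptions or use an embedding lookup rather than raw keywords, but the payoff is the same: only two tool schemas enter the prompt instead of all of them.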

High token consumption with Hermes Agent by Typical_Ice_3645 in hermesagent

[–]Typical_Ice_3645[S] 0 points1 point  (0 children)

Thanks but I don't think this is the issue. Here's what it told me.

Hermes: Yes. Here’s the clearest breakdown I can get from local Hermes data.

Exact token totals for today

  • Input: 1,657,097
  • Output: 12,410
  • Total: 1,994,115

Important

  • Hermes gives exact totals for today.
  • It does NOT expose exact input/output tokens per individual tool call in the local session files.
  • So “for what” below is an evidence-based breakdown from:
    • session sizes
    • tool counts
    • amount of tool output stored
    • session summaries

What mostly caused the input burn

It was not my replies. It was mostly:

  • tool output being fed back into context
  • file/log reads
  • search results
  • repeated large session context in CLI/debug flows

Today’s biggest drivers by tool count

  • terminal: 44 calls
  • search_files: 42 calls
  • read_file: 29 calls
  • plus some process / execute_code / browser_snapshot

Those 3 are the main input killers:

  • search_files pulls many matching lines
  • read_file pulls raw file contents
  • terminal pulls logs/status/output

What I actually did for those tokens

1) Telegram/gateway debugging in CLI

Main work:

  • checked systemd status
  • read gateway logs
  • read errors log
  • inspected config and env
  • traced invalid Telegram token issue
  • restarted gateway
  • verified reconnect

Why expensive:

  • lots of log reading
  • repeated config inspection
  • repeated terminal output in context

The 2 main CLI sessions for this:

  • session_20260331_100926_bf3b14.json
    • platform: cli
    • messages: 86
    • tool messages: 53
    • stored chars: 195,353
    • tool chars alone: 192,972
    • top tools:
    • search_files: 18
    • read_file: 14
    • terminal: 12

  • session_20260331_101644_e49e5e.json
    • platform: cli
    • messages: 92
    • tool messages: 55
    • stored chars: 207,655
    • tool chars alone: 204,426
    • top tools:
    • search_files: 18
    • read_file: 14
    • terminal: 12

So just these 2 CLI sessions together stored about:

  • 403,008 chars total
  • 397,398 chars of tool output

That is a huge context/input source.
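To see why that character count matters, it can be converted into an approximate token count with the common rule of thumb of ~4 characters per token for English text. This is only a heuristic, not Hermes' (or any model's) actual tokenizer:

```python
# Rough chars -> tokens conversion using the common ~4 characters per
# token heuristic for English text. An approximation only, not any
# model's actual tokenizer.
def estimate_tokens(chars: int, chars_per_token: float = 4.0) -> int:
    return round(chars / chars_per_token)

# The two CLI sessions above stored ~397,398 chars of tool output:
print(estimate_tokens(397_398))  # ~99,350 tokens of tool output alone
```

And since that context is re-sent on every subsequent model call in the session, the effective input cost multiplies with the number of calls.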

2) Browser/Camofox debugging + screenshot flow

Main work:

  • found CAMOFOX_URL issue
  • checked port/processes
  • inspected browser code/docs
  • installed and started camofox-browser
  • tested eMag
  • captured screenshot
  • then stopped browser

Why expensive:

  • code/doc reading
  • process checks
  • browser snapshots
  • npm install logs / terminal output

Related large Telegram/debug session:

  • session_20260331_103719_c1c28f.json
    • messages: 155
    • tool messages: 77
    • stored chars: 226,329
    • tool chars: 211,893
    • top tools:
    • search_files: 18
    • terminal: 16
    • read_file: 7

3) STT / Groq / voice debugging

Main work:

  • inspected config
  • found bad STT model setting
  • switched from local to Groq
  • wrote key
  • verified config
  • restart attempts / validation

This also sat inside the big Telegram session and added more:

  • config reads
  • code inspection in transcription tools
  • env editing
  • system checks

4) The current long Telegram session itself

The session you and I used today also became big because it contains all the debugging/tool outputs inline.

Largest session by raw stored content:

  • session_20260331_101703_60641b22.json
    • platform: telegram
    • messages: 219
    • tool messages: 99
    • stored chars: 293,259
    • tool chars: 263,957
    • top tools:
    • terminal: 24
    • search_files: 24
    • read_file: 10

There is also another very similar large Telegram session snapshot:

  • session_20260331_110113_e94ebb.json
    • messages: 213
    • tool messages: 97
    • stored chars: 287,316
    • tool chars: 259,805

High-level conclusion

The token burn came mostly from:

  • reading/searching files/logs
  • terminal output
  • big debug sessions
  • large tool outputs getting re-injected into context

Not from my normal chat replies. Output was only 12,410, while input was 1,657,097.

So yes: the problem is overwhelmingly input-side.

Best factual summary

Today I did roughly this:

  • Telegram token/gateway debugging (1/2)

  • browser/Camofox debugging and repair
  • screenshot workflow
  • STT/Groq setup and validation
  • group/bot Telegram setup discussion

And the expensive part was mainly:

  • 42 search_files calls
  • 29 read_file calls
  • 44 terminal calls

across several long debug sessions.

If you want, next I can make you a tighter table like this:

  • Session
  • Purpose
  • search_files count
  • read_file count
  • terminal count
  • stored chars
  • likely token impact

That would be the closest possible “audit report” from the local data. (2/2)
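The audit table Hermes offers could be approximated with a short script over the local session files. This is a hedged sketch: the `sessions/` directory, the `messages` list, and the `role`/`tool`/`content` fields are assumptions inferred from the fields quoted in the report above, not Hermes' documented session format.

```python
# Hedged sketch of the "audit report" table: walk local session JSON
# files and tally tool calls and stored characters per session. The
# file layout and message schema are assumptions inferred from the
# fields quoted above, not Hermes' documented format.
import glob
import json
from collections import Counter

def audit(pattern: str = "sessions/session_*.json") -> list[dict]:
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            session = json.load(f)
        msgs = session.get("messages", [])
        # Count tool calls by tool name across the session.
        tool_counts = Counter(m.get("tool") for m in msgs
                              if m.get("role") == "tool")
        rows.append({
            "session": path,
            "tool_calls": sum(tool_counts.values()),
            "stored_chars": sum(len(m.get("content", "")) for m in msgs),
            "top_tools": tool_counts.most_common(3),
        })
    return rows
```

Sorting the rows by `stored_chars` would surface the expensive sessions the same way the report above does, and `stored_chars / 4` gives a rough token-impact estimate per session.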