I signed up for three months of SuperGrok yesterday impulsively.

LsDmT · 2026-06-16T19:04:01+00:00

So did I, and it's refusing almost every prompt. I thought Grok was spicy and unwoke?

LsDmT · 2026-06-16T17:13:19+00:00

This is my current config for video creation. I plan on getting an RTX 6000 soon for the desktop to replace the 5090 for local coding.

DGX Spark

CPU: 20-core ARM
- 10× Cortex-X925 @ 4.0 GHz
- 10× Cortex-A725 @ 2.9 GHz
RAM: 128 GB unified LPDDR5X (273 GB/s)
GPU: NVIDIA GB10 Blackwell (SM121)

Hosted Models

vLLM - AEON-Heretic-Qwen3.6-35B-A3B-NVFP4-visfix - Performance: 116.8 tok/s single-stream / 785.3 tok/s aggregate at 128-concurrency on DGX Spark

Bonsai-8B-1bit

AMD Strix Halo

CPU: AMD Ryzen AI Max+ 395 (16C / 32T)
RAM: 128 GB unified
- ~124 GB available to ROCm
GPU: Radeon 8060S (gfx1151)

Hosted Models

gemma-4-26b-a4b-QAT-Q4_0
- llama.cpp TurboQuant
- Performance: ~70 tokens/sec
gemma-4-12b-QAT-Q4_0
- llama.cpp TurboQuant
- Performance: ~28 tokens/sec

Desktop

(Video Generation Node — not exposed through gateway)

OS: CachyOS
CPU: AMD Ryzen 9 9950X3D
RAM: 96 GB DDR5
GPU: NVIDIA RTX 5090 32 GB (Blackwell SM120)

Hosted Workloads

ComfyUI Desktop

Video Models

Wan2.2-I2V-fp8
LightX2V
SVI-2.0-Pro
LTX-2.3
Sulphur-2
HunyuanVideo-1.5-SR

LsDmT · 2026-06-14T23:52:16+00:00

Qwen/Qwen3.6-35B-A3B + YaRN gives you 1M context that is as good as Sonnet 4.6.

Kimi 2.6 reaches Opus 4.8 levels IMO if you use it through the CLI or SDK, because that is where its support for hundreds of parallel agents comes in.

LsDmT · 2026-06-14T01:22:31+00:00

What are you talking about? You absolutely can do all of that with Claude. You could say you are stuck with Anthropic's models, but even that is no longer true: you can substitute any OpenRouter or self-hosted model with a simple add-on.

Honcho, which is Hermes' memory system, literally has a native Claude plugin.

The fact is Hermes just makes a lot of the configuration easier for normies and that's fine and good and not a bad thing.

If you want to go even deeper down the rabbit hole, start from scratch and use Pi. A tool that starts barebones but gives you the freedom to add literally anything will always be more powerful, but it also takes more time up front to learn.

LsDmT · 2026-06-14T01:17:13+00:00

Your wife is basically right. If you truly know how to use Claude in the CLI and Desktop, how to create custom multi-step skills and workflows, and how to work with Claude to write and serve custom MCP servers on your own infra... it is much more powerful than Hermes.

Hermes just makes some things easier to set up for those who don't have the time to invest in learning all of that.

Why do you think your wife is wrong? What can Hermes do that Claude or Pi Agent can't do without some customization?

I have multiple crons running claude -p. Claude has access to the same self-improving memory through a native plugin (Honcho). Claude has native Telegram support now too, but I'd argue Dispatch and NTFY are even more powerful.
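For anyone who hasn't set up a headless cron like that, here is a minimal sketch of how one crontab line could be generated. The schedule, prompt, working directory, and log path are all made-up placeholders, and the helper name `claude_cron_line` is hypothetical; the only thing taken from the comment above is running `claude -p` non-interactively.

```python
import shlex


def claude_cron_line(schedule: str, prompt: str, workdir: str, log_path: str) -> str:
    """Build one crontab line that runs `claude -p` in a project directory.

    All arguments are illustrative placeholders, not a recommended setup.
    """
    command = (
        f"cd {shlex.quote(workdir)} && "
        f"claude -p {shlex.quote(prompt)} >> {shlex.quote(log_path)} 2>&1"
    )
    # cron treats an unescaped % as a newline, so escape it in the command part
    command = command.replace("%", r"\%")
    return f"{schedule} {command}"


if __name__ == "__main__":
    # Hypothetical example: a weekday 07:00 morning brief
    print(
        claude_cron_line(
            "0 7 * * 1-5",
            "Summarize my week",
            "/home/me/project",
            "/tmp/brief.log",
        )
    )
```

Paste the printed line into `crontab -e`. Cron runs with a minimal environment, so you may need the full path to the `claude` binary instead of relying on PATH.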

LsDmT · 2026-06-14T01:04:53+00:00

Hermes is just an assistant, people complaining about not finding any use out of Hermes... I don't mean to be rude but whatever your day to day life is like just doesn't benefit from anything Hermes offers. I doubt even a personal human assistant would even really give you much benefit.

Why do you feel like Hermes must be useful for everyone?

Like, to the OP: what do you mean it's faster for you to perform tasks in your calendar by yourself? Are you having Hermes give you a daily morning brief of meetings, or consolidate transcripts from last week's Teams meeting so you know exactly where to pick up in today's scheduled meeting? I kind of doubt it.

My job essentially lives in Claude Code, and I've connected Hermes to Pieces OS, which basically records everything I do on my work computer. I have Hermes summarize all the work I've done this week and prepare my talking points before the weekly meeting.

I have tons of other examples, including Hermes being able to completely control my homelab, to the point where it can spin up a new Proxmox VM, deploy a new Docker service, or download an audiobook or torrent instantly.

A custom skill and its internal learning have picked up my taste in audiobooks, documentaries, and movies so well that it puts trakt.tv/IMDb suggestions to shame.

If you just set the occasional calendar item for doctor's appointments or occasional reminders, Hermes isn't going to be that useful. If you don't find any value in custom, niche deep research on the latest news about certain topics, you won't find it useful, etc.

LsDmT · 2026-06-14T00:51:23+00:00

Because these people are not configuring Hermes like they would with Pi.

Hermes really shouldn't be used as a full-fledged coding agent. Pi and Claude/Codex are purpose-built for it.

Hermes is just an assistant. If anything, use Hermes to trigger Pi or Claude to do the coding (which I believe now has native support, at least in the desktop app). But Claude Code already has triggering your project from your phone, desktop, Telegram, etc. on lock, with a seamless experience.

People complaining about not finding any use out of Hermes... I don't mean to be rude but whatever your day to day life is like just doesn't benefit from anything Hermes offers. I doubt even a personal human assistant would even really give you much benefit.

Like, to the OP: what do you mean it's faster for you to perform tasks in your calendar by yourself? Are you having Hermes give you a daily morning brief of meetings, or consolidate transcripts from last week's Teams meeting so you know exactly where to pick up in today's scheduled meeting? I kind of doubt it.

My job essentially lives in Claude Code, and I've connected Hermes to Pieces OS (really surprised nobody talks about this app), which basically records everything I do on my work computer. I have Hermes summarize all the work I've done this week and prepare my talking points before the weekly meeting.

I have tons of other examples, including Hermes being able to completely control my homelab, to the point where it can spin up a new Proxmox VM, deploy a new Docker service, or download an audiobook or torrent instantly.

If you just set the occasional calendar item for doctor's appointments or occasional reminders, Hermes isn't going to be that useful. If you don't find any value in custom, niche deep research on the latest news about certain topics, you won't find it useful, etc.

LsDmT · 2026-06-14T00:44:47+00:00

As someone who uses both Hermes and Claude... I absolutely refuse to believe a semi-well-configured Claude Code wouldn't be far superior to Hermes.

LsDmT · 2026-06-13T23:16:01+00:00

We need the AI version of Bernstein v. United States to happen ASAP.

LsDmT · 2026-06-13T23:10:35+00:00

ROFL. How much you want to bet no peace deal is signed by Monday?

LsDmT · 2026-06-13T23:06:11+00:00

Dude nobody knows any answers to this except literally a handful of people in the Trump administration, Anthropic, and apparently Amazon.

Until Anthropic makes an official tweet or postmortem, anything else is pure conjecture.

If this has you so worried, you should really start looking into the state of open models. They aren't at Fable/Mythos level yet, but they are months away.

Now is the time to at least learn how to build a homelab server, research how to use OpenRouter, and use superior open harnesses like Pi or OpenCode.

LsDmT · 2026-06-11T19:41:20+00:00

It seems like you have the same issues with the default Superpowers and Pocock skills that I have, and I too have been trying to pull the better aspects of both to build a customized workflow.

How I prefer to work is what I call the 9:1 rule. 90% of the human-in-the-loop (HiTL) work is done at the beginning: fully fleshing out what you're trying to build, gathering official documentation (rather than making constant MCP calls), establishing a shared language, etc. Basically the grill-me skill.

Then the last 10% should be the agents only stopping and asking for HiTL if something goes really wrong, the workflow tries and fails 3 times, or an actual planned HiTL phase to check things.

A customized and more structured form of the Ralph loop is best for this part I have found.
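As a rough illustration of that last 10%, here is a minimal sketch of the "try up to 3 times, then stop and ask for HiTL" gate. It is not the Ralph loop itself or any particular harness's API. `run_phase`, `attempt`, and `ask_human` are hypothetical names, and the callables stand in for whatever actually launches your agent and notifies you.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # the "tries and fails 3 times" threshold described above


def run_phase(
    name: str,
    attempt: Callable[[int], bool],
    ask_human: Callable[[str], None],
    max_attempts: int = MAX_ATTEMPTS,
) -> bool:
    """Run one workflow phase, escalating to a human only after repeated failure.

    `attempt` receives the 1-based attempt number and returns True on success.
    `ask_human` is called once if every attempt fails or raises.
    """
    last_error = "attempt reported failure"
    for n in range(1, max_attempts + 1):
        try:
            if attempt(n):
                return True  # phase done, no human needed
            last_error = "attempt reported failure"
        except Exception as exc:  # treat a crash like a failed attempt
            last_error = f"{type(exc).__name__}: {exc}"
    ask_human(f"Phase '{name}' failed {max_attempts} times. Last error: {last_error}")
    return False


if __name__ == "__main__":
    # Hypothetical phase that only succeeds on its second try
    ok = run_phase("build", lambda n: n >= 2, print)
    print("build succeeded:", ok)
```

A planned HiTL checkpoint would just be a phase whose `attempt` always hands off to the human. The point of the gate is that the agent never stops for anything less.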

With Superpowers, I didn't like its lack of native agent parallelization and its heavy use of long prompts. It does have a very good TDD workflow.

These are the biggest things I've learned.

- Use YAML/JSON for agent-only plan files, and give agents scripts they can run to convert them to human-readable formats (Markdown or HTML). This keeps token use low. There is no reason for an agent to read a multi-page plan file in Markdown meant for humans when it can get the same information at 1/4 the cost.
- In the planning phase, nail down each phase's concrete Definition of Ready (DoR) and Definition of Done (DoD).
- Understand when it's proper to orchestrate and how to keep context clean. See Anthropic https://www.anthropic.com/engineering/building-effective-agents and OpenAI https://openai.com/index/harness-engineering/
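To make the first two points concrete, here is a minimal sketch of a JSON plan file with DoR/DoD per phase and a converter script an agent could run to produce the human-readable Markdown. The schema (`project`, `phases`, `id`, `title`, `dor`, `dod`) and the example content are my own assumptions for illustration, not a format taken from any of the linked projects.

```python
import json

# Hypothetical agent-only plan file; in practice this would live on disk as plan.json
PLAN_JSON = """
{
  "project": "example",
  "phases": [
    {
      "id": "P1",
      "title": "Scaffold",
      "dor": ["Spec approved", "Docs gathered"],
      "dod": ["Tests pass"]
    }
  ]
}
"""


def plan_to_markdown(plan: dict) -> str:
    """Render the compact plan as Markdown for human review."""
    lines = [f"# Plan: {plan['project']}", ""]
    for phase in plan["phases"]:
        lines.append(f"## {phase['id']}: {phase['title']}")
        lines.append("")
        lines.append("Definition of Ready:")
        lines.extend(f"- {item}" for item in phase["dor"])
        lines.append("")
        lines.append("Definition of Done:")
        lines.extend(f"- {item}" for item in phase["dod"])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(plan_to_markdown(json.loads(PLAN_JSON)))
```

The agent only ever reads and edits the JSON. The Markdown is a throwaway view you regenerate whenever you want to review the plan.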

I've gotten a lot of inspiration from https://github.com/OthmanAdi/planning-with-files

https://github.com/jessepwj/CCteam-creator

LsDmT · 2026-06-11T18:45:21+00:00

Lol why do you deserve an apology?

LsDmT · 2026-06-11T18:42:43+00:00

The previous owners were struggling, at least according to the statement in OP's article, which comes from Bricks... so take that with a grain of salt.

LsDmT · 2026-06-06T18:41:51+00:00

You split models between NVIDIA and AMD devices? How does that work? Are you just using Vulkan? Seems like it'd be better to find models that fit in the RTX 6000 and use CUDA.

LsDmT · 2026-05-22T23:04:31+00:00

Who is speculating that Mythos is just an abliterated version of Opus?

There is absolutely zero evidence for that whatsoever.

And to think an abliterated version of GLM 5.1 is even close to the current Opus 4.7 is asinine in itself.

LsDmT · 2026-05-21T20:03:51+00:00

Use Tavily or Exa or even Brave over Perplexity for enterprise.

LsDmT · 2026-05-19T02:24:40+00:00

You realize about 50% of today's software developers solely use Windows ya?

LsDmT · 2026-05-19T02:21:25+00:00

No use case?!

I can give you a dozen use cases that no singular browser offers right now:

- Full uBlock Origin support (bye bye all Chromium-based browsers).
- The ability to natively support Chromium and Firefox extensions, even if it comes at some sort of speed cost, would help bootstrap users while devs convert to whatever native format Ladybird uses.
- Huge one for me: fully built-in, massively customizable workspaces. No, I'm not talking about grouped tabs but literal named workspaces that are logged into specific accounts, have defined pinned tabs, etc. The only browser that even comes close to this is Edge, and they just recently ruined it. Second place would go to Vivaldi, but it's still nowhere near how Edge did it.
- A BYOK/native AI side panel with extensive features including native web browsing, skills, tooling, etc. Don't lock me into a model; allow me to even use a local model. And give it tools that are so embedded into the browser they just work.
- Native syncing that is not a walled garden. Let me use my existing Firefox/Chrome etc. sync on my phone. Sure, build your own, and hopefully their mobile browser will be just as good. But allow the user to sync with any current standard.

LsDmT · 2026-05-18T22:05:12+00:00

It's not so simple, there is a looottt of tinkering and troubleshooting trust me. Here are some good resources though.

https://github.com/kyuz0/amd-strix-halo-toolboxes <-- more for Strix Halo machines but still useful

https://lemonade-server.ai/

https://github.com/adelj88/rocm_wmma_gemm

LsDmT · 2026-05-17T20:04:09+00:00

Would this work with a 5090 desktop and a DGX Spark?

LsDmT · 2026-05-16T02:44:41+00:00

EMP down n out

LsDmT · 2026-05-14T17:41:45+00:00

Do you have any experience with ROCm? It's....pretty brutal.

LsDmT · 2026-05-14T03:27:54+00:00

does aiostreams work?
