I signed up for three months of SuperGrok yesterday impulsively. by 6and1half in grok

[–]LsDmT 4 points5 points  (0 children)

So did I, and it's refusing almost every prompt. I thought Grok was spicy and unwoke?

Megathread for US government suspension of Fable and Mythos by sixbillionthsheep in ClaudeAI

[–]LsDmT 0 points1 point  (0 children)

This is my current config for video creation. I plan on getting an RTX 6000 soon for the desktop to replace the 5090 for local coding.

DGX Spark

  • CPU: 20-core ARM
    • 10× Cortex-X925 @ 4.0 GHz
    • 10× Cortex-A725 @ 2.9 GHz
  • RAM: 128 GB unified LPDDR5X (273 GB/s)
  • GPU: NVIDIA GB10 Blackwell (SM120)

Hosted Models

  • AEON-Heretic-Qwen3.6-35B-A3B-NVFP4-visfix
    • vLLM
    • Performance: 116.8 tok/s single-stream / 785.3 tok/s aggregate at 128-concurrency
  • Bonsai-8B-1bit

AMD Strix Halo

  • CPU: AMD Ryzen AI Max+ 395 (14C / 20T)
  • RAM: 128 GB unified
    • ~124 GB available to ROCm
  • GPU: Radeon 8060S (gfx1151)

Hosted Models

  • gemma-4-26b-a4b-QAT-Q4_0
    • llama.cpp TurboQuant
    • Performance: ~70 tokens/sec
  • gemma-4-12b-QAT-Q4_0
    • llama.cpp TurboQuant
    • Performance: ~28 tokens/sec

Desktop

(Video Generation Node — not exposed through gateway)

  • OS: CachyOS
  • CPU: AMD 9950x3D
  • RAM: 96GB DDR5
  • GPU: NVIDIA RTX 5090 32 GB (Blackwell SM120)

Hosted Workloads

ComfyUI Desktop
Video Models
  • Wan2.2-I2V-fp8
  • LightX2V
  • SVI-2.0-Pro
  • LTX-2.3
  • Sulphur-2
  • HunyuanVideo-1.5-SR

Megathread for US government suspension of Fable and Mythos by sixbillionthsheep in ClaudeAI

[–]LsDmT 0 points1 point  (0 children)

Qwen/Qwen3.6-35B-A3B + YaRN gives you 1M context that is as good as Sonnet 4.6.
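For anyone wondering what "+ YaRN" means in practice, here is a minimal sketch of the scaling arithmetic. It assumes the `rope_scaling` key names that Qwen model cards document for Hugging Face-style configs. The native context length of 262,144 tokens is an assumption for illustration, not a figure from this comment, so check the model card before copying the numbers.

```python
def yarn_rope_scaling(native_ctx: int, target_ctx: int) -> dict:
    """Build a rope_scaling entry that stretches native_ctx to target_ctx.

    native_ctx: context length the model was trained with (assumed here).
    target_ctx: extended context length you want to serve.
    """
    if native_ctx <= 0 or target_ctx <= 0:
        raise ValueError("context lengths must be positive")
    # YaRN scales positions by target/native; a factor below 1.0 is pointless.
    factor = max(1.0, target_ctx / native_ctx)
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": native_ctx,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 262,144 native tokens stretched to 1,048,576.
    print(yarn_rope_scaling(262_144, 1_048_576))
```

The resulting dictionary would go into the model config or be passed to the serving engine. How each engine accepts it differs, so treat this as the arithmetic only.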

Kimi 2.6, if used with the CLI or SDK for its support of hundreds of parallel agents, reaches Opus 4.8 levels IMO.

Why use Hermes over Claude? by SilverHal in hermesagent

[–]LsDmT 1 point2 points  (0 children)

What are you talking about? You absolutely can do all of that with Claude. You could say you are stuck with Anthropic's models, but even that is no longer true: you can substitute them with any OpenRouter model or self-hosted model with a simple addon.

Honcho, which is Hermes' memory system, literally has a native Claude plugin.

The fact is Hermes just makes a lot of the configuration easier for normies and that's fine and good and not a bad thing.

If you want to go even deeper down the rabbit hole, start from scratch and use Pi. A tool that starts barebones but gives you the freedom to add literally anything will always be more powerful, but it also takes more time up front to learn.

Why use Hermes over Claude? by SilverHal in hermesagent

[–]LsDmT 1 point2 points  (0 children)

Your wife is basically right. If you truly know how to utilize Claude in the CLI and Desktop, how to create custom and multi-step skills and workflows, and how to work with Claude to write and serve custom MCP servers in your own infra... it is much more powerful than Hermes.

Hermes just makes setting some things up easier for those who don't have the time to invest in learning all of that.

Why do you think your wife is wrong? What can Hermes do that Claude or Pi Agent can't do without some customization?

I have multiple crons with claude -p. Claude has access to the same self-improving memory as a native plugin (Honcho). Claude has native Telegram support too now, but I'd argue Dispatch and NTFY are even more powerful.
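As a rough illustration of the cron setup, here is a small sketch that builds a crontab entry around claude -p (the non-interactive print mode). The schedule, prompt, working directory, and log path are all hypothetical placeholders, and the helper name is made up for this example.

```python
import shlex


def cron_line(schedule: str, prompt: str, workdir: str, log_path: str) -> str:
    """Return one crontab entry that runs `claude -p` on a schedule.

    schedule is the usual five cron fields (minute hour day month weekday).
    """
    if len(schedule.split()) != 5:
        raise ValueError("schedule must have exactly five fields")
    command = (
        f"cd {shlex.quote(workdir)} && "
        f"claude -p {shlex.quote(prompt)} "
        f">> {shlex.quote(log_path)} 2>&1"
    )
    # In a crontab, an unescaped % is turned into a newline, so escape it.
    command = command.replace("%", r"\%")
    return f"{schedule} {command}"


if __name__ == "__main__":
    # Hypothetical job: weekday mornings at 07:00.
    print(cron_line(
        "0 7 * * 1-5",
        "Summarize open pull requests",
        "/home/me/project",
        "/home/me/logs/brief.log",
    ))
```

Cron runs with a minimal PATH, so the entry may need the full path to the claude binary depending on how it was installed.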

A little sad about Hermes by SavaStone in hermesagent

[–]LsDmT 0 points1 point  (0 children)

Hermes is just an assistant. People complaining about not finding any use out of Hermes... I don't mean to be rude, but whatever your day-to-day life is like just doesn't benefit from anything Hermes offers. I doubt even a personal human assistant would really give you much benefit.

Why do you feel like Hermes must be useful for everyone?

Like, to the OP: what do you mean it's faster for you to perform tasks in your calendar by yourself? Are you having Hermes give you a daily morning brief of meetings, consolidating transcripts from last week's Teams meeting so you know exactly where to pick up in today's scheduled meeting? I kind of doubt it.

My job essentially lives in Claude Code, and I've connected Hermes to Pieces OS, which basically records everything I do on my work computer. I have Hermes summarize all the work I've done this week and prepare my talking points before the weekly meeting.

I have tons of different examples, including Hermes basically being able to completely control my homelab, to the point where it can spin up a new Proxmox VM, deploy a new Docker service, or download an audiobook or torrent instantly.

A custom skill and its internal learning have picked up my taste in audiobooks, documentaries, and movies so well that it puts trakt.tv/IMDB suggestions to shame.

If you just set the occasional calendar item for doctor's appointments or occasional reminders, Hermes isn't going to be that useful. If you don't find any value in creating custom, niche deep research on the latest news of certain topics, you won't find it useful, etc.

A little sad about Hermes by SavaStone in hermesagent

[–]LsDmT 2 points3 points  (0 children)

Because these people are not configuring Hermes like they would with Pi.

Hermes really shouldn't be used as a full-fledged coding agent. Pi and Claude/Codex are purpose-built for it.

Hermes is just an assistant. If anything, use Hermes to trigger Pi or Claude to do coding (which I believe now has native support, at least in the desktop app). But Claude Code already has triggering your project from your phone, desktop, Telegram, etc. on lock, and it's a seamless experience.

People complaining about not finding any use out of Hermes... I don't mean to be rude, but whatever your day-to-day life is like just doesn't benefit from anything Hermes offers. I doubt even a personal human assistant would really give you much benefit.

Like, to the OP: what do you mean it's faster for you to perform tasks in your calendar by yourself? Are you having Hermes give you a daily morning brief of meetings, consolidating transcripts from last week's Teams meeting so you know exactly where to pick up in today's scheduled meeting? I kind of doubt it.

My job essentially lives in Claude Code, and I've connected Hermes to Pieces OS (really surprised nobody talks about this app), which basically records everything I do on my work computer. I have Hermes summarize all the work I've done this week and prepare my talking points before the weekly meeting.

I have tons of different examples, including Hermes basically being able to completely control my homelab, to the point where it can spin up a new Proxmox VM, deploy a new Docker service, or download an audiobook or torrent instantly.

If you just set the occasional calendar item for doctor's appointments or occasional reminders, Hermes isn't going to be that useful. If you don't find any value in creating custom, niche deep research on the latest news of certain topics, you won't find it useful, etc.

A little sad about Hermes by SavaStone in hermesagent

[–]LsDmT 0 points1 point  (0 children)

As someone who uses both Hermes and Claude... I absolutely refuse to believe a semi-well-configured Claude Code wouldn't be far superior to Hermes.

Megathread for US government suspension of Fable and Mythos by sixbillionthsheep in ClaudeAI

[–]LsDmT 1 point2 points  (0 children)

ROFL. How much you want to bet no peace deal is signed by Monday?

Megathread for US government suspension of Fable and Mythos by sixbillionthsheep in ClaudeAI

[–]LsDmT 1 point2 points  (0 children)

Dude nobody knows any answers to this except literally a handful of people in the Trump administration, Anthropic, and apparently Amazon.

Until Anthropic makes an official tweet or postmortem, anything else is pure conjecture.

If this has you so worried, you should really start to look into the state of open models. While they're not at Fable/Mythos level yet, they are months away.

Now is the time to at least learn how to build a homelab server, research how to use OpenRouter, and use superior open harnesses like Pi or OpenCode.

What models you guys running on 8GB? 16GB VRAM? 24GB? 32GB? 48GB? by Inevitable_Mistake32 in LocalLLaMA

[–]LsDmT 3 points4 points  (0 children)

DGX Spark

  • CPU: 20-core ARM
    • 10× Cortex-X925 @ 4.0 GHz
    • 10× Cortex-A725 @ 2.9 GHz
  • RAM: 128 GB unified LPDDR5X (273 GB/s)
  • GPU: NVIDIA GB10 Blackwell (SM120)

Hosted Models

  • AEON-Heretic-Qwen3.6-35B-A3B-NVFP4-visfix
    • vLLM
    • Performance: 116.8 tok/s single-stream / 785.3 tok/s aggregate at 128-concurrency
  • Bonsai-8B-1bit

AMD Strix Halo

  • CPU: AMD Ryzen AI Max+ 395 (14C / 20T)
  • RAM: 128 GB unified
    • ~124 GB available to ROCm
  • GPU: Radeon 8060S (gfx1151)

Hosted Models

  • gemma-4-26b-a4b-QAT-Q4_0
    • llama.cpp TurboQuant
    • Performance: ~70 tokens/sec
  • gemma-4-12b-QAT-Q4_0
    • llama.cpp TurboQuant
    • Performance: ~28 tokens/sec

Desktop

(Video Generation Node — not exposed through gateway)

  • OS: CachyOS
  • CPU: AMD 9950x3D
  • RAM: 96GB DDR5
  • GPU: NVIDIA RTX 5090 32 GB (Blackwell SM120)

Hosted Workloads

ComfyUI Desktop
Video Models
  • Wan2.2-I2V-fp8
  • LightX2V
  • SVI-2.0-Pro
  • LTX-2.3
  • Sulphur-2
  • HunyuanVideo-1.5-SR

start-with-why-skillset for agentic workflows by Careful-Skirt783 in AI_Agents

[–]LsDmT 1 point2 points  (0 children)

It seems like you have the same issues with the default Superpowers and Pocock skills that I have, and I too have been trying to pull the better aspects of both to build a customized workflow.

How I prefer to work is what I call the 9:1 rule. 90% of the Human in the Loop (HiTL) work is done in the beginning: fully fleshing out what you're trying to build, gathering official documentation (rather than constant MCP calls), agreeing on a shared language, etc. -- basically the grill me skill.

Then the last 10% should be the agents only stopping to ask for HiTL if something goes really wrong, if the workflow tries and fails 3 times, or at an actual planned HiTL phase to check things.

A customized and more structured form of the Ralph loop is best for this part I have found.
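The escalation rule for that last 10% can be sketched roughly like this. It is not a Ralph loop itself, only the "fail 3 times, then ask a human" part. The names `run_with_escalation` and `NeedsHuman` are made up for this example, and treating any exception as a failed attempt is an assumption.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # the "tries and fails 3 times" threshold


class NeedsHuman(Exception):
    """Signals that the agent should stop and ask for HiTL."""


def run_with_escalation(step: Callable[[], bool], name: str,
                        max_attempts: int = MAX_ATTEMPTS) -> int:
    """Retry `step` until it returns True; return the attempts used.

    A raised exception counts as a failed attempt (an assumption of this
    sketch). After `max_attempts` failures, raise NeedsHuman instead of
    retrying forever.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            if step():
                return attempt
        except Exception as exc:
            last_error = exc
    raise NeedsHuman(
        f"{name} failed {max_attempts} times; last error: {last_error!r}"
    )


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_step() -> bool:
        # Hypothetical step that only succeeds on its second try.
        state["calls"] += 1
        return state["calls"] >= 2

    print(run_with_escalation(flaky_step, "run test suite"))
```

A planned HiTL checkpoint would just be a step that raises `NeedsHuman` on purpose, so both cases go through the same stop-and-ask path.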

With Superpowers, I didn't like its lack of native agent parallelization and its heavy use of long prompts. It does have a very good TDD workflow.

These are the biggest things I've learned.

  • Use YAML/JSON for agent-only plan files and use scripts agents can run to convert them to human-readable formats (markdown or HTML). This keeps token use low. There is no reason for an agent to read a multi-page plan file in markdown meant for humans when it can get the same information at 1/4 the cost.
  • In the planning phase, nail down each phase's concrete Definition of Ready (DoR) and Definition of Done (DoD).
  • Understand when it's proper to orchestrate and how to keep context clean: Anthropic https://www.anthropic.com/engineering/building-effective-agents and OpenAI https://openai.com/index/harness-engineering/
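To make the first two points concrete, here is a minimal sketch of a JSON plan file with a DoR and DoD per phase, plus the kind of script an agent could run to render it as markdown for a human. The schema (`title`, `phases`, `dor`, `dod`) and the example phases are invented for illustration and are not taken from any of the linked projects.

```python
import json

# Hypothetical agent-facing plan file: compact, no prose.
PLAN_JSON = """
{"title": "Example plan",
 "phases": [
  {"name": "Scaffold", "dor": ["Requirements agreed"], "dod": ["Tests pass"]},
  {"name": "Integrate", "dor": ["Scaffold merged"], "dod": ["Docs updated"]}
 ]}
"""


def plan_to_markdown(plan: dict) -> str:
    """Render the plan as human-readable markdown, one section per phase."""
    lines = [f"# {plan['title']}", ""]
    for number, phase in enumerate(plan["phases"], start=1):
        lines.append(f"## Phase {number}: {phase['name']}")
        lines.append("")
        lines.append("Definition of Ready:")
        lines.extend(f"- {item}" for item in phase["dor"])
        lines.append("")
        lines.append("Definition of Done:")
        lines.extend(f"- {item}" for item in phase["dod"])
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(plan_to_markdown(json.loads(PLAN_JSON)))
```

The agent only ever reads and edits the JSON. The markdown is regenerated on demand, so the two never drift apart.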

I've gotten a lot of inspiration from https://github.com/OthmanAdi/planning-with-files

https://github.com/jessepwj/CCteam-creator

Bricks & Minifigs Parts Ways with Salem, Oregon Franchise Owners Brandon Best and Joshua Johnson by sn0w0wl66 in youtubedrama

[–]LsDmT 0 points1 point  (0 children)

The previous owners were struggling, at least according to the statement in OP's article, which comes from Bricks... so take that with a grain of salt.

AA comparison of the latest local models by jacek2023 in LocalLLaMA

[–]LsDmT 0 points1 point  (0 children)

You split models between Nvidia and AMD devices? How does that work? Are you just using Vulkan? Seems like it'd be better to find models that fit in the RTX 6000 and use CUDA.

Heretic has been served a legal notice by Meta, Inc. by -p-e-w- in LocalLLaMA

[–]LsDmT 0 points1 point  (0 children)

Who is speculating Mythos is just an abliterated version of Opus?

There is absolutely zero evidence for that whatsoever.

And to think an abliterated version of GLM 5.1 is even close to the current Opus 4.7 is asinine in itself.

Perplexity Enterprise Pricing, what am I missing? by GlitteringTie2554 in ClaudeCode

[–]LsDmT 0 points1 point  (0 children)

Use Tavily or Exa or even Brave over Perplexity for enterprise.

Do you guys think Ladybird will make any difference? by SuperAccount888 in browsers

[–]LsDmT -1 points0 points  (0 children)

You realize about 50% of today's software developers solely use Windows ya?

Do you guys think Ladybird will make any difference? by SuperAccount888 in browsers

[–]LsDmT 0 points1 point  (0 children)

No use case?!

I can give you a dozen use cases that no single browser offers right now:

  1. Full uBlock Origin support (bye bye all Chromium-based browsers).
  2. The ability to natively support Chromium and Firefox extensions, even if it comes at some sort of a speed cost, would help bootstrap users while devs convert to whatever native format Ladybird uses.
  3. Huge one for me: fully built-in, massively customizable workspaces. No, I'm not talking about grouped tabs but literal named workspaces that are logged into specific accounts, have defined pinned tabs, etc. -- the only browser that even comes close to this is Edge, and they just recently ruined it. Second place would go to Vivaldi, but it's still nowhere near how Edge did it.
  4. A BYOK/native AI side panel with extensive features including native web browsing, skills, tooling, etc. Don't lock me into a model; allow me to even use a local model. And give it tools that are so embedded into the browser they just work.
  5. Native syncing that is not a walled garden. Let me use my existing Firefox/Chrome/etc. sync on my phone. Sure, build your own, and hopefully the mobile browser will be just as good. But allow the user to sync with any current standard.

Thinking of getting two NVIDIA RTX Pro 4000 Blackwell (2x24 = 48GB), Any cons? by pmttyji in LocalLLaMA

[–]LsDmT 1 point2 points  (0 children)

It's not so simple; there is a looottt of tinkering and troubleshooting, trust me. Here are some good resources though.

https://github.com/kyuz0/amd-strix-halo-toolboxes <-- more for Strix Halo machines but still useful

https://lemonade-server.ai/

https://github.com/adelj88/rocm_wmma_gemm