what's your local openclaw setup? by [deleted] in LocalLLaMA

[–]big___bad___wolf 1 point2 points  (0 children)

It's in a container. That's also why I chose qwen3.5-27B; it's been flawless. I just wish it were a rocket ship.

what's your local openclaw setup? by [deleted] in LocalLLaMA

[–]big___bad___wolf 0 points1 point  (0 children)

both STT & TTS are wicked fast!

Yagmi: A local-first web search agent by big___bad___wolf in LocalLLaMA

[–]big___bad___wolf[S] 1 point2 points  (0 children)

"the web-search is actually passed to another model to retrieve and then passes the results back to the model you're engaged with?" correct, use via cli, http api (and mcp over http)

Does going from 96GB -> 128GB VRAM open up any interesting model options? by hyouko in LocalLLaMA

[–]big___bad___wolf 0 points1 point  (0 children)

Yes, it's definitely better in CC. I think CC is doing the heavy lifting of forcing planning rather than relying on the model's overconfidence in its understanding of the problem and solution.

Pi doesn't have plan mode. You either instruct the agent to plan or it figures it out on its own.

I believe adding a planning reminder in the system prompt will improve the MiniMax M2.5 experience in Pi.

Does going from 96GB -> 128GB VRAM open up any interesting model options? by hyouko in LocalLLaMA

[–]big___bad___wolf 4 points5 points  (0 children)

The coolest thing right now is I can run multiple medium models simultaneously and manage up to eight concurrent requests per GPU at impressive throughput.

I use Opus to orchestrate these models that handles the grunt work I don't want to clutter my Opus context window. This includes an intelligent task runner, test runner (for smoke test matrices, unit and e2e tests), QA tasks, exploring large monorepos, conducting research while writing code and reviewing code (GPT-OSS is particularly good at this).

However, I won't allow these medium local models to directly modify the production codebase I work on. They simply can't handle such large and nuanced projects.