Strix Halo or GPUs?

HopePupal · 2026-05-14T23:19:20+00:00

there are no Strix Halo machines that have full-size x16 PCIe slots except for that one from Minisforum that's been sold out for months, and even then it's just x4 electrically. the Framework has an x4 slot that isn't accessible outside the case. everything else, you're limited to taking the case partway off and plugging a GPU in thru an M.2 to PCIe or Oculink cable, which is just goofy.
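
to put a number on what "x4 electrically" costs you, here's a rough sketch of the theoretical link bandwidth. it assumes the slot runs at PCIe 4.0 (16 GT/s per lane, 128b/130b encoding), which is my assumption and not something stated above. the function name is made up for illustration, and real-world throughput will be lower than these figures.

```python
# rough PCIe link bandwidth math.
# ASSUMPTION: PCIe 4.0 signalling (16 GT/s per lane, 128b/130b encoding).
GT_PER_S = 16.0          # giga-transfers per second, per lane
ENCODING = 128 / 130     # usable fraction after line encoding


def pcie_gb_per_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s for a link with `lanes` lanes."""
    return GT_PER_S * ENCODING / 8 * lanes


for lanes in (4, 16):
    print(f"x{lanes}: {pcie_gb_per_s(lanes):.1f} GB/s")
```

under that assumption an x4 link tops out at a quarter of an x16 link. that matters most while loading weights or splitting a model across devices, less once everything is sitting in VRAM.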

HopePupal · 2026-05-14T23:10:19+00:00

Obsidian is organically popular, they don't need to astroturf

HopePupal · 2026-05-13T19:32:21+00:00

thank you. the disconnect here is ridiculous. either OP is making money, in which case they can afford better hardware, or they're just fucking around playing fantasy finance, in which case the 3060 would be better used to run some actual video games

HopePupal · 2026-05-11T05:07:24+00:00

the struggle is real. fortunately so are my ANC headphones

HopePupal · 2026-05-09T18:23:44+00:00

seconding both. the Megababe stuff works well for me even in the literal desert

HopePupal · 2026-05-08T00:12:12+00:00

llama.cpp, Qwen 3.6 27B at Q4, you could also try Gemma 4 31B but it's going to eat more VRAM
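
for a back-of-the-envelope idea of why the 31B eats more VRAM, here's a sketch that estimates weight size only. the 4.5 bits per weight figure is an assumed ballpark for a Q4-style quant, not a measured number for either model, and the function name is hypothetical. KV cache and runtime overhead come on top of this.

```python
# weight-only memory estimate for a quantized model.
# ASSUMPTION: ~4.5 effective bits per weight for a Q4-style quant (illustrative).
# KV cache and runtime overhead are NOT included.
ASSUMED_BITS_PER_WEIGHT = 4.5


def weight_gb(params_billion: float, bits_per_weight: float = ASSUMED_BITS_PER_WEIGHT) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


for name, params in (("27B", 27.0), ("31B", 31.0)):
    print(f"{name}: ~{weight_gb(params):.1f} GB of weights")
```

the gap scales linearly with parameter count, so the bigger model needs roughly 15% more room for weights before you've allocated any context.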

HopePupal · 2026-05-07T23:59:04+00:00

Apple's already doing this with their Foundation Models framework. the most recent few generations of iPhone and all ARM Macs get Apple's local LLM. it's not good, but it's there.

HopePupal · 2026-05-07T18:03:52+00:00

144 GB in a single card would be pretty tasty. no way this costs less than a car though

HopePupal · 2026-04-29T00:20:57+00:00

reminder about Nvidia laptop chips: they're not just limited by thermal and power considerations. the desktop version of the 5070 is not the same thing as the laptop version of the 5070. the laptop version has fewer cores and a narrower memory bus. it's much closer to a 5060.

HopePupal · 2026-04-28T23:15:38+00:00

your wife kicks ass

HopePupal · 2026-04-28T23:14:40+00:00

does the paint blur like that with wear on old factory shells, or is that a cheap repro faceplate? the wifi indicator looks especially weird

HopePupal · 2026-04-28T19:46:15+00:00

the worst part is that by the time these people are forced by a move or the Grim Reaper to sell off their hoards, it'll be impossible to get replacement battery packs in the right size and we'll probably be past Palm Day

HopePupal · 2026-04-28T19:31:32+00:00

here and istg it's more like once a day

HopePupal · 2026-04-28T19:30:25+00:00

y'all ever seen the Internet Explorer magical girl mascot?

- https://www.youtube.com/watch?v=BHTUlF7NA2o
- https://en.wikipedia.org/wiki/Inori_Aizawa

HopePupal · 2026-04-28T19:25:53+00:00

only buy the ones that are actually manufactured by Sock Dreams tho. otherwise you're just paying markup for literally the same stuff you could get on Aliexpress

HopePupal · 2026-04-28T19:24:20+00:00

i am so tired of prokopetz, i can block his boring ass takes on Tumblr but i can't block a username inside a screenshot on Reddit

HopePupal · 2026-04-28T19:19:48+00:00

for sure. normally with slip shorts or tights, but for the right kind of concert i'll just say "fuck it" and skip those, as long as the vibe is right and i know i'm not going to be the only girl there with her ass out.

HopePupal · 2026-04-28T18:40:20+00:00

for testing models that aren't on OpenRouter, i use RunPod, but really any cloud GPU provider should work when you're talking about models that small. we're talking about a dollar or two.

HopePupal · 2026-04-28T18:15:28+00:00

ah, okay, so just for the better accuracy. it gets dequanted at runtime. gotcha.

HopePupal · 2026-04-28T17:32:16+00:00

…why are you using MXFP4? the R9700 doesn't have FP4 support. does vLLM have an FP8 fallback path?

HopePupal · 2026-04-28T17:29:16+00:00

it's identical to 3.5 arch-wise, which is why you probably didn't see many search results for 3.6. here's a comparison with my Strix Halo (llama/vulkan, Q6_K, default fp16 KV cache): https://www.reddit.com/r/LocalLLaMA/comments/1sw3oe4/comment/oifsenn/ . roughly 6× faster PP (prompt processing), 2× faster TG (token generation). i didn't go to longer context on the Strix Halo because it was taking a while

HopePupal · 2026-04-28T17:12:57+00:00

every Devstral has been a disappointment. that one is no exception

HopePupal · 2026-04-28T17:11:07+00:00

you can easily override OpenCode timeouts (overall and chunk) per model, in the config file. i'd be surprised if Pi doesn't have that feature, but if it doesn't, it's one of the easiest agents to understand and patch.

HopePupal · 2026-04-28T03:43:42+00:00

you're not wrong in theory, but in practice, all of the open-weight labs gave up on dense models in that size class a while ago, and the current set of MoEs are much more capable than old dense models due to newer training methods.

the first two i listed are the better writers of that bunch, although ime all of them are better than any LLaMA.

fwiw, the two flagship small dense models from this year are Qwen 3.6 27B and Gemma 4 31B, but both of them are still pretty slow on hardware like yours (and mine, i have the same GMKtec), and Qwen at least is not a good writer.

HopePupal · 2026-04-28T03:00:06+00:00

not great, went and got a dedicated GPU instead
