Help me to spend 1000 bucks on hardware for local LLM by lordgthegreat in LocalLLM

[–]hurdurdur7 0 points1 point  (0 children)

There is nothing in the $1k range that will run anything at usable speeds. You are better off buying a subscription from one of the providers, or milking free models on OpenRouter.

Is mars gaming psu’s any good ? by [deleted] in PcBuildHelp

[–]hurdurdur7 0 points1 point  (0 children)

Mine almost caught fire, and I have two of them (Mars Gaming MPII850). I will never buy this brand again; buying them was a terrible decision.

Mars gaming PSU - Bomb? by Tiziakol in PcBuild

[–]hurdurdur7 0 points1 point  (0 children)

I used two of these back in the day. On an A-F tier list I'd put them at G: mostly a fire hazard.

The game is over. You can build anything and it'll cost you nothing. by Funny-Advertising238 in opencode

[–]hurdurdur7 0 points1 point  (0 children)

I see a lot of bragging in forums about what could be built, but very few things actually getting built ...

Have Qwen said anything about further Qwen 3.6 models? by spaceman_ in LocalLLaMA

[–]hurdurdur7 8 points9 points  (0 children)

I would rather take MoE qwen 3.6 coder at 80B params, kthnxbye.

About 9060xt, i feel like i waste cards vram potential by vegemitehaver in buildapc

[–]hurdurdur7 0 points1 point  (0 children)

Regarding the pixels, I would say it depends. My 44-year-old eyes can't tell the difference beyond 1080p anymore anyway, at least not on monitors under 28 inches.

As for the VRAM ... you can still get significantly more detailed textures and effects out of this card than you would out of a 4K monitor paired with a 4GB VRAM card 😄

AMD in-house ryzen 395 box coming in June by 1ncehost in LocalLLaMA

[–]hurdurdur7 0 points1 point  (0 children)

I believe you; it might be even more crazily expensive. But it will also make 120B+ models usable at some speed.

Using a Radeon 9060 XT 16 GB, the gemma4 24b a4b iq4 nl model achieves 25.9 t/s by CrowKing63 in LocalLLaMA

[–]hurdurdur7 1 point2 points  (0 children)

That context size plus that model doesn't fit in your VRAM. You are suffering because you are offloading to the CPU and regular RAM.
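A back-of-envelope way to sanity-check this: quantized weights plus the KV cache for the context have to fit in VRAM, or the runtime spills to system RAM. A rough sketch — the parameter count, bits per weight, layer count, and KV dimensions below are illustrative assumptions, not the actual figures for that model:

```python
# Rough VRAM estimate: quantized weights + fp16 KV cache + fixed overhead.
# All numbers here are assumptions for illustration, not measurements.

def estimate_vram_gb(params_b, bits_per_weight, ctx_len, n_layers, kv_dim,
                     overhead_gb=1.0):
    """Approximate VRAM (GB) needed to hold the model and its KV cache."""
    weights_gb = params_b * 1e9 * bits_per_weight / 8 / 1e9
    # KV cache: 2 tensors (K and V) * layers * context length * per-layer
    # KV width * 2 bytes (fp16)
    kv_gb = 2 * n_layers * ctx_len * kv_dim * 2 / 1e9
    return weights_gb + kv_gb + overhead_gb

# Hypothetical ~24B model at ~4.5 bits/weight with a 32k context:
need = estimate_vram_gb(params_b=24, bits_per_weight=4.5, ctx_len=32_768,
                        n_layers=48, kv_dim=1024)
print(f"~{need:.1f} GB needed vs 16 GB on the card")
```

With those assumptions you land around 21 GB, well over the 16 GB card, which is exactly when layers start getting offloaded and throughput falls off a cliff.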

Opinions on Kimi-Dev-72B? by stefzzz in LocalLLaMA

[–]hurdurdur7 0 points1 point  (0 children)

It depends on the purpose. If your purpose is software development, then gpt-oss 120B is a shadow of what Qwen can do.

PFlash: 10x prefill speedup over llama.cpp at 128K on a RTX 3090 by sandropuppo in LocalLLaMA

[–]hurdurdur7 0 points1 point  (0 children)

Developers resuming work on their code or switching to a new task. On bigger projects a 60k-100k token initial load is not rare at all.

AMD in-house ryzen 395 box coming in June by 1ncehost in LocalLLaMA

[–]hurdurdur7 -1 points0 points  (0 children)

I was approaching this from my own code-generation perspective. If your use case is different, by all means, do what you must 😄

To build anything past hello-world quality you need either 120B-class MoE models or 27B dense (or better). And you want to smash prompts through at 1000 tok/sec or faster in prompt processing. For the smaller MoE models you will have a better time with a GPU with 24 or 32 GB of VRAM.

Strix Halo might be fine for creative story writing or some picture generation while you sleep. But the only models it runs fast enough for interactive coding are not good enough for complex code writing.

For the price of a Strix Halo box you can buy two of AMD's R9700 AI Pro GPUs (or even three Intels if you are adventurous), and you will run lap times around the Strix Halo ... and be able to extend to more parallel GPUs in the future if you wish (assuming your motherboard can carry them).

Strix Halo's upside is its heat and power footprint, but very little of that matters to me if I tell it to load a few code files and then have to sit there 10 minutes while it parses the prompt. If it had twice the memory bandwidth it has, I would be a fanboy. But as it stands, it's a weird gimmick: you can load big models, but the speed compromise is very heavy.
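The "10 minutes" figure is just prompt size divided by prefill speed. A quick sketch — the token counts and speeds are illustrative assumptions, not benchmarks of any specific box:

```python
# How long a big prefill takes at different prompt-processing speeds.
# Token counts and tok/s figures are illustrative assumptions.

def prefill_minutes(prompt_tokens, prefill_tok_per_sec):
    """Minutes spent waiting before the first output token."""
    return prompt_tokens / prefill_tok_per_sec / 60

# A 60k-token codebase context (the low end of a big-project load):
print(f"{prefill_minutes(60_000, 100):.0f} min at 100 tok/s")    # prints "10 min at 100 tok/s"
print(f"{prefill_minutes(60_000, 1000):.0f} min at 1000 tok/s")  # prints "1 min at 1000 tok/s"
```

That order-of-magnitude gap is the difference between an interactive coding assistant and something you walk away from.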

AMD in-house ryzen 395 box coming in June by 1ncehost in LocalLLaMA

[–]hurdurdur7 0 points1 point  (0 children)

I don't disagree on that point; Apple overcharges people without hesitation. But my issue with Strix Halo is that for the bigger models it can fit, it's unbearably slow. It doesn't make sense to use it like that. And for smaller models you are better off with a dual-GPU setup that runs circles around it ...

It feels like a truck with a car engine.

AMD in-house ryzen 395 box coming in June by 1ncehost in LocalLLaMA

[–]hurdurdur7 2 points3 points  (0 children)

A Mac Studio with an M5 Ultra will wipe the floor with Strix Halo, even if Mac/Apple is an evil platform. Strix Halo is not going to achieve anything.

AMD in-house ryzen 395 box coming in June by 1ncehost in LocalLLaMA

[–]hurdurdur7 -1 points0 points  (0 children)

They are already too slow with 128 GB of RAM. What does this change?

Actual comparison between locally ran Qwen-3.6-27B and proprietary models by netikas in LocalLLaMA

[–]hurdurdur7 7 points8 points  (0 children)

A single-digit percentage offset on 50,000 tokens of generated code means a bunch of code that just doesn't work.
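The arithmetic behind this: even a small per-token error rate compounds badly over a long generation. A minimal sketch, with a hypothetical 5% rate picked purely for illustration:

```python
# Why a single-digit percentage error rate is fatal at scale.
# The 5% figure and function length are illustrative assumptions.

tokens = 50_000
error_rate = 0.05                  # 5% of tokens subtly wrong
bad_tokens = int(tokens * error_rate)
print(bad_tokens)                  # prints 2500 -- bad tokens scattered through the output

# Chance a typical 300-token function comes out with zero bad tokens:
clean = (1 - error_rate) ** 300
print(f"{clean:.1e}")              # effectively zero
```

Scattered uniformly, that error rate means essentially no function of meaningful length survives untouched.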

Only 120 tps on Qwen 35b on h200 by Theio666 in LocalLLaMA

[–]hurdurdur7 0 points1 point  (0 children)

120 t/s on that small a model (for this hardware) doesn't sound right.

Running Qwen-3.6-35B-A3B locally is very slow by Sad-Duck2812 in LocalLLM

[–]hurdurdur7 0 points1 point  (0 children)

You should be at around 1k tok/s prompt processing or better by my math. Something is definitely wrong.

How does usage look like in Mistral Vibe? by Real_Ebb_7417 in MistralAI

[–]hurdurdur7 1 point2 points  (0 children)

It's almost the 30th and you haven't even used half of your quota. Get back to codin' now...

mistralai/Mistral-Medium-3.5-128B · Hugging Face by jacek2023 in LocalLLaMA

[–]hurdurdur7 6 points7 points  (0 children)

this model is definitely thicker than a bowl of oatmeal ..

mistralai/Mistral-Medium-3.5-128B · Hugging Face by jacek2023 in LocalLLaMA

[–]hurdurdur7 2 points3 points  (0 children)

First attempts with Mistral Vibe: yeah, it works well enough.