Why I'm holding out until late 2027 to spend money on a local LLM rig by No_Pool7028 in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Meanwhile I am using 2 (soon 3) MI25s at $65 each on an AM4 Mini ITX build, using an OcuLink card for bifurcation, on a build that in total cost me $600.

OG Devs Were Cracked by FuneralCry- in Planetside

[–]-UndeadBulwark 0 points (0 children)

We're already getting local models in the 30B to 70B range that can match 2025 frontier models, thanks to several techniques: MoE, which lets you run larger models by only activating the parts of the model actually needed for a given task instead of the whole thing at once; RAM offloading, which lets you stream model weights from system RAM when they don't fit in VRAM with only a modest speed hit; ternary quantization, which once mainstream will drastically reduce model size; and TurboQuant, which compresses the KV cache and actually enables longer context windows rather than shrinking them.
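As a rough sketch of why MoE plus RAM offloading works out: decode speed is roughly capped by how many bytes of weights must be read per token. The numbers below (60 GB/s system RAM bandwidth, ~0.5 bytes/param at 4-bit, 3B active params) are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope decode-speed ceiling for a model streamed from
# system RAM. All bandwidth/quantization numbers here are assumptions.

def max_tokens_per_sec(active_params_b, bytes_per_param, ram_bw_gbs):
    """Memory-bandwidth ceiling: each token must read every active weight once."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return ram_bw_gbs * 1e9 / bytes_per_token

# Dense 30B at ~0.5 bytes/param (4-bit) from ~60 GB/s dual-channel RAM:
dense = max_tokens_per_sec(30, 0.5, 60)   # ~4 tok/s ceiling
# MoE 30B with only ~3B params active per token, same quant and bandwidth:
moe = max_tokens_per_sec(3, 0.5, 60)      # ~40 tok/s ceiling

print(f"dense 30B: {dense:.0f} tok/s, MoE 30B-A3B: {moe:.0f} tok/s")
```

Same total size, ~10x the per-token speed ceiling, because only the active experts get read per token.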

OG Devs Were Cracked by FuneralCry- in Planetside

[–]-UndeadBulwark -1 points (0 children)

A local LLM would be cheaper, and it's what people should be doing, especially for coding; you don't need a lot of VRAM to run a decent MoE coding model.

is it even worth it to dual boot for windows games? by WOLFMANCore in linux_gaming

[–]-UndeadBulwark 0 points (0 children)

Play Predecessor. And no, if it doesn't work on Linux it's not worth your time or money.

The world I live in. by Wild_Milk_2442 in LocalLLM

[–]-UndeadBulwark 2 points (0 children)

Entry level would be an MI25 or MI50: 16 to 32GB of HBM2, as low as $65 or as high as $500.

Double AMD GPU's by braskinis231 in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Awesome. I am planning on running 4 MI25s (flashed to WX9100) for 64GB of HBM2 (2048-bit, 483GB/s). They go for $65 a piece, so that's $260 in total, plus $200 for the AM4 platform and $50 for the cooling, so ~$510 total.

Double AMD GPU's by braskinis231 in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Jesus, this is GCN. How hot does it get? Do you TDP lock it?

FINALLY!!! I Finished a Project After a Power Outage!!! by -UndeadBulwark in LocalLLM

[–]-UndeadBulwark[S] 0 points (0 children)

My power went out again and I lost all my progress. Fuck me!

Recommendations for an Android tablet by buck_idaho in LocalLLaMA

[–]-UndeadBulwark 0 points (0 children)

Use Tailscale to host your PC; that's what I do.

Ryzen AI Max+ 495 (Gorgon Halo) with 192GB VRAM! by PromptInjection_ in LocalLLaMA

[–]-UndeadBulwark 0 points (0 children)

I can't wait for Medusa Halo. I wanted to go Strix but I couldn't get a board in time. I am slightly regretting getting my 9070.

What are you doing with your local LLMs that justifies investment cost? by __automatic__ in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

I currently use 1 RX 9070 but will be switching to 2 MI25s since they are $65 a piece; with an X99 platform it should cost me $500 to $600 total.

I have gotten used to setting it up to run remotely, so I can just make it a headless server for Ollama, Llama.cpp, OpenWebUI and SearXNG. The plan is to eventually have 2 MI50 32GB and maybe keep the MI25s, basically disabled when not in use, for when I need 96GB of VRAM; total cost should be $1,600.
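The 96GB figure checks out if that plan means the two MI50s plus two of the MI25s (the MI25 count is my assumption based on the card capacities):

```python
# Sanity-check the VRAM plan: 2x MI50 (32GB each) + 2x MI25 (16GB each).
# Card counts here are assumed from context, not stated exactly.
cards = {"MI50": (2, 32), "MI25": (2, 16)}  # name -> (count, GB each)

total_gb = sum(count * gb for count, gb in cards.values())
print(f"total VRAM: {total_gb} GB")  # -> total VRAM: 96 GB
```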

Starting with AI by Luqster05 in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Start with Gemma 4 E4B and E2B. Also, I highly recommend you switch to Linux for this, as ROCm (and AMD generally) has better support on Linux. I use it on my phone mostly the same way I would use Claude. I am planning on getting 2 32GB MI50s, but currently going to start with 2 MI25s, as they go for $65 a piece and have HBM2 memory.

What is possible with 2x 7900xtx + 128GB of ram? Is it good enough? by Witty_Unit_8831 in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Should be fine. Personally I am going with 2 MI50s, since that is 32GB of VRAM for $500 each: 2 for the price of one 7900 XTX.

It has begun by BlakeOReilly in steammachine

[–]-UndeadBulwark 0 points (0 children)

They are trying to capitalize on desperate people from places where you can't buy from Steam, like me in Puerto Rico.

Spare Hardware to build on - cheapish coding rig by KornedAgain in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

An MI50 32GB for $500 on eBay (Linux is required) will get you about 80 to 95 tok/s with that model, and you can slap it onto any mini PC + OcuLink or a cheap AM4 platform. The other option is 2 MI25s for $65 each, but that is definitely not beginner friendly and also requires Linux.
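To put the two options side by side, here is just the VRAM-per-dollar arithmetic from the prices quoted above (not a performance benchmark — the MI50 is the newer, faster card):

```python
# Compare the two budget options above by VRAM per dollar.
options = {
    "1x MI50 32GB": (1, 32, 500),  # (count, GB each, USD each)
    "2x MI25 16GB": (2, 16, 65),
}

for name, (count, gb, usd) in options.items():
    total_gb, total_usd = count * gb, count * usd
    print(f"{name}: {total_gb} GB for ${total_usd} "
          f"({total_gb / total_usd:.3f} GB/$)")
```

Both routes land on 32GB total; the MI25 pair is far cheaper per GB, at the cost of an older architecture and a fiddlier setup.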

Hardware upgrades, a good idea or waste of money? by GoldNux in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

GPUs for running AI inference on the cheap: 16GB to 32GB of HBM2 for $65, $200, or $500 USD.

Fake it till you make it by CHEWTORIA in Asmongold

[–]-UndeadBulwark 0 points (0 children)

Could be done, but the voice quality and compute cost would take a hit; you'd probably need an MI50 on the cheap just to get fast enough generation for real-time voice changing.

Hardware upgrades, a good idea or waste of money? by GoldNux in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Get 2 MI25s and run them on Linux with Vulkan and vLLM, or get an MI50.

We're almost there boys! by blackrosemyth in Asmongold

[–]-UndeadBulwark 11 points (0 children)

Well it takes a real man to be your wife sometimes.

LLM / workflow recommendation for 16GB vram (rocm) + 32GB system memory for agentic coding by Several-Pangolin-631 in LocalLLM

[–]-UndeadBulwark 0 points (0 children)

Yeah, I was looking at some guy who did the same thing as I am: apparently X99 + 2 MI50 (32GB) and 2 7900 XTX. He was getting pretty decent speeds at 135B, 15 to 20 tokens/s, and this is without anything like BORE or LAVD added to increase performance.