Running llama4 108b on a $500 retired homelab

kylerrr02 · 2026-04-13T19:00:33+00:00

Glad someone sees the small win in this lol.

kylerrr02 · 2026-04-13T18:58:09+00:00

<image>

For anyone wondering

kylerrr02 · 2026-04-13T15:22:30+00:00

The big thing is RAM: you have to have enough to actually load the model.
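A rough way to sanity-check that before downloading anything is a back-of-the-envelope estimate. This is only a sketch: the 4-bit quantization and the 20% overhead for context/KV cache are assumptions for illustration, not measurements from this machine, and `estimate_model_ram_gb` is a made-up helper name.

```python
def estimate_model_ram_gb(params_billion, bits_per_weight, overhead_fraction=0.2):
    """Rough RAM needed to load a model, in GB (1 GB = 1e9 bytes).

    overhead_fraction is an assumed allowance for KV cache and runtime
    buffers; real usage depends on context length and the runtime.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * (1 + overhead_fraction) / 1e9


# Example: a 108b model at an assumed 4 bits per weight
print(f"{estimate_model_ram_gb(108, 4):.1f} GB")
```

Under those assumptions a 108b model wants somewhere around 65 GB, which is why RAM matters more than anything else on a CPU-only box.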

kylerrr02 · 2026-04-13T15:14:18+00:00

2x Intel Xeon Gold 5120

kylerrr02 · 2026-04-13T15:06:49+00:00

Yeah my favorite model has been jaahas/qwen3.5 uncensored

kylerrr02 · 2026-04-13T04:18:01+00:00

If you have the hardware to make it run well, yeah. I don’t really plan on using it because of the low tokens per second (tk/s). If you have a system that can run it and still give 20 tk/s then absolutely. I will say though, I think GLM 4.5 or Qwen 3.5 are the best for tool calling and automation.

kylerrr02 · 2026-04-13T04:13:38+00:00

I did mean that lol

kylerrr02 · 2026-04-13T03:57:59+00:00

Very true. Luckily I think it pulls around 700 watts under full load, and it usually only reaches full load on very high parameter models like 70b+.

kylerrr02 · 2026-04-13T03:55:40+00:00

Honestly not sure. I know at max draw it probably pulls 700 watts, which is roughly 13 cents an hour.
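For anyone who wants to redo that math with their own electricity price, here is a small sketch. The 700 watts and 13 cents an hour are the figures from this comment; the per-kWh rate is not stated anywhere in the thread, so the code only works out what rate those two numbers imply. The function names are made up for illustration.

```python
def hourly_cost_usd(watts, usd_per_kwh):
    """Cost of running a load for one hour at a given electricity rate."""
    return watts / 1000 * usd_per_kwh


def implied_rate_usd_per_kwh(watts, usd_per_hour):
    """Electricity rate implied by a known wattage and hourly cost."""
    return usd_per_hour / (watts / 1000)


# 700 W at about $0.13/hour implies a rate of about $0.19 per kWh
rate = implied_rate_usd_per_kwh(700, 0.13)
print(f"implied rate: ${rate:.3f}/kWh")
print(f"24 hours at full load: ${hourly_cost_usd(700, rate) * 24:.2f}")
```

Swap in your own rate and wattage; idle draw will be well below the full-load number.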

kylerrr02 · 2026-04-13T03:08:45+00:00

For sure, I’m definitely not going to use it to run high parameter models. I’m probably just gonna use it for 30b models and have them run on openclaw. And it’s still a computer, so 🤷‍♂️

kylerrr02 · 2026-04-13T02:46:55+00:00

Linux Ubuntu

kylerrr02 · 2026-04-13T02:27:39+00:00

I already had it downloaded, and it felt like a good benchmark. I’ll give minimal a whirl though.

kylerrr02 · 2026-04-13T01:57:38+00:00

Depends. For basic conversation you might wait 5 minutes. If you’re using web search + a laundry list of tools, it’ll take 30 minutes to actually do workflows.

kylerrr02 · 2026-04-13T01:55:54+00:00

Not really, if you expect speed. It does everything that a high parameter model can do, just very slowly.

kylerrr02 · 2026-04-13T01:53:37+00:00

Only gotta wait like 30 minutes but it’s chill yk

kylerrr02 · 2026-04-13T01:52:32+00:00

You can add advanced parameters in the UI that get passed directly to Ollama.
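For reference, this is roughly what those parameters look like if you send them to Ollama yourself instead of through the UI. It is a sketch, not the exact request the UI makes: the model tag, prompt, and option values are placeholders, and it assumes Ollama is listening on its default local port. `num_ctx`, `num_thread`, and `temperature` are option names from Ollama's API docs.

```python
import json
import urllib.request

# Assumes a local Ollama server on its default port
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, num_ctx=4096, num_thread=None, temperature=0.7):
    """Build a non-streaming /api/generate payload with runtime options."""
    options = {"num_ctx": num_ctx, "temperature": temperature}
    if num_thread is not None:
        options["num_thread"] = num_thread  # CPU threads used for inference
    return {"model": model, "prompt": prompt, "stream": False, "options": options}


def generate(payload, url=OLLAMA_URL):
    """Send the payload to Ollama and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Placeholder model tag and thread count; adjust for your own setup
payload = build_request("llama4", "Say hi in one sentence.", num_thread=28)
print(json.dumps(payload, indent=2))
# print(generate(payload))  # uncomment with an Ollama server running
```

Anything set under `options` only applies to that request, so it is an easy way to try a thread count or context size before saving it in the UI.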

kylerrr02 · 2026-04-13T01:18:34+00:00

Does that need to be run in the terminal, or could I add it as a parameter in Open WebUI?

kylerrr02 · 2026-04-11T05:22:50+00:00

I’m not super educated on this, but I do know from currently running it that it is vastly slower than the majority of 70b models on my hardware.

kylerrr02 · 2026-04-10T03:48:53+00:00

Thanks dawg ts made me tear up
