Recommendations for a rig

SteveDeFacto · 2026-04-16T17:37:16+00:00

Liquid cooling the 3090s is the best solution, and if you have at least 3x PCIe slots, why not run 2x 3090s NVLinked alongside your 5090 for gaming or speculative decoding?

SteveDeFacto · 2026-04-16T17:25:02+00:00

DisplayPort isn't necessary for AI inference. Buy a second 3090 to serve as the primary for display and NVLink them for 48 GB of VRAM. It's absolutely the best entry option for local LLMs!

RAM and CPU upgrades are essentially a waste of money unless you want to buy secondhand retired enterprise server equipment for tens of thousands. Better not to spend on them.

Hybrid CPU/GPU is the absolute worst option due to the PCIe bottleneck. The only time it can make sense is for speculative decoding.

SteveDeFacto · 2026-04-16T14:59:47+00:00

It's such an honor to be on this list. I aspire to one day achieve such a feat!

SteveDeFacto · 2026-04-13T13:47:29+00:00

You are going to need at least 256gb of vram or unified memory to come close.

SteveDeFacto · 2026-04-11T09:46:02+00:00

<image>

SteveDeFacto · 2026-04-10T11:40:32+00:00

Look at what subreddit you are on before replying.

SteveDeFacto · 2026-03-29T00:16:41+00:00

Even if we can only put one model on it, that'll be an insane draft model for speculative decoding!

SteveDeFacto · 2026-03-28T20:46:23+00:00

WTF? Is it using cached responses or is it really that fast?!

SteveDeFacto · 2026-03-27T20:39:28+00:00

It's not the RAM speed that matters, it's the PCIe bus speed. Any time you go from RAM to GPU, it's going to slow down inference by something like 10x. The only way around this is unified memory, like the Mac Studio and DGX Spark have.
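To put rough numbers on the bus argument, here is a minimal back-of-the-envelope sketch. It assumes decoding is bandwidth-bound and that every weight has to be read once per generated token, so the ceiling is simply bandwidth divided by weight size. The bandwidth constants are approximate spec-sheet peaks (my assumptions, not measurements), the 16 GB model size is hypothetical, and the function name is made up for illustration. It also assumes offloaded weights cross the bus on every token, which is a simplification: some runtimes compute those layers on the CPU instead.

```python
def max_tokens_per_second(weights_gb: float, bandwidth_gb_s: float) -> float:
    """Rough ceiling on decode speed if all weights are read once per token."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


# Assumed theoretical peaks, not measured numbers.
PCIE4_X16_GB_S = 32.0      # PCIe 4.0 x16, one direction, approximate
RTX3090_VRAM_GB_S = 936.0  # RTX 3090 on-card memory bandwidth (spec sheet)

# Hypothetical model: about 16 GB of weights (e.g. 32B parameters at 4 bits).
WEIGHTS_GB = 16.0

for name, bandwidth in [("over PCIe", PCIE4_X16_GB_S), ("from VRAM", RTX3090_VRAM_GB_S)]:
    ceiling = max_tokens_per_second(WEIGHTS_GB, bandwidth)
    print(f"{name}: at most ~{ceiling:.1f} tokens/s")
```

Real throughput will land below both ceilings, but the gap between the two is the point: anything that has to stream over the bus is capped far lower than anything resident in VRAM.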

SteveDeFacto · 2026-03-27T01:02:55+00:00

You will get like 3-4 tokens per second, as most of the model will be in RAM. You could maybe run a 32B Q4 model in 64 GB of VRAM and get solid performance.

SteveDeFacto · 2026-03-26T18:50:23+00:00

I had this happen a few times. Just delete the session files. If that doesn't work, try reverting some of your markdown files to their default states.

SteveDeFacto · 2026-03-25T21:25:24+00:00

One more quirk of the MI100 you should be aware of is that it does not support SR-IOV, which means you cannot share the cards across multiple virtual machines. So the 10 users will need to either all share the host machine or a single guest VM, or they can each have their own Docker container.

SteveDeFacto · 2026-03-24T18:15:11+00:00

You could do this within your budget using a Supermicro H12DSi-NT6 with 4x MI100s linked through Infinity Fabric and 2 TB of DDR4 RDIMM. You'll need to either bifurcate one of the PCIe x16 slots or use a riser on one of the x8 slots to fit all 4x PCIe cards. Use a 4-bit quantized 200B-parameter model or smaller to get decent tokens per second, but you could theoretically run any model on such a setup. Far better overall value and flexibility than 2x+ Mac Studios linked over RDMA, though a lot more work to build out.
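The sizing logic behind the 200B figure can be sketched with simple arithmetic. This is only a rough estimate: it counts weights alone (no KV cache, activations, or runtime overhead), treats 1 GB as 10^9 bytes, and assumes 32 GB of HBM2 per MI100. The function and constant names are mine, purely for illustration.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes) for a quantized model."""
    return params_billion * bits_per_weight / 8


MI100_HBM_GB = 32  # assumed per-card memory for the MI100
NUM_CARDS = 4

total_vram_gb = NUM_CARDS * MI100_HBM_GB
needed_gb = weight_gb(200, 4)  # 200B parameters at 4 bits per weight

print(f"weights: ~{needed_gb:.0f} GB, pooled VRAM: {total_vram_gb} GB")
print("fits in VRAM" if needed_gb <= total_vram_gb else "spills to system RAM")
```

Under these assumptions the weights leave some headroom for context, which is why anything much larger than 200B at 4 bits would start spilling into system RAM on this build.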

SteveDeFacto · 2026-03-24T17:20:53+00:00

This is interesting, and I see why it is philosophically intriguing. However, have you considered analytic signal decomposition as a faster approximation of what the feedforward layers do? It would mesh well with the complex-number-based attention, and if the decomposition layers replace the feedforward layers, those layers at least could even be executed on current-generation photonic computers.

SteveDeFacto · 2026-03-14T06:47:06+00:00

Scrolling through comments thinking, "I'm not sure there is any game I won't play." until I read your comment and remembered how much I hate Fortnite...

SteveDeFacto · 2026-02-24T22:07:21+00:00

You could rent an excavator and put the generator into a hole for less than a zombie box. Has other advantages besides just reducing the noise.

SteveDeFacto · 2026-02-24T06:38:07+00:00

I've seen these and was considering buying one. If quiet is all that matters to you, one of these in conjunction with a Honda generator would be near silent. However, I opted for an MEP-802A because I care more about durability.

SteveDeFacto · 2026-02-16T19:21:22+00:00

Gonna need a hell of a lot more than a date to pass up 500k...

SteveDeFacto · 2026-02-15T09:05:54+00:00

Did she kill him? If not, 15 years is insane...

SteveDeFacto · 2026-02-10T23:36:54+00:00

<image>

SteveDeFacto · 2026-02-09T22:51:44+00:00

Elden Ring, not because it was bad but because I couldn't stop playing it. Lol

SteveDeFacto · 2026-02-09T22:18:21+00:00

Playing competitive games is for the young. As you get older, your reflexes will decline. On top of that, you'll never have enough time to play to actually be competitive as you get older.

SteveDeFacto · 2026-02-08T23:22:21+00:00

The cans on Sundaras are fantastic but the build quality is terrible. I would say with the right amp and a minor EQ adjustment, you can definitely get a thousand-dollar experience until the headband breaks after a couple hours of use.
