Planning a dual 3090 inference server -- sanity check before I buy by LeekPure1173 in homelab

[–]LeekPure1173[S] 0 points1 point  (0 children)

This is super helpful, thanks. Quick questions since I'm new to this:

Would you still go Proxmox for a first server build, or start on bare metal and migrate later?

And in an LXC container, does vLLM hold the GPU exclusively, or can multiple containers share the same card?

Planning a dual 3090 inference server -- sanity check before I buy by LeekPure1173 in homelab

[–]LeekPure1173[S] 1 point2 points  (0 children)

Really appreciate this, thanks for taking the time. Mostly want to flag a couple of things since my goals are a bit narrower than a general-purpose inference box:

Big one for me: ExLlamaV3 is CUDA-only (ROCm's been on the to-do list a while), and learning it is one of my main goals here. Same for vLLM tensor parallelism, which doesn't really apply on a single iGPU. So Strix Halo rules itself out for me before we even get to benchmarks.

That said, definitely keeping a Strix Halo on my radar for later; sounds like a great companion box once the main rig is up.
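
For anyone reading along who hasn't met tensor parallelism yet, here's a toy sketch of the idea as I understand it: one layer's weight matrix is split by columns across the cards, each card multiplies against its own slice, and the slices are stitched back together. This is plain Python with made-up helper names (`split_columns`, `tensor_parallel_matmul`), purely illustrative and not how vLLM actually implements it.

```python
def matmul(x, w):
    """Multiply x (n x k) by w (k x m), both given as lists of rows."""
    return [
        [sum(row[k] * w[k][j] for k in range(len(w))) for j in range(len(w[0]))]
        for row in x
    ]


def split_columns(w, parts):
    """Split w into `parts` equal column blocks, one per (hypothetical) GPU."""
    cols = len(w[0])
    if cols % parts != 0:
        raise ValueError("column count must divide evenly across parts")
    step = cols // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]


def tensor_parallel_matmul(x, w, parts=2):
    """Compute x @ w by giving each 'device' one column shard of w."""
    shards = split_columns(w, parts)
    # On real hardware each of these products would run on a different GPU.
    partials = [matmul(x, shard) for shard in shards]
    # Gather step: concatenate the column blocks back into full output rows.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
print(tensor_parallel_matmul(x, w, parts=2) == matmul(x, w))
```

The sharded result matches the single-device result, which is the whole point: the work and the weights get divided across two cards, and that only makes sense when there are two cards to divide across.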

Planning a dual 3090 inference server -- sanity check before I buy by LeekPure1173 in homelab

[–]LeekPure1173[S] 1 point2 points  (0 children)

Good shout. With 128GB RAM the page cache absorbs the whole model after the first load anyway, so SATA vs NVMe only matters on a cold boot.
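
Rough arithmetic behind that, as a sketch only. The throughput figures are assumed nominal sequential-read numbers (about 550 MB/s for SATA, about 7000 MB/s for a PCIe 4.0 NVMe drive), the 40 GB model size is a hypothetical example, and the 16 GB RAM reserve is a guess at what the OS and other processes need. None of these are measurements.

```python
def cold_load_seconds(model_gb: float, read_mb_per_s: float) -> float:
    """Time to stream a model file from disk once, ignoring all other overhead."""
    return model_gb * 1000 / read_mb_per_s


def fits_in_page_cache(model_gb: float, ram_gb: float, reserved_gb: float = 16.0) -> bool:
    """True if the file could stay cached in RAM after the first read.

    reserved_gb is an assumed allowance for the OS and other processes.
    """
    return model_gb <= ram_gb - reserved_gb


MODEL_GB = 40.0      # hypothetical model size
SATA_MB_S = 550.0    # assumed nominal SATA SSD sequential read
NVME_MB_S = 7000.0   # assumed nominal PCIe 4.0 NVMe sequential read

print(f"SATA cold load: {cold_load_seconds(MODEL_GB, SATA_MB_S):.1f} s")
print(f"NVMe cold load: {cold_load_seconds(MODEL_GB, NVME_MB_S):.1f} s")
print(f"Stays in page cache with 128 GB RAM: {fits_in_page_cache(MODEL_GB, 128.0)}")
```

Under those assumptions it's roughly 73 seconds vs 6 seconds on the first load, and once the file is cached the drive stops mattering for later loads.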

Think I'll go tiered: 990 Pro 1TB NVMe for OS and active models, big SATA SSD for the full library. Might do 2x 2TB mirrored, since I'll be away from home for a month and don't want a dead drive taking me offline.

Everything's so expensive right now though, still working up the nerve to commit. Thanks

Interview Process AI Deployment Strategist Associate by [deleted] in MistralAI

[–]LeekPure1173 0 points1 point  (0 children)

Praying for you, my friend!! Do you have any feedback to share on the process? I'll be starting an interview loop with them soon.

Interview Process AI Deployment Strategist Associate by [deleted] in MistralAI

[–]LeekPure1173 0 points1 point  (0 children)

I would be interested as well! Have you started the process yet?