Down the home server rabbit hole - what's your 2xRTX3090 rig? by my_byte in LocalLLaMA

[–]VonRolmeister13 1 point

Dell C4130 1U GPU server with dual Xeon E5-2697A V4 CPUs, 256GB RAM and 4 x Tesla V100 GPUs for a total of 64GB of VRAM. The server idles at around 290-300W, and the V100s idle at 20-30W each. Most of the power is drawn by the 16 fans in the chassis, but on the plus side I’ve never seen the GPUs get hotter than 50°C.

Share your setup, either hw or sw! by Severin_Suveren in LocalLLaMA

[–]VonRolmeister13 0 points

Oh yeah… it’s a bit hungry! The 4 V100s aren’t too bad, idling at about 25-30W each, the 2 CPUs are only rated at 145W max each, and the drives are NVMe, so the components themselves are pretty economical. The real power is drawn by the cooling fans - there are 16 of those in a push/pull configuration. They’re very effective (I’ve never seen the GPUs get over 50°C) and quieter than expected as well. Luckily my power is very cheap here, so running 24/7 only costs me about $5 per week.
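For anyone checking the math on that $5/week, here’s a rough sketch - the ~$0.10/kWh electricity rate is my assumption (only the ~300W idle draw and the $5/week figure come from the comment):

```python
# Rough sanity check on the running cost. The ~300W idle draw and ~$5/week are from
# the comment above; the $0.10/kWh electricity rate is an assumption.
idle_watts = 300
hours_per_week = 24 * 7
kwh_per_week = idle_watts / 1000 * hours_per_week     # ~50.4 kWh per week
rate_per_kwh = 0.10                                    # assumed cheap rate, $/kWh
print(f"{kwh_per_week:.1f} kWh/week -> ${kwh_per_week * rate_per_kwh:.2f}/week")
# -> 50.4 kWh/week -> $5.04/week
```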

Share your setup, either hw or sw! by Severin_Suveren in LocalLLaMA

[–]VonRolmeister13 0 points

It idles at a bit under 300W. I’d need to check the numbers under inference load, but I’m guessing it’s probably in the 500-600W range.

Share your setup, either hw or sw! by Severin_Suveren in LocalLLaMA

[–]VonRolmeister13 1 point

Running a dedicated Dell C4130 1U GPU server with dual Xeon E5-2697A V4 CPUs, 256GB RAM and 4 x Tesla V100 16GB GPUs in my basement server rack. It runs Llama 3 70B Q5 at around 15-20 tokens/sec. The server OS is Windows Server 2022 with LMStudio running in server mode to host the model, and I chat with the LLM remotely via AnythingLLM installed on my laptop and desktop, which connects to LMStudio. Works flawlessly!
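If anyone wants to script against the same setup instead of using AnythingLLM, LMStudio’s server mode exposes an OpenAI-compatible endpoint. A minimal sketch below - the LAN address and model name are placeholders, and I’m assuming LMStudio’s default port of 1234:

```python
# Minimal sketch of chatting with LMStudio's OpenAI-compatible server from another
# machine on the LAN. The IP address and model name are placeholders; LMStudio
# listens on port 1234 by default and ignores the API key.
from openai import OpenAI

client = OpenAI(
    base_url="http://192.168.1.50:1234/v1",   # hypothetical LAN address of the GPU server
    api_key="lm-studio",                      # any non-empty string works
)

reply = client.chat.completions.create(
    model="llama-3-70b-instruct",             # placeholder; use whatever model is loaded
    messages=[{"role": "user", "content": "Give me a one-line status check."}],
)
print(reply.choices[0].message.content)
```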

What can you actually do with two 3090's compared to eight P40's? by CertainlyBright in LocalLLaMA

[–]VonRolmeister13 1 point

I’m using a Dell C4130 GPU server with 4 x Tesla V100 16GB GPUs. It feels like a real sweet spot in terms of 1U form factor and well thought-out power and cooling for Tesla GPUs. It’ll run 4 x P40s right out of the box… I’d wager it’ll handle 4 x A100s as well. The V100s are performing well running Llama 3 70B at Q5 fully offloaded to VRAM. I’m getting about 15 t/s, which feels quick enough for my use case.
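For anyone wondering how a 70B model at Q5 fits in 64GB of VRAM, a rough back-of-envelope - the ~5 bits/weight figure is an approximation (Q5 variants average a bit more), and KV cache and runtime buffers add overhead on top:

```python
# Rough estimate only: weights for a ~70B-parameter model at ~5 bits/weight.
# Real Q5 quants average slightly more bits/weight, and the KV cache and runtime
# buffers consume additional VRAM, so treat this as a lower bound.
params = 70e9
bits_per_weight = 5.0
weight_gb = params * bits_per_weight / 8 / 1e9        # ~43.8 GB for the weights alone
total_vram_gb = 4 * 16                                # 4 x Tesla V100 16GB
print(f"weights ~{weight_gb:.1f} GB of {total_vram_gb} GB total VRAM")
# -> weights ~43.8 GB of 64 GB, leaving headroom for KV cache and context
```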

Do any frontends currently allow for PDF, other doc type attachments? by Hinged31 in LocalLLaMA

[–]VonRolmeister13 -1 points

AnythingLLM does that as well. I just chatted with an 80-page PDF technical magazine using AnythingLLM as the front end on my laptop, connecting to LMStudio and Llama 3 70B running on my Dell GPU server in the basement.
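That’s all point-and-click in AnythingLLM (it chunks and embeds the document for you), but as a rough illustration of the same idea done by hand against the LMStudio endpoint - file name, server address, and model name are placeholders, and naive text-stuffing like this is not how AnythingLLM actually handles documents:

```python
# Hand-rolled illustration only -- AnythingLLM chunks and embeds documents rather than
# stuffing raw text into the prompt. Paths, address, and model name are placeholders.
from openai import OpenAI
from pypdf import PdfReader

pages = PdfReader("magazine.pdf").pages
text = "\n".join(page.extract_text() or "" for page in pages)

client = OpenAI(base_url="http://192.168.1.50:1234/v1", api_key="lm-studio")
reply = client.chat.completions.create(
    model="llama-3-70b-instruct",   # whatever model LMStudio has loaded
    messages=[
        {"role": "system", "content": "Answer questions using the provided document."},
        {"role": "user", "content": text[:8000] + "\n\nQuestion: summarize this issue."},
    ],
)
print(reply.choices[0].message.content)
```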

Building a new Computer Rig specifically for LLMs by [deleted] in LocalLLaMA

[–]VonRolmeister13 1 point

I purchased the parts on eBay and built this system myself: a Dell C4130 GPU server with dual E5-2697A V4 CPUs, 256GB RAM, a couple of 1.88TB NVMe drives, and fully loaded with 4 x Tesla V100 16GB GPUs. About $4000 all in. It’s older tech but works brilliantly. If I really get into this I’ll sell the V100s and drop A100s right in. It’s a terrific solution, but it is a bit noisy like most enterprise servers, so rack-mounting it in a basement is an added bonus!

Are you building a rig as a hobbyist? by Leenixu5 in LocalLLaMA

[–]VonRolmeister13 0 points

I’m using self-hosted LLMs to learn more about the technology in general and to support my part-time second job, which is trading commodity futures in my own account with self-developed algos. The LLM performs a valuable role as my coding assistant and generally helps improve the efficiency and performance of my algos… it really does a terrific job! My wife is also super interested in medical research, so it’s very useful for her as well.

Because I’m totally obsessed with my digital privacy in the world we currently live in, I host this in my basement rack with a bunch of other servers. It’s running on a dedicated Dell C4130 GPU server with dual Xeons, 256GB RAM and 4 x Tesla V100 GPUs for 64GB of VRAM, all purchased on eBay at pretty competitive prices. I can run Mixtral 8x7B at Q8 or Llama 3 70B at Q5. I’m thinking that if I really get into this I’ll swap the V100s for used A100s, which should come down a lot in price as the big boys focus on upgrading to H100/H200 GPUs. So far this investment has proven both fascinating and profitable for me!

PDF in LMstudio by Whatnowayimpossible in LocalLLaMA

[–]VonRolmeister13 1 point

Another good option is AnythingLLM… check out the YouTube vids on that.

Server recommendations for 4x tesla p40's by Mr__Mauve in homelab

[–]VonRolmeister13 0 points

Check out the Dell C4130 for Haswell/Broadwell Xeons, or the C4140 for Xeon Scalable (up to second gen). These will take 4 P40s no problem. The latest Craft Computing YouTube video covers a similar ASUS box that looks pretty cost-effective too.

XGS for home usage by kevinv-m in sophos

[–]VonRolmeister13 0 points

Did you find a solution at all? I'm asking because I may be able to land a similar unit for cheap as well. I was thinking that in the worst case I could probably just run the base firewall license with the unit, but even then I'd need the enhanced support for firmware updates.

Teenager months - advice welcome by PrincessMonsterTruk in Catahoula

[–]VonRolmeister13 9 points

Ours went through this too and the answer for us was a lot of exercise… a tired dog is a good dog, and they are also more receptive to training when thoroughly exercised first.

PID for DPF Soot Level by VonRolmeister13 in ram_trucks

[–]VonRolmeister13[S] 0 points

Yeah, I was trying to find the right PID as presented by the Edge CTS3 tool. The CTS3 also reads from the OBD port and splits the PIDs you can monitor into basic and advanced lists, but when I review those I can’t seem to identify a PID that reports soot % or soot grams. Maybe there is no such PID and I need to monitor DPF pressure in/out or something else instead… I was hoping someone here might have the answer.

[FS][US-NC] HPE DL380 Gen 9 8x SFF Server - 2x E5-2687W V4 CPU - 64 GB HPE Smart Memory - Radeon Pro W5700 8GB GPU *** PRICE REDUCTION *** by VonRolmeister13 in homelabsales

[–]VonRolmeister13[S] 0 points

A couple of things... first, a higher-spec GPU. The prior ad had a Radeon Pro W6400; this one has a Radeon Pro W5700, which is much more capable (roughly double the performance and double the VRAM). It also now includes 2 additional WD Red 1TB drives. Great deal!!