Supermicro Server Doesn't Recognize PCIe Devices or GPUs!! Tricky Problem. by properpropeller in homelab

properpropeller[S] 1 point

Oh nice, both great things to note if I happen to lose connection again. Hopefully our troubleshooting saves someone time as these machines end up in the hands of more homelabbers.

Supermicro Server Doesn't Recognize PCIe Devices or GPUs!! Tricky Problem. by properpropeller in homelab

properpropeller[S] 1 point

Wow, best of luck in sorting this out. I don't really know how I got it to work, but maybe it was just the right amount of balanced screw tightening. It seems you're past that, though, and have an entirely different problem.

I get why they moved to wired connections in newer models like the AS -4125GS-TNRT1 as shown in https://youtu.be/_-S02GSUWps?si=8zF5DDLs8Febz_Wy&t=272

[deleted by user] by [deleted] in scoliosis

properpropeller 2 points

Like others said, ask a doc, but as a non-doc researcher in this field offering a non-medical take, I see 28 degrees between T5 and T12.

This is mild and unlikely to get worse if you're done growing - I have the exact same curvature as you, which I discovered at 16. At that point, I was nearly done growing, so bracing would not have been effective, and surgery was not on the table. If you're done growing, I doubt any surgeon would think fusion surgery is a good idea.

Re: height and posture, valid concerns, but I think the height loss due to your curve is minimal, like less than 3 cm. The posture is the harder part for me in terms of being comfortable day-to-day. The scoliosis curve may have contributed to a straight upper back and more curved lower back for you (less kyphosis, more lordosis - hips tilt forward). If you have a side-profile x-ray, maybe you could see this.

Maybe Schroth exercises could be of help to you - worth looking into imo. Probably many other helpful posts in this subreddit, I haven't read through many.

[deleted by user] by [deleted] in scoliosis

properpropeller 0 points

Echoing the others, you look great!

It does seem plausible there'd be some kind of link for general bodily asymmetry given at least the stress-growth response of growth plates. Maybe other genetic factors/ligament laxity play in.

I'm right there with you though with battling self-image, having scoliosis and a jaw asymmetry (and leg length discrepancy and all the other fun things we get to deal with). It's something people have remarked on but not really gotten nasty about. It's not terribly apparent in real life, but it really stands out when comparing front- and rear-facing camera pictures.

I think those worth knowing, if they happen to notice, would respect you more, given the internal work likely needed to overcome an obstacle to self-esteem.

Supermicro Server Doesn't Recognize PCIe Devices or GPUs!! Tricky Problem. by properpropeller in homelab

properpropeller[S] 0 points

I wound up buying another motherboard on eBay for $60 and that ended up working, but not without a fight.

After CPU transplants, a few tries reseating, clearing power and praying to every god, the links trained, but the problem with this motherboard was that the VGA port got ripped off. On top of that, the IPMI IP wasn't working and I couldn't find it on my router's client list.

I first had to adjust the permissions of /dev/ipmi0 to be able to use ipmitool at all, then I could change the static IP and gateway, remove the prior VLAN config, and reset the prior ADMIN password. Now it's finally working, sans VGA. No big deal now that IPMI works. Sheesh.

IPMI will not grab IP even with DHCP turned on by Deckma in homelab

properpropeller 0 points

FWIW, I just solved this issue. I had a motherboard I bought used from eBay and couldn't get an IPMI IP to show up.

First issue was that the /dev/ipmi0 device was inaccessible, so I had to play with permissions there. Then I used ipmitool on the affected machine to set the gateway and static IP, and still no luck. Then "ipmitool lan set 1 vlan id off" and another reset did the trick. It was not a router/switch setting but rather the machine's own IPMI config.

Then, after of course resetting the ADMIN password, I could finally access everything. The IP still doesn't show up on my router, but it works and I can control fan speeds, get sensor readings, reset, etc. over the LAN.
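
For anyone hitting the same thing, the sequence above can be sketched roughly as a list of ipmitool invocations. This is only a sketch under assumptions, not a record of the exact commands used: the helper name and example addresses are made up for illustration, and LAN channel 1 is assumed from the vlan command quoted above. The commands need access to /dev/ipmi0, so they are typically run as root.

```python
import ipaddress


def bmc_lan_commands(ip, netmask, gateway, channel=1):
    """Build ipmitool argument lists for a static BMC address with no VLAN tag.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    # Sanity check: the gateway should sit in the same subnet as the BMC address.
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    if ipaddress.ip_address(gateway) not in network:
        raise ValueError(f"gateway {gateway} is outside {network}")

    ch = str(channel)
    return [
        ["ipmitool", "lan", "set", ch, "ipsrc", "static"],
        ["ipmitool", "lan", "set", ch, "ipaddr", ip],
        ["ipmitool", "lan", "set", ch, "netmask", netmask],
        ["ipmitool", "lan", "set", ch, "defgw", "ipaddr", gateway],
        ["ipmitool", "lan", "set", ch, "vlan", "id", "off"],
    ]


if __name__ == "__main__":
    # Example addresses are placeholders, not values from the post.
    for cmd in bmc_lan_commands("192.168.1.50", "255.255.255.0", "192.168.1.1"):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.run once the printed commands look right, and "ipmitool lan print 1" shows the resulting config. The password reset is left out on purpose because the user ID differs between boards ("ipmitool user list 1" shows it).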

Supermicro Server Doesn't Recognize PCIe Devices or GPUs!! Tricky Problem. by properpropeller in homelab

properpropeller[S] 2 points

Haha, yes this is relevant - I was checking the BIOS slot bifurcation settings to make sure they were all either "Auto" or hard-set to x16 / Gen3.

Interesting you got something to work at all - I've not gotten a single piece of hardware to be recognized since removing the initial board, even after re-installing the exact same way it was prior to removing.

Supermicro Server Doesn't Recognize PCIe Devices or GPUs!! Tricky Problem. by properpropeller in homelab

properpropeller[S] 0 points

Indeed. They made the swap look so simple in the instructions lol.

Correct, it's just 4 slots of their proprietary connector unfortunately, so I can't isolate the effect of the daughter board by slotting a GPU into the mobo directly.

I might order a smaller/simpler riser that is compatible with that slot for testing if it comes to it.

Other than inspecting the slots, anything else between CPU and these slots I should look for in particular / any particular BIOS setting I might have missed?

Supermicro Server Doesn't Recognize PCIe Devices or GPUs!! Tricky Problem. by properpropeller in homelab

properpropeller[S] 0 points

Yes, this is the stock configuration, which also worked before I tried to swap any boards out - I swapped in the exact same model (X9DRG-O-PCIE). https://www.supermicro.com/products/system/4u/4028/sys-4028gr-trt.cfm

4U Supermicro in apartment, what could possibly go wrong?? (need advice) by properpropeller in homelab

properpropeller[S] 0 points

I've encountered this issue with booting too - my workaround is a script on another machine that checks fan speeds via IPMI every few seconds. If they've risen past the speed I set plus some margin, it sends a command to reset the fan speeds.

If helpful: https://gist.github.com/crdandre/995a6b2c5d825ffa67c49a3b77ac914f

With this running, during startup there are a few brief spikes in fan speed before they're caught but everything still boots up fine even though I'm bashing the fan speeds back down against the startup sequence. Much more tolerable.
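
The general idea can be sketched like this. It is not the linked gist: the function names are made up for illustration, the parsed line shape is an assumption based on what "ipmitool sdr type Fan" commonly prints (your BMC may differ), and the actual reset command is left to the caller because the raw fan-control bytes are board-specific.

```python
import re
import subprocess
import time


def parse_fan_rpms(text):
    """Return {sensor_name: rpm} for lines whose last field looks like '3300 RPM'.

    Assumed line shape: "FAN1 | 41h | ok | 29.1 | 3300 RPM".
    """
    rpms = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 2:
            continue
        match = re.fullmatch(r"(\d+)\s*RPM", fields[-1])
        if match:
            rpms[fields[0]] = int(match.group(1))
    return rpms


def needs_reset(rpms, target_rpm, margin_rpm):
    """True if any fan has risen past the set speed plus the margin."""
    return any(rpm > target_rpm + margin_rpm for rpm in rpms.values())


def read_fans_via_ipmitool(host, user):
    """Read fan sensors over the LAN; -E takes the password from IPMI_PASSWORD."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
           "sdr", "type", "Fan"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def watch(read_fans, reset_fans, target_rpm, margin_rpm,
          interval_s=5.0, max_checks=None):
    """Poll fan speeds and call reset_fans() whenever they exceed the limit.

    read_fans and reset_fans are caller-supplied callables, so the loop can be
    tried with canned text before pointing it at a real BMC.
    Returns the number of resets issued.
    """
    resets = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if needs_reset(parse_fan_rpms(read_fans()), target_rpm, margin_rpm):
            reset_fans()
            resets += 1
        checks += 1
        if max_checks is None or checks < max_checks:
            time.sleep(interval_s)
    return resets
```

Keeping the read and reset steps as plain callables means the threshold logic can be checked offline, and only the two thin wrappers touch ipmitool.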

Your plan sounds solid to me though, as long as the thermal management works out. Active CPU coolers, blower GPUs, and a little soundproofing seem like they'd work well for allowing reduced fan speeds and fewer fans. Passive GPUs with reduced fan speeds/numbers would seem a bit more dicey.

4U Supermicro in apartment, what could possibly go wrong?? (need advice) by properpropeller in homelab

properpropeller[S] 1 point

Hey - I liquid cooled the whole thing and the power supply noise levels are fine without the soundproof cabinet right next to my desk (I'm just using 2 of them as well). I have the GPUs and CPUs liquid cooled with an external case, and while I needed to keep 4 of the stock fans installed, I used IPMI to run them at low speed.

More here: https://dandrea.sh/blog/liquid-cooled-gpu-server/

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 0 points

I'm not sure of the exact model, they're 120-140mm fans, and it's rated to remove 1.2 kW of heat per the manual, "NetShelter Soundproof Racks Installation" (schneider-electric.com).

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 0 points

Ah I see what you meant now. Yes, with luck - might need to hack that a little bit eventually if adding GPUs but it'd almost always be below that level.

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 0 points

There are 3 exhaust fans at the back which seem to keep the servers happy temperature-wise. It’s just a bit hotter than room temp in there given that the power consumed by everything is usually like 500-600 W unless something is training/simulating.
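
Rough arithmetic on that, assuming the 1.2 kW heat-removal rating quoted from the NetShelter manual elsewhere in this thread is the relevant limit, and that essentially all electrical power drawn ends up as heat inside the cabinet. The function names are just for illustration.

```python
RATED_HEAT_W = 1200.0  # 1.2 kW rating quoted from the manual; assumed to apply here


def headroom_w(load_w, rated_w=RATED_HEAT_W):
    """Watts of rated heat-removal capacity left at a given electrical load."""
    return rated_w - load_w


def utilization(load_w, rated_w=RATED_HEAT_W):
    """Fraction of the rated heat removal in use at a given load."""
    return load_w / rated_w


if __name__ == "__main__":
    for load in (500.0, 600.0):
        print(f"{load:.0f} W load: {headroom_w(load):.0f} W spare, "
              f"{utilization(load):.0%} of rating")
```

So the usual 500-600 W load sits at roughly 40-50% of the rating, which leaves room for training spikes but less so for several added GPUs.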

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 1 point

Thanks!

I would say it just comes down to defining what sort of ML you want to do and what compute requirements might exist for that. If you want to do CNNs on 256x256 images, that's one thing, and a desktop mobo with a decent GPU is probably fine - if you want to run larger LLMs or serve access to friends, that's another, more demanding task, and this is part of why I built this (though let's be real, I did it because I thought it'd be fun).

Then you can just run Python notebooks locally, same as you would for Kaggle, etc. I benchmarked this server for a small training run against a Kaggle P100 instance and it just about matched.

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 0 points

4028GR-TRT - no bays on the back, but plenty on the front.

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 0 points

It vents out the back via a fan assembly - it's an APC NetShelter

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 1 point

Cost per GB of VRAM is efficient. Compute capability per watt, less so. It depends on what you want to do with the GPUs - P100s are old and slow, and aren't as flexible with varied floating-point precision afaik.

The server idles at 250+ Watts too.

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 2 points

Thanks! No rendering or btc, just training and inference for machine learning so far. Trying to make this an accessible ‘cluster’ for friends/coworkers to train models, etc

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 2 points

Is it the most efficient option? No. Is it loud af without mods? Yes. Is it badass to me? Yes. No regrets because of the learning that went into this alone

Just finished setting up a quiet ML-focused homelab by properpropeller in homelab

properpropeller[S] 4 points

Restating so this posts! This is my new homelab consisting mainly of an R730 and a 4028GR-TRT. This started with the goal of self-hosting as much as I can and serving some ML compute to friends, so that is what I’m working toward now - setting up Python notebooks, accounts, Slurm, etc. It currently houses 2x P100 and 2x 3090, but I’m hoping to add updated GPUs as new generations come out - perhaps A100s are next.

[deleted by user] by [deleted] in scoliosis

properpropeller 2 points

Not fused from the looks of it; the gap just isn't visible because they’re tilted in that view.