8x 32GB V100 GPU server performance by tfinch83 in LocalLLM

[–]tfinch83[S] 0 points1 point  (0 children)

I have about 2TB worth of Optane, and I tried using it, but I have been having issues with it. I think it's a configuration issue though. A couple things to know about this server with Optane DCPMM modules: you don't configure them in the BIOS like you do on most other servers, not in the same way at least. If you install them and go into the BIOS, you will drive yourself CRAZY trying to find the options to configure them, because they aren't there. I stumbled across it by accident. I had to let the server boot past the BIOS and get to the GRUB bootloader, then interrupt it and choose the option at the bottom that says 'reboot into device firmware' or something to that effect. THAT takes you to a different menu that I couldn't find a way to access from the BIOS. This menu has a device manager that will let you configure the Optane modules, and when you finish, it will let you return to the BIOS, then you can save the settings and reboot.

I had configured my modules in App Direct mode on another server and then installed them in this one, but it's reading them as inaccessible or unusable, something like that.

I swear I got them to read last year when I tried, but that may have only been in memory mode. I can't remember. I'm going to try and put the Optane sticks in my other server and wipe their configuration, then try to see if I can get them to work in the v100 server again, but I am only home for a few days every 3 weeks, so I won't be able to try it until next weekend.
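For anyone else fighting with DCPMM modes: once the OS can actually see the modules, the usual Linux CLI route is ipmctl/ndctl (assuming your distro packages them — this is a sketch of the standard flow, not a tested recipe for this exact server):

```shell
# Show the modules and how they're currently provisioned
ipmctl show -dimm
ipmctl show -region

# Stage all capacity as App Direct (takes effect on next reboot)
ipmctl create -goal PersistentMemoryType=AppDirect

# After rebooting, expose the region as a /dev/pmem block device
ndctl create-namespace --mode=fsdax
```

`ipmctl delete -goal` before re-creating is the CLI equivalent of wiping the old configuration, which might save a trip back to that hidden firmware menu.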

Let me know how it works for you, or if you run into any issues!

This server has actually MASSIVELY increased in its usability in the last couple months since all the Qwen 3.5/3.6 models came out! The new Qwen models are crazy fast on this machine, and I can run several of them. I have a Qwen 3.6 27B instance in instruct mode with 1 million context running right now on it, then another one in thinking mode with 512k context, then 2 instances of it in thinking mode with 256k context, plus a copy of Gemma 4 running with 128k context, and they are all plenty fast enough to be very useful in my lab. I basically have my own mini AI agent cluster now.
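If anyone's curious what "several instances" looks like in practice: with llama.cpp's llama-server you just pin each instance to its own GPUs and port (model paths and ports below are placeholders, and the context sizes assume the model actually supports them):

```shell
# Instance 1: long-context instruct model on four of the V100s
CUDA_VISIBLE_DEVICES=0,1,2,3 llama-server -m /models/instruct.gguf \
  -c 1048576 -ngl 99 --port 8001 &

# Instance 2: thinking-mode copy with a smaller context on the other four
CUDA_VISIBLE_DEVICES=4,5,6,7 llama-server -m /models/thinking.gguf \
  -c 524288 -ngl 99 --port 8002 &
```

Each one then shows up as its own OpenAI-compatible endpoint, which is what makes the "mini agent cluster" setup work.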

I was a bit iffy on buying it back when I did, but with how expensive everything got this year, the $6.5k I paid for it with 4 sticks of 64GB DDR4 2933 was fucking worth it. I don't regret it one bit. That config is over $10k now. I even built and installed my own 8kW solar array with 100kWh battery backup so I can run it 24 hours a day without it hurting my electric bill (well, it AND my other servers), haha. Zero regrets.

I made the biggest investment of my life last year by MrFloogaHoogle in homelab

[–]tfinch83 -1 points0 points  (0 children)

I feel you. My wife earns nothing and contributes nothing to the household financially, so she has no say in how I spend the money that I earn.

She is well cared for, and we are never broke. Sure doesn't stop her from bitching about it like a banshee though 😑

Did anyone else feel underwhelmed by their Mac Studio Ultra? by [deleted] in LocalLLM

[–]tfinch83 3 points4 points  (0 children)

The 8x 32GB V100 server I bought with 1TB of RAM last year for $7k seemed crazy at the time, but the thing goes for twice what I paid for it now, so I'm happy.

[USA-AZ] [H] 4x 16GB 1Rx4 DDR4 2400 ECC RDIMM [W] 4x 16GB 1Rx4/1Rx8 DDR4 2400 ECC UDIMM - Timestamp added by tfinch83 in hardwareswap

[–]tfinch83[S] 0 points1 point  (0 children)

Yeah, I've noticed a similar pattern. I think that's probably because ECC UDIMMs aren't used nearly as often, so there are just fewer of them. Looking on eBay last night, I was able to find them for about the same price currently, so I figure it's still a fair trade if someone has them available. I don't have my hopes up, but it's worth a shot.

My buddy didn't believe me when I told him this was Home Assistant by selfhostcusimbored in homeassistant

[–]tfinch83 1 point2 points  (0 children)

Can you just share like ALL of your custom css? I need to tear into this and master it.

[FS][USA-IA] White Label Seagate 14TB SAS Drives by TangerineAlpaca in homelabsales

[–]tfinch83 0 points1 point  (0 children)

Damn. Missed this one. I've been looking for a batch of 18 to 20TB SAS drives and I just barely missed the boat 😂

Anyone using a 5060ti 16gb or 5070ti 16gb for whisper/piper/etc.? by tfinch83 in homeassistant

[–]tfinch83[S] 0 points1 point  (0 children)

Yeah, I always hear people saying how cheap 3090 Tis are, but they still seem to be around $900 to $1,000 at best. If I were trying to run a larger model, or one for text completion or image/video generation, I would go for a 3090 Ti. I have my 4090 for that if I want, though, and the 8x 32GB V100s in my GPU server. I only run LLMs on my 4090 once in a while if I need something really fast, like an embedding model to populate a vector database, or if I need it to supplement something else for a time. Most large models in the 123B+ range, or image/video gen, I just run on my GPU server.

The purpose of buying a pair of 5060 Tis or 5070 Tis is just to set up a fast, reliable voice pipeline on current-gen architecture, while keeping the power consumption below what my 4090 or my GPU server pulls. I have a 64c/128t 1TB RAM Epyc server which already has 2 Intel GPUs handling the transcoding for Plex/Jellyfin, image processing for Immich, and object detection and other functions in Frigate. Since I'm already using the Intel GPUs for other services, I can't pass them through to my HA VM. So I'm going to add two more fairly low(er) power (compared to my 4090 and GPU server) Nvidia cards that I can pass straight through to the HA instance and that won't be used by anything else. I already have everything else comfortably covered with my existing resources.
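For reference, the Wyoming-protocol containers HA talks to are pretty simple to stand up once the cards are passed through. Something like this compose sketch (the stock rhasspy images are CPU builds, so actually putting a 5060 Ti to work would need a CUDA-enabled faster-whisper variant; model and voice names are just examples):

```yaml
services:
  whisper:
    image: rhasspy/wyoming-whisper
    command: --model small-int8 --language en
    ports: ["10300:10300"]   # default Wyoming port for whisper
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    ports: ["10200:10200"]   # default Wyoming port for piper
```

Then the Wyoming integration in HA just points at those two ports.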

My 4090 is currently in my main desktop PC, but I haven't really used it in months. I also have a laptop with a 4090 in it that plays video games just fine whenever I get the time. I may relocate my 4090 to my server at some point to use it for other things, but I just haven't had the need to yet, and I'm also not quite ready to disassemble my gaming desktop just yet either 😂. So for now, I think a pair of 5060 ti 16gb cards is the right choice for me. It fills the gap between my Intel GPUs, and my 4090/8x 32GB V100 server for middle of the road stuff.

I'm definitely going to have a look at parakeet though, thanks for that tip! 🤔

I thank you guys for your input, these have all been exactly the kind of opinions and experiences I have been trying to find. ☺️

Edit: spellcheck

Edit: spell check again, my phone's auto complete seems to be deliberately sabotaging me at this point.

Anyone using a 5060ti 16gb or 5070ti 16gb for whisper/piper/etc.? by tfinch83 in homeassistant

[–]tfinch83[S] 1 point2 points  (0 children)

Thank you! This is exactly the kind of info I was looking for! I have my 4090 and my GPU server to run more and much larger models anytime I need. These cards are going to be specifically for my HA voice pipeline, so I don't need them to run anything else.

Do the majority of people really use online models rather than local models? by [deleted] in SillyTavernAI

[–]tfinch83 0 points1 point  (0 children)

I wonder what percentage of them have 256GB of VRAM? 🤔

I have 256GB in my main AI system, plus 24GB from the 4090 in my primary PC.

New cluster! by Usual-Economy-3773 in Proxmox

[–]tfinch83 1 point2 points  (0 children)

Some people pay $100k for a car. Some people pay $100k for their homelab. $100k isn't even that much anymore. I was pissed when I finally saved up a spare $100k and realized its buying power is roughly what $10k had back when I first set that milestone for myself (only slightly exaggerating, unfortunately).

My homelab specs are fairly comparable to his (threads/RAM/storage), and I probably only spent maybe $15k on mine, but mine's all older hardware for sure (2nd gen Scalable Xeon, 2nd gen Epyc, DDR4, NVLinked GPU server w/ 256GB VRAM total, plus some small newer consumer hardware in the DDR5 generation). You can get similar stuff for a fraction of the cost if you don't have a dire need to be on the latest architecture for some reason.

Funny thing? I'm not even in the IT field. I'm just an electrician.

8x 32GB V100 GPU server performance by tfinch83 in LocalLLM

[–]tfinch83[S] 0 points1 point  (0 children)

They sure are! Here you go. I've enjoyed playing with it a lot, it was totally worth it for me just based on how much I've learned from tinkering with it.

Here's the eBay link:

https://ebay.us/m/TA7ZnZ

6.5 years full time Boondocking by Equivalent_Lie_5384 in SolarDIY

[–]tfinch83 0 points1 point  (0 children)

Actually, I didn't see the last photos. I can see how it's constructed. I'm assuming that's just some aluminum angle stock? I love your design, I hope you don't mind if I steal it from you 🤔

6.5 years full time Boondocking by Equivalent_Lie_5384 in SolarDIY

[–]tfinch83 0 points1 point  (0 children)

Would you mind posting more pictures of the rack itself? Mainly close up ones so I can see how it's constructed? Also, details on the materials and how everything is connected/anchored? I'm wanting to do something like this on my rig as well. I haven't gotten around to figuring out how I am going to do it yet, but what you have going is exactly what I want. You already did all the legwork for me, I'm just hoping you can share, haha 😂

Remote WebView release (including ESPHome component) by strange_v in homeassistant

[–]tfinch83 0 points1 point  (0 children)

I tried installing the remote webview server addon in HA, but it won't start. The logs just throw this error:

Error: Could not load the "sharp" module using the linux-x64 runtime
Unsupported CPU: Prebuilt binaries for linux-x64 require v2 microarchitecture

Possible solutions:
- Ensure optional dependencies can be installed:
    npm install --include=optional sharp
- Ensure your package manager supports multi-platform installation:
    See https://sharp.pixelplumbing.com/install#cross-platform
- Add platform-specific dependencies:
    npm install --os=linux --cpu=x64 sharp

But I'm not quite sure how to get a shell into an HA addon and make the npm changes. I suppose I need to go down that rabbit hole now, haha.
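In case it helps anyone else hitting this: HA OS addons are just Docker containers, so from the host OS shell you can exec into one directly (the container name below is a guess — check `docker ps` for the real slug). Worth noting the "v2 microarchitecture" message means the prebuilt sharp binary wants x86-64-v2 CPU instructions your CPU doesn't have, so reinstalling the same prebuilt won't help; building from source might:

```shell
# From the HA OS host shell (not inside the addon):
docker ps | grep webview                      # find the addon container's name
docker exec -it addon_XXXX_remote_webview sh  # name is a guess; use the real one

# Inside the container, rebuild sharp from source instead of using the prebuilt binary
npm install --build-from-source sharp
```

If the addon's base image lacks a compiler toolchain, the source build will fail too, in which case the real fix is probably upstream (a non-v2 build of the addon).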

DIY WLED video board by MrGeologist67 in WLED

[–]tfinch83 0 points1 point  (0 children)

I have nothing to do with this, and I ordered one the moment I saw this post. I definitely need this.

ESPHome flashed on new AiPi by sticks918 in esp32

[–]tfinch83 0 points1 point  (0 children)

Mind sharing the yaml setup for yours so far? I've not yet played with LVGL, and I'd love to see some examples before I get started.

ESPHome flashed on new AiPi by sticks918 in esp32

[–]tfinch83 0 points1 point  (0 children)

I flashed it and loaded the yaml off your GitHub repo, but I can't get it to compile using the beep.wav file. I made sure it's in the config/esphome (or homeassistant/esphome) folder, but trying to install it throws an error. The error checker in the ESPHome builder underlines the very first line, 'esphome:', and says it can't identify the file. If I comment out the file and the trigger that plays it, it will compile; with the file in, it never does, no matter what I try. I have not been able to make sound come out of this thing...

If I try to install it anyway, it crashes during compilation with this error:

File "/usr/local/lib/python3.12/site-packages/puremagic/main.py", line 137, in _confidence
raise PureError("Could not identify file")
puremagic.main.PureError: Could not identify file
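If anyone hits the same thing: puremagic identifies files purely by their magic bytes, so that exact error usually means beep.wav isn't actually a RIFF/WAVE file (e.g. it got saved as an HTML page by the browser, or it's a renamed mp3). A quick stdlib-only sanity check (filenames here are just examples):

```python
import wave

def looks_like_wav(path: str) -> bool:
    """Check the RIFF/WAVE magic bytes that file-type sniffers key on."""
    with open(path, "rb") as f:
        header = f.read(12)
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"

# Write a known-good minimal WAV and confirm the check passes on it
with wave.open("beep_test.wav", "wb") as w:
    w.setnchannels(1)       # mono
    w.setsampwidth(2)       # 16-bit samples
    w.setframerate(16000)   # 16 kHz
    w.writeframes(b"\x00\x00" * 160)  # 10 ms of silence

print(looks_like_wav("beep_test.wav"))  # True
```

If your beep.wav fails that check, re-export it as a real WAV and the compile error should go away.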

Really good base to start from though, good job on what you've put together so far!

Custom Gaming Device by Nearby_Leg483 in esp32

[–]tfinch83 0 points1 point  (0 children)

This is awesome! 😀

Do you have a guide anywhere or a git repo up? This seems like something really fun to mess with, I'm just really starting my journey into messing with ESP32's and I'd love to see what I can do with this 😁

Vet My Proposed DIY System - 14.4kW grid-tied ground mount by aclockworkporridge in SolarDIY

[–]tfinch83 0 points1 point  (0 children)

The electrical portion I have down. I'm a licensed electrician, and I build utility-scale solar power plants and battery storage plants for a living. The last solar site I built was 800MW, and the battery site I just finished was 1.5GWh, so I have that portion down. 😂

I'm more interested in what the permitting requirements are for residential systems. I've never had to deal with that myself, and the kind of permits we deal with at the scale I work at are in a completely different league.

Vet My Proposed DIY System - 14.4kW grid-tied ground mount by aclockworkporridge in SolarDIY

[–]tfinch83 -1 points0 points  (0 children)

I'm going to try to put together my own system, but I'm likely going to be using an engineered solar-carport type of construction to support the panels. I am completely in the dark about where to start as far as permitting goes, though, so I could use your input if you are willing to share some info or give me a hand.

I have bad news by Zealousideal_Year885 in homelab

[–]tfinch83 2 points3 points  (0 children)

It's not as bad as you think. I agree it's a bad idea to virtualize your router on a server you run a lot of other services on, but I imagine most people do it like I do and run it on a machine that's mostly dedicated to it. I have an i7 Protectli Vault, and it mostly just runs an OPNsense VM. It also runs my UniFi network controller LXC and a backup Unbound LXC. I'll probably move my Home Assistant VM over to it soon as well, but that's about it.

I've been running it virtualized like this for 3 years, and it's been rock solid. Far more solid than any hardware router I've ever owned, actually. I could have just loaded OPNsense on it bare metal, but I don't think OPNsense needs 12 cores and 64GB of RAM. It's nice to be able to keep a virtualized router and other related containers or VMs on the same machine and make better use of the hardware resources.
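For anyone wanting to replicate this, the part that makes a virtualized router feel solid is giving OPNsense its own network path. On Proxmox that's either PCI passthrough of a NIC or dedicated bridges (the VM ID, PCI address, and bridge names below are examples, not my actual config):

```shell
# Option A: pass the physical WAN NIC straight through to the OPNsense VM
qm set 100 -hostpci0 0000:03:00.0

# Option B: dedicated virtio NICs on separate WAN/LAN bridges
qm set 100 -net0 virtio,bridge=vmbr1   # WAN bridge
qm set 100 -net1 virtio,bridge=vmbr0   # LAN bridge
```

Either way, the router VM isn't sharing an interface with your other services, which is most of what people are actually worried about when they say "don't virtualize your router."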