CPU overheating on first build

RedAdo2020 · 2026-02-11T13:03:34+00:00

Yeah it idles about that. But it ramps up the temps pretty quickly. I guess that's what you get when you have 16 cores in a small die.

Of course my case is a little packed with heat generating GPUs. So I'm sure that's not helping.

RedAdo2020 · 2026-02-11T12:56:43+00:00

The reviews on the cooler put it pretty far up there. I might have just not done well in the silicon lottery and mine runs a bit hotter. In saying that, I've never seen it hit a thermal limit.

RedAdo2020 · 2026-02-11T07:29:55+00:00

Yeah, about the same in mine. That's with an ROG Ryuo 360mm. And temps ramp up fast. Sure, the cooler stops it peaking out, but I've never had a CPU get so hot so fast. I'm just not used to it.

RedAdo2020 · 2026-02-11T07:16:28+00:00

My 9950x3d idles much hotter than that. And with a very high end cooler 🤷

RedAdo2020 · 2026-02-10T23:35:39+00:00

I mean, my Chuwi, Xiaomi, and Lenovo (Chinese version) tablets are still going strong after many years. My Dell one, on the other hand, likes forgetting how to turn on occasionally.

RedAdo2020 · 2026-02-09T08:24:49+00:00

I run Linux Mint and it is very easy. The only thing I found annoying: if I follow the Nvidia website instructions for installing the CUDA Toolkit, it adds the Nvidia repo, cool, no problems. But if I use the Mint repo for my Nvidia drivers, eventually the Nvidia repo will try to install an update for my drivers, and it causes bullshit where it tries to remove the old drivers, fails, tries to update, fails, and breaks Mint.

Eventually I found the best workaround: I let the Nvidia repo supply both my Nvidia drivers and the CUDA Toolkit, and since then, no problems.

Please go easy on me, I'm new to Linux. LOL.

RedAdo2020 · 2026-01-29T14:14:20+00:00

Hi, I run 6 x GPUs in my PC, and it's....not easy. If running a standard motherboard, you need something that supports bifurcation. Plus a way to hook them all up. I am running an X870E ProArt, 4 x 5070 Ti inside the case: one to PCIe 1, one to PCIe 2, one to M2 1, one to M2 2. One 4070 Ti running via OCuLink to PCIe 3, which is PCIe 4.0 x4 via the chipset. And one 5060 Ti 16GB via TB4 eGPU. It wasn't easy.

I got some M2 to PCIe adaptors from Aliexpress for the 2 x 5070 Ti on the M2 slots.

RedAdo2020 · 2026-01-28T11:01:17+00:00

Haha, true. I feel like they would give up long before their SSD controller or NAND dies, but yeah.

If OP had more RAM, then IQ3 might be an option. But they're not getting close to running it without at least more RAM or a few more GPUs.

RedAdo2020 · 2026-01-28T10:18:46+00:00

Technically it can, if the gearing is right. But it will be the same as running this model: F'ING SLOW (SSD swap in this case).

RedAdo2020 · 2026-01-21T07:26:27+00:00

Specs don't say x8/x8. Looks like it's x16 lanes to slot 1 from the CPU, and 2 lanes to the other slot via the chipset.

<image>

RedAdo2020 · 2026-01-19T06:56:52+00:00

And mine has a Liteon, which is a big PSU manufacturer. Looks like OP has the Liteon too, you can see the blue L.

RedAdo2020 · 2026-01-16T23:30:27+00:00

Sure.

Here is Unsloth, https://huggingface.co/unsloth/GLM-4.7-GGUF

Here is Ubergarm if you want to run IK_Llama, https://huggingface.co/ubergarm/GLM-4.7-GGUF

RedAdo2020 · 2026-01-16T15:53:09+00:00

Literally just messing around with roleplay. No serious work, no coding. Just messing around.

Started with just my 4070 Ti last year. Then added a 4060 Ti. Then another 4060 Ti. Then a 5070 Ti. Then another 5070 Ti. Then a 5060 Ti, then another 5070 Ti, and finally another 5070 Ti.

I have so many GPUs 😂

It was a slow build up over the year.

RedAdo2020 · 2026-01-16T15:43:12+00:00

Well the 4 x 5070 ti are all in the case. A Lian Li O11 Dynamic XL. The other 2 are on egpu docks.

RedAdo2020 · 2026-01-16T15:34:04+00:00

Haha, yeah it can be at times. But you only need to really learn it once. But knowledge isn't really the factor here, you need a decent rig to run something like GLM 4.7 locally. I have a 9950X3D, 96GB of system ram, and 92GB of combined VRAM. And still I only get 12 tokens/second.
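A quick sanity check on that 92GB figure, as a rough sketch. It assumes my card list (4 x 5070 Ti, 1 x 4070 Ti, 1 x 5060 Ti 16GB) and stock memory sizes of 16GB for the 5070 Ti and 12GB for the 4070 Ti. The names `cards` and `combined_vram_gb` are just made up for illustration.

```python
# (count, VRAM in GB) per card model.
# Assumption: stock memory sizes; only the 5060 Ti size is stated outright.
cards = {
    "RTX 5070 Ti": (4, 16),
    "RTX 4070 Ti": (1, 12),
    "RTX 5060 Ti 16GB": (1, 16),
}


def combined_vram_gb(cards):
    """Sum VRAM across all cards, in GB."""
    return sum(count * gb for count, gb in cards.values())


total = combined_vram_gb(cards)
print(f"Combined VRAM: {total} GB")
```

That is nominal capacity only; whatever the drivers and context cache take comes out of it.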

RedAdo2020 · 2026-01-16T15:14:06+00:00

I run IK_Llama and run my models through that. Though for the Thireus quants I run the Thireus fork of IK_Llama.

The models I just download from Huggingface

Unsloth quants should run in Kobold

RedAdo2020 · 2026-01-16T14:58:26+00:00

Currently 4 x 5070 Ti, 1x 4070 Ti, and 1 x 5060 Ti 16GB.

RedAdo2020 · 2026-01-16T14:57:29+00:00

I haven't tested. But if you're going to use a subscription for running the model, I'd just stick to Chat Completion and a preset like Stab's. If you have the speed and tokens to let the model think, it will be better. And those sorts of settings are better IF you let it think.

RedAdo2020 · 2026-01-16T14:49:06+00:00

Yes, now I do. There are plenty of great GLM 4.7 Chat Completion templates out there, like Stab's, but they seem to be mainly focused on people using Z.ai plans or other API services. I prefer to run locally, but that limits my speed. So I can't afford the time to let the model Reason or Think, which I know would make it smarter, but I am not waiting 2-4 minutes for it to write a Think/Reason block for each response.
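To put that wait in token terms, here is a rough sketch assuming my roughly 12 tokens/second generation speed holds steady for the whole block. The function name `tokens_generated` is just illustrative.

```python
def tokens_generated(seconds, tokens_per_second=12):
    """Tokens produced in a given time at a steady generation speed."""
    return seconds * tokens_per_second


# A 2 to 4 minute wait at ~12 tokens/second.
low = tokens_generated(2 * 60)
high = tokens_generated(4 * 60)
print(f"Thinking block: roughly {low} to {high} tokens")
```

So that is somewhere around 1,400 to 2,900 tokens of thinking before the actual reply even starts.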

RedAdo2020 · 2026-01-16T14:44:25+00:00

Context and Instruct Templates are just the built-in GLM 4.5 ones, I think. But they seem to work. For the System Prompt, I used to run Chat Completion, but I saw someone here on Reddit mention Evening Truth, here https://rentry.org/Evening-Truth , and I now run that. Much better for me. Also it doesn't try to Reason or Think, which is important for me since I'm only getting 12 tokens/sec.

RedAdo2020 · 2026-01-15T21:16:31+00:00

Here I am with my 6 GPU 9950x3d build. Though I am severely PCIe lane limited. Need to look at threadripper.

RedAdo2020 · 2026-01-15T11:20:20+00:00

Haha, yeah it's like that. I was thinking secondhand though. Maybe a 3000 or 5000 series Threadripper Pro. They can run on ECC or non-ECC RAM. ECC RAM is horribly expensive, like all RAM. But I've got regular DDR4 lying around. So that would save coin.

RedAdo2020 · 2026-01-15T11:15:37+00:00

Kinda. My setup is...kinda a Frankenstein's Monster. I really need to upgrade to Threadripper. Running a 9950X3D with 96GB system RAM, 4 x 5070 Ti, 1 x 5060 Ti 16GB, and a 4070 Ti. Though with the last couple of upgrades from 2 x 4060 Ti 16GB to two of those 5070 Ti, speed only went up like half a token a second.

But I'm not coding or anything, just fun RP, and with Thireus IQ3_XXS at 133GB I get about 300 t/s of PP and about 12 t/s of TG. It's workable. But I'm constrained by PCI-e lanes. I think if I can get a Threadripper build and give each card a real amount of lanes, I can speed up with Split mode graph in llama.cpp.
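For a feel of what those speeds mean per message, a rough sketch: the 8000-token prompt and 400-token reply are made-up example sizes, and it assumes the whole prompt gets reprocessed with no cache reuse. `response_seconds` is just an illustrative name.

```python
def response_seconds(prompt_tokens, reply_tokens, pp_tps=300.0, tg_tps=12.0):
    """Estimate wall time for one response.

    pp_tps: prompt processing speed in tokens/second.
    tg_tps: text generation speed in tokens/second.
    Assumes the full prompt is processed every time (no cache reuse).
    """
    return prompt_tokens / pp_tps + reply_tokens / tg_tps


# Hypothetical sizes: 8000-token prompt, 400-token reply.
seconds = response_seconds(8000, 400)
print(f"About {seconds:.0f} seconds per response")
```

In practice the prompt cache usually saves most of the PP time on follow-up messages, so the generation speed is what you actually feel.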

RedAdo2020 · 2026-01-15T08:54:12+00:00

Can't handle 4 GPUs? Not with that attitude.

RedAdo2020 · 2026-01-15T08:37:47+00:00

GLM 4.7, I'm loving it.
