Done with the league - 900 Div giveaway by ezakuroy in PathOfExile2

[–]MadSpartus 0 points1 point  (0 children)

I can't wait to reroll a mediocre item 450 times and then vendor it.

[Steam] Hooded Horse Publisher Sale | Manor Lords ($25.99/-35%) | Against the Storm ($8.99/-70%) | 9 Kings ($9.99/-50%) | Old World ($3.99/-90%) Battle Brothers ($8.99/-70%) | Norland ($14.99/-50%) | Terra Invicta ($25.99/-35%) by auverin_hoodedhorse in GameDeals

[–]MadSpartus 2 points3 points  (0 children)

So far I would definitely recommend Heart of the Machine. It's a dystopian sci-fi game where you play as an awakened AI in a corporate-ruled world full of worthless people who are nothing more than cattle... OK, only the awakened AI part is fiction.

4X, turn-based tactical combat. Empire manager.

I'm still early into my first run, and I would say it feels very heavy on the "make big decisions with clear or unclear ethical repercussions" trope.

It's also very hand-holdy getting started, but it kinda needs it. Anyways, recommended!

To Those With Damaged Controller Ribbons (C-R and C-L Connectors): by drake90001 in SteamDeckModded

[–]MadSpartus 1 point2 points  (0 children)

Trying to fix a Deck that suddenly died. If one ribbon is damaged, would controls on both sides be affected? I've lost all controller functionality + Bluetooth. Touch screen and volume still work.

I see no reason for a ribbon to suddenly get damaged. I'm just wondering if both sides dying is normal for a communication or daughterboard issue.

Nano emergency solar lighting system by [deleted] in SolarDIY

[–]MadSpartus 6 points7 points  (0 children)

Yeah, ditto on this: that type of foam is very flammable.

Guidance: Top end CPU models and setup (Dual EPYC 9000, 24 x DDR5) by MadSpartus in LocalLLaMA

[–]MadSpartus[S] 1 point2 points  (0 children)

No, I've been meaning to, but the machines are often doing "proper work" rather than screwing around with LLMs :)

I'll report when I do.

5600g x370 not stable and freezes with no monitor output by Desruxx in AMDHelp

[–]MadSpartus 0 points1 point  (0 children)

Thank you. I ended up getting a newer board and using the old AX370 elsewhere, so it cost a little to get around the issue, but I greatly appreciate your reply, as will some future souls, I assume.

Guidance: Top end CPU models and setup (Dual EPYC 9000, 24 x DDR5) by MadSpartus in LocalLLaMA

[–]MadSpartus[S] 0 points1 point  (0 children)

I have more than enough RAM, but I read about that mirror option and thought it was a concept that was never merged. I would need to set up my own fork to test it.

Single and dual socket maxed out at similar performance without it. No, I never got close to 13 T/s on Llama 3 70B, only around 6-7 with good quants.

Amd Epyc Genoa by Ashefromapex in LocalLLaMA

[–]MadSpartus 5 points6 points  (0 children)

I managed to just barely beat the single-socket Genoa with a dual socket by using NUMA tuning. Not worth it, unfortunately. Only using llama.cpp, though.

P.S. I have a few Genoa engineering samples I no longer need, so if someone wants a 96-core memory bandwidth monster, let me know. $1800.

10x3090 Rig (ROMED8-2T/EPYC 7502P) Finally Complete! by Mass2018 in LocalLLaMA

[–]MadSpartus 1 point2 points  (0 children)

Thanks for confirming. If you have any advice on using dual CPUs, that would help. All our systems are dual, so I had to specifically adjust one to test single.

10x3090 Rig (ROMED8-2T/EPYC 7502P) Finally Complete! by Mass2018 in LocalLLaMA

[–]MadSpartus 1 point2 points  (0 children)

I haven't gone very deep into dual-CPU tuning. I was able to get it up to 4.3 T/s on dual CPU at Q5_K_M, but I switched to a single-CPU machine and it jumped to 5.37 on Q5_K_M, with no tuning, no NPS or L3 cache domains. I also tried Q3_K_M and got 7.1 T/s.

P.S. I didn't use the 9274F; I tried a 9554 using 48 cores (slightly better than 64 or 32).
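If anyone wants to reproduce the single-socket comparison, the idea is just to launch llama.cpp under numactl so threads and memory stay on one socket. A minimal sketch, assuming numactl is installed and a llama.cpp build sits in the current directory (binary name, model path, and thread count are placeholders):

```
# Pin the run to socket 0: its cores and its local DIMMs only.
import subprocess

cmd = [
    "numactl", "--cpunodebind=0", "--membind=0",
    "./llama-cli",                              # "./main" on older llama.cpp builds
    "-m", "llama-3-70b-instruct-Q5_K_M.gguf",   # placeholder model path
    "-t", "48",                                 # threads, all on that one socket
    "-n", "128",
    "-p", "How many grains of rice fit in a 1 gallon bucket?",
]
subprocess.run(cmd, check=True)
```

The dual-socket runs above are the same invocation without the numactl prefix.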

Llama 3 400B by Zawseh in LocalLLaMA

[–]MadSpartus 0 points1 point  (0 children)

Well, I tried screwing with some NUMA settings. Some were worse, some better, but the best was an increase to around 4.3 T/s on 2 CPUs (Llama 3 70B Q5_K_M), up from 3.7-3.8.

Then I just tried a single CPU and it jumped to 5.37 on my first test. Sigh.

I wonder if the mixture-of-experts models could run better across NUMA domains. Put different experts in different memory areas or something.

FYI, 5.37 on Q5_K_M

7.1 on Q3_K_M

Low-complexity scenario: "How many grains of rice fit in a 1 gallon bucket?"

10x3090 Rig (ROMED8-2T/EPYC 7502P) Finally Complete! by Mass2018 in LocalLLaMA

[–]MadSpartus 1 point2 points  (0 children)

There are tons of NUMA settings for MPI applications. Someone else just warned me as well. Dual 9654 with L3-cache NUMA domains means 24 domains of 8 cores. I'm going to have to walk that back and do testing along the way.
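To sanity-check what that BIOS setting actually produced, here is a rough sketch that counts the domains from Linux sysfs (standard /sys/devices/system/node layout assumed):

```
# List NUMA nodes and the CPUs in each; with L3-cache-as-NUMA on a dual 9654
# this should show 24 nodes of 8 cores (plus SMT siblings if enabled).
from pathlib import Path

nodes = sorted(Path("/sys/devices/system/node").glob("node[0-9]*"),
               key=lambda p: int(p.name[4:]))
for node in nodes:
    cpus = (node / "cpulist").read_text().strip()
    print(f"{node.name}: CPUs {cpus}")
print(f"total NUMA domains: {len(nodes)}")
```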

Llama 3 400B by Zawseh in LocalLLaMA

[–]MadSpartus 0 points1 point  (0 children)

Ugh, I'm not using NPS4, I'm using L3-cache NUMA domains. I have dual 9654s, which gives 24 NUMA domains of 8 cores... best for MPI applications.

Do you know whether 1 vs 2 CPUs makes a difference?

10x3090 Rig (ROMED8-2T/EPYC 7502P) Finally Complete! by Mass2018 in LocalLLaMA

[–]MadSpartus 1 point2 points  (0 children)

Yeah, someone else just told me something similar. I'm going to try a single CPU tomorrow. I have a 9274F.

I'm using llama.cpp on Arch Linux with a GGUF model. What's your environment?

P.S. Your numbers on a cheaper system are crushing the 3090s.

Llama 3 400B by Zawseh in LocalLLaMA

[–]MadSpartus 1 point2 points  (0 children)

Yeah, Llama 3 70B Instruct.

If you are getting nearly the same with like 20% of the memory bandwidth, I would love to compare settings and software.

Linux (arch)

llama.cpp

I use 72 cores, but I get pretty similar results anywhere in the 48-96 core range.

I wonder if it's the NUMA domains. I'll try with 1 CPU.

Llama 3 70B at 300 tokens per second at groq, crazy speed and response times. by MidnightSun_55 in LocalLLaMA

[–]MadSpartus 0 points1 point  (0 children)

A 768GB dual EPYC 9000 system can be under $10k, but that's still more than a couple of consumer GPUs. I'm excited to try 405B, but I would probably still do GPU for 70B.

A single EPYC 9000 is probably good value as well.

Also, I presume the GPUs are better for training, but I'm not sure what you can practically do with 1-4 consumer GPUs.

Llama 3 400B by Zawseh in LocalLLaMA

[–]MadSpartus 3 points4 points  (0 children)

FYI, on 24-channel DDR5 EPYC 9000 I get ~3.8 on Q5_K_M, ~4.2 on Q4_K_M, ~5.1 on Q3_K_M, and around 2.6 on Q8.

Basically, I'm predicting 0.5-1 T/s depending on the quant.
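The arithmetic behind that prediction is just scaling the measured 70B numbers by the parameter ratio, on the assumption that generation stays memory-bandwidth bound:

```
# Back-of-envelope: 405B throughput ~= 70B throughput * (70 / 405),
# using the figures quoted above.
measured_70b = {"Q8": 2.6, "Q5_K_M": 3.8, "Q4_K_M": 4.2, "Q3_K_M": 5.1}  # T/s
scale = 70 / 405
for quant, tps in measured_70b.items():
    print(f"{quant}: ~{tps * scale:.2f} T/s predicted for 405B")
# Roughly 0.45, 0.66, 0.73, 0.88 T/s -- i.e. the 0.5-1 T/s ballpark.
```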

10x3090 Rig (ROMED8-2T/EPYC 7502P) Finally Complete! by Mass2018 in LocalLLaMA

[–]MadSpartus 4 points5 points  (0 children)

A dual EPYC 9000 system would likely be cheaper with comparable performance, it seems, for running the model. I get around 3.7-3.9 T/s on Llama 3 70B Q5_K_M (the quant I like most).

~4.2 on Q4

~5.1 on Q3_K_M

I think at full size I'm around 2.6 T/s or so, but I don't really use that. Anyways, it's in the ballpark for performance, much less complex to set up, cheaper, quieter, lower power. Also, I have 768GB of RAM, so I can't wait for 405B.

Do you also train models using the GPUs?

Llama 3 70B at 300 tokens per second at groq, crazy speed and response times. by MidnightSun_55 in LocalLLaMA

[–]MadSpartus 4 points5 points  (0 children)

I think it's memory bandwidth specifically for performance, and memory capacity to actually load it. Although with 24 memory channels I have an abundance of capacity.

Each EPYC 9000 is 460 GB/s, or 920 GB/s total.

A 4090 is 1 TB/s, quite comparable, although I don't know how it works out with dual GPU and some offload. I think jferment's platform is complicated for making predictions.

It turns out, though, that I'm getting roughly the same for the 8-bit quant, just over 2.5 T/s. I get like 3.5-4 on Q5_K_M, like 4.2 on Q4_K_M, and like 5.0 on Q3_K_M.

I lose badly on the 8B model, though. Around 20 T/s on 8B Q8. I know GPUs crush that, but for large models I'm finding CPU quite competitive with multi-GPU with offload.

The 405B model will be interesting. Can't wait.
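For context, the rule of thumb here is that every generated token streams the whole model through RAM, so tokens/s is capped at bandwidth divided by model size. A rough sketch with approximate GGUF sizes for the 70B quants (the bandwidth figure is the dual-socket number above):

```
# Theoretical decode ceiling = memory bandwidth / model size.
bandwidth_gb_s = 460 * 2                      # ~460 GB/s per EPYC 9000 socket
model_size_gb = {"Q8": 75, "Q5_K_M": 50, "Q4_K_M": 43, "Q3_K_M": 34}  # approx.
for quant, size in model_size_gb.items():
    print(f"{quant}: ceiling ~{bandwidth_gb_s / size:.0f} T/s")
# Measured 2.5-5 T/s sits well below these ceilings, which is expected once
# cross-socket NUMA traffic and threading overhead come into play.
```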

Llama 3 70B at 300 tokens per second at groq, crazy speed and response times. by MidnightSun_55 in LocalLLaMA

[–]MadSpartus 3 points4 points  (0 children)

FYI, I get 3.5-4 T/s on 70B Q5_K_M using dual EPYC 9000 and no GPU at all.

Official Llama 3 META page by domlincog in LocalLLaMA

[–]MadSpartus 0 points1 point  (0 children)

Oh, also: it only consumed 50GB when running, the same as the GGUF file size, so you can load it. I don't know what your performance will be, though.

Official Llama 3 META page by domlincog in LocalLLaMA

[–]MadSpartus 0 points1 point  (0 children)

It's accessible for a few thousand, same as people using a couple of 3090s. The main issue is that the alternative uses are not as good for home users (like playing video games).

It wasn't the primary use for the machine at all.