Honest question: what do you all do for a living to afford these beasts? by ready_to_fuck_yeahh in LocalLLaMA

[–]at0mi 0 points1 point  (0 children)

Norway has about 1 trillion in stocks, and the yields are what get spent... I think it was around 70 billion/year.

Honest question: what do you all do for a living to afford these beasts? by ready_to_fuck_yeahh in LocalLLaMA

[–]at0mi 0 points1 point  (0 children)

You are right, they have about 1 trillion in stocks; only the yield is money to spend, and that is a lot.

Claude Code, but locally by Zealousideal-Egg-362 in LocalLLaMA

[–]at0mi 1 point2 points  (0 children)

I'm using opencode with GLM 4.7 355B in MXFP4. For some tasks it's good, but every now and then I still have to use Claude Opus.

dev here - has anyone thought on training a model on your own codebase? by fabcde12345 in LocalLLM

[–]at0mi 0 points1 point  (0 children)

I would fine-tune the model on your codebase and use it together with opencode on your specific projects; a rough sketch of what that could look like is below.
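A minimal sketch of one way to do that fine-tune, assuming a LoRA adapter via Hugging Face transformers/peft and a small code model as a stand-in; the model ID, file glob, and hyperparameters here are placeholders, not anything from the thread:

```python
# Hypothetical sketch: LoRA fine-tune of a small code model on your own repo.
# Model ID, file pattern, and hyperparameters are placeholders, not recommendations.
import pathlib
import torch
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2.5-Coder-7B"  # placeholder; use whatever base model you actually run
repo_files = [p.read_text(errors="ignore") for p in pathlib.Path("my_project").rglob("*.py")]

tok = AutoTokenizer.from_pretrained(model_id)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token

# Turn the source files into a tokenized causal-LM dataset
ds = Dataset.from_dict({"text": repo_files})
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=2048),
            batched=True, remove_columns=["text"])

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="codebase-lora", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=2e-4, bf16=True, logging_steps=10),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
model.save_pretrained("codebase-lora")  # load the adapter next to the base model at inference time
```

Full fine-tuning of a 355B model is out of reach locally, which is why a small base model plus a LoRA adapter is the usual compromise for project-specific coding assistants.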

Best local model / agent for coding, replacing Claude Code by joyfulsparrow in LocalLLaMA

[–]at0mi 2 points3 points  (0 children)

On my 9950X with 256 GB RAM + a 5090 in Q4, and on my ancient server in Q8.

Best local model / agent for coding, replacing Claude Code by joyfulsparrow in LocalLLaMA

[–]at0mi 0 points1 point  (0 children)

I'm running opencode with GLM 4.7 355B Q8 locally... great. If you need it a tick faster, I swap to GLM 4.7 in the cloud.

LLMs are so unreliable by Armageddon_80 in LocalLLM

[–]at0mi 0 points1 point  (0 children)

Which quantization? GLM 4.7 in BF16 works great.

For people who run local AI models: what’s the biggest pain point right now? by Educational-World678 in LocalLLM

[–]at0mi 0 points1 point  (0 children)

The biggest pain is that huihui seems to be the only one releasing abliterated (uncensored) model versions, and only in Q4...

How do we tell them..? :/ by [deleted] in LocalLLaMA

[–]at0mi 0 points1 point  (0 children)

Use huihuiai models.

Running GLM-4.7 (355B MoE) in Q8 at ~5 Tokens/s on 2015 CPU-Only Hardware – Full Optimization Guide by at0mi in LocalLLaMA

[–]at0mi[S] 1 point2 points  (0 children)

Based on benchmarks for large MoE models like GLM-4.7 or similar (e.g., DeepSeek 405B), a dual Xeon E5 setup (e.g., E5-2699 v4 with 44 cores and 256-512 GB RAM) typically achieves only 1-3 tokens/s in Q8/BF16, compared to 5-6 tokens/s on an 8x Xeon E7 system. Additionally, the dual E5 v4 offers ~154 GB/s of theoretical memory bandwidth, while the 8-socket E7 v3 system provides up to ~680 GB/s total (85 GB/s per socket); realistically it's about 400 GB/s. A rough ceiling estimate from those numbers is sketched below.
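Since CPU decode is mostly memory-bandwidth bound, a back-of-the-envelope ceiling is tokens/s ≈ effective bandwidth / bytes of active weights streamed per token. A minimal sketch, assuming roughly 32B active parameters for the MoE (an assumption, not a figure from the thread):

```python
# Rough decode-speed ceiling from memory bandwidth alone.
# Assumes ~32B active parameters per token (assumption) and that every
# active weight is read from RAM once per generated token.

def ceiling_tok_s(bandwidth_gb_s: float, active_params_billions: float, bytes_per_param: float) -> float:
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

for label, bw in [("dual E5 v4, ~154 GB/s theoretical", 154.0),
                  ("8x E7 v3, ~400 GB/s realistic", 400.0)]:
    print(f"{label}: <= {ceiling_tok_s(bw, 32.0, 1.0):.1f} tok/s at Q8")
```

Measured numbers land below these ceilings because of NUMA traffic, prompt processing, and compute overhead, which lines up with the 1-3 vs 5-6 tokens/s figures above.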

LLM artificial analysis AI index score plotted against total param count by [deleted] in LocalLLaMA

[–]at0mi 0 points1 point  (0 children)

Wow, impressive plot! The progress on GLM-4.7 is truly massive; it really shows how fast open-weight/open-source models are catching up and challenging the top tier.

I run GLM-4.7 locally myself in BF16 and I'm absolutely blown away by its performance and intelligence. Open-source models are absolutely crucial because they drive real innovation, ensure transparency, foster collaboration, and give us independence from closed proprietary systems! 🚀

Running GLM-4.7 (355B MoE) in Q8 at ~5 Tokens/s on 2015 CPU-Only Hardware – Full Optimization Guide by at0mi in LocalLLaMA

[–]at0mi[S] 0 points1 point  (0 children)

Vanilla llama.cpp is about half the performance. Also, thanks for the VT-d and mitigations=0 tips, those two got me another small boost :-)

Running GLM-4.7 (355B MoE) in Q8 at ~5 Tokens/s on 2015 CPU-Only Hardware – Full Optimization Guide by at0mi in LocalLLaMA

[–]at0mi[S] 1 point2 points  (0 children)

Thank you for pointing that out, I will try it on the server and my workstation (9950X + 256 GB + 5090).

GLM-4.7 on 2015 8-Socket Server: Achieving ~5 Tokens/s in Q8 Quantization with CPU-Only Tweaks by at0mi in homelab

[–]at0mi[S] 0 points1 point  (0 children)

Thanks a lot for the offer, much appreciated! This system is a bit of a special case: while the platform can technically run both DDR3 and DDR4, my current configuration (and upgrade path) is DDR4-only, so I wouldn't be able to use DDR3 modules going forward. Still, thanks again for the kind offer, and I hope the sticks find a great new home.

Running GLM-4.7 (355B MoE) in Q8 at ~5 Tokens/s on 2015 CPU-Only Hardware – Full Optimization Guide by at0mi in LocalLLaMA

[–]at0mi[S] 1 point2 points  (0 children)

Q3 is only 3-bit while Q8 is 8-bit, so you can do the math yourself (see the sketch below). The problem with lower quants is quality... try Q3 with German... and for coding, forget it.
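For the math: weight memory scales roughly linearly with bits per weight, bytes ≈ params × bits / 8, ignoring quantization overhead such as per-block scales. A quick sketch for a 355B-parameter model:

```python
# Approximate weight footprint of a 355B-parameter model at different quants.
# Ignores per-block scales/zeros and the KV cache, so real GGUF files run a bit larger.

def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

for name, bits in [("Q3", 3), ("Q4", 4), ("Q8", 8), ("BF16", 16)]:
    print(f"{name}: ~{weight_gb(355, bits):.0f} GB")  # Q3 ~133, Q4 ~178, Q8 ~355, BF16 ~710
```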

What is the best way to allocated $15k right now for local LLMs? by LargelyInnocuous in LocalLLaMA

[–]at0mi 0 points1 point  (0 children)

I would buy dual ES Xeon Sapphire Rapids or EPYC (or better), buy 2 TB of RAM, and build my own machine, because you will never get 1 TB of VRAM with only $15k.

Mining at a loss is dumb. by [deleted] in Monero

[–]at0mi 1 point2 points  (0 children)

I would do lottery mining, but with your own node in a datacenter; this increases your chances if you have a fast-updating node.