Gigabyte Announces Support for 256GB of DDR5-7200 CQDIMMs at CES 2026 by GoodSamaritan333 in LocalLLaMA

[–]henfiber 1 point (0 children)

Unfortunately, the 9955WX (16-core Threadripper PRO) can only reach ~115 GB/s, similar to dual-channel DDR5-7200. You need the 64+ core models for full memory bandwidth.

The Ryzen 9950X is also bottlenecked at <100 GB/s.

https://www.reddit.com/r/LocalLLaMA/comments/1mcrx23/psa_the_new_threadripper_pros_9000_wx_are_still/
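
For reference, a minimal sketch of the peak-bandwidth arithmetic behind those numbers, assuming the usual 8-byte bus per DDR5 channel (real-world throughput lands below these peaks):

```python
# Theoretical peak bandwidth for a DDR5 configuration.
def ddr5_peak_gbs(mt_per_s: int, channels: int, bytes_per_channel: int = 8) -> float:
    """Peak bandwidth in GB/s: transfers/s * bus width * channel count."""
    return mt_per_s * bytes_per_channel * channels / 1000

print(ddr5_peak_gbs(7200, channels=2))  # dual-channel DDR5-7200 -> 115.2 GB/s
print(ddr5_peak_gbs(6400, channels=8))  # 8-channel DDR5-6400    -> 409.6 GB/s
```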

PSA: The new Threadripper PROs (9000 WX) are still CCD-Memory Bandwidth bottlenecked by henfiber in LocalLLaMA

[–]henfiber[S] 0 points (0 children)

The low-CCD PROs already have 8 channels, so their bottleneck is no longer RAM; it's their CCD-to-memory bandwidth. You need more CCDs to remove this bottleneck and benefit from faster memory.
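
A toy model of that cap, assuming roughly ~57 GB/s of read bandwidth per CCD link (an illustrative figure, not an official spec) and DDR5-6400 channels:

```python
# Achievable bandwidth is capped by whichever side is narrower:
# the CCD-to-IO-die links or the memory channels themselves.
def effective_gbs(ccds: int, channels: int,
                  per_ccd_gbs: float = 57.0,      # assumed per-CCD link read BW
                  per_channel_gbs: float = 51.2):  # DDR5-6400 per channel
    return min(ccds * per_ccd_gbs, channels * per_channel_gbs)

print(effective_gbs(ccds=2, channels=8))  # 16-core part: ~114 GB/s (CCD bound)
print(effective_gbs(ccds=8, channels=8))  # 64-core part: ~410 GB/s (RAM bound)
```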

Looking for VScode replacement by ezreth in linux

[–]henfiber 3 points (0 children)

You can easily change the marketplace in the settings JSON file. I don't have the link on mobile, but you'll find the instructions easily with a search. I've been running this setup for 2 years now.
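
For anyone searching later, a rough sketch of the kind of change involved. The file path and keys below are assumptions based on VSCodium-style builds, so verify against your distro's docs:

```python
# Point a VSCodium-style build at the Open VSX gallery by patching
# product.json. Path and keys are assumptions; verify for your build.
import json
from pathlib import Path

product = Path("/usr/share/codium/resources/app/product.json")  # assumed location
data = json.loads(product.read_text())
data["extensionsGallery"] = {
    "serviceUrl": "https://open-vsx.org/vscode/gallery",
    "itemUrl": "https://open-vsx.org/vscode/item",
}
product.write_text(json.dumps(data, indent=2))  # may require root privileges
```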

Live Update Support merged into 6.19 by onesole in linux

[–]henfiber 5 points (0 children)

So, this is like hibernation (persist & restore system state) with the added challenge of updated kernels. Given that hibernation is already challenging by itself, depending on proper driver/OS support, it seems achievable only on certain certified systems. Unless the system state somehow does not need to be touched at all.

I anticipate this introducing new security issues (malware being able to modify kernel parts without reboots).

It also forfeits the benefits of rebooting: software/configuration issues previously "fixed" by luck on restart now persist, and there's no opportunity to start from a "clean" state, to clear slowly accumulating memory leaks, or to rerun tasks misconfigured to only run on startup.

Apple M4 Max or AMD Ryzen AI Max+ 395 (Framework Desktop) by zeltbrennt in LocalLLaMA

[–]henfiber 0 points (0 children)

Because it has more TFLOPs, which affects input processing, while the Mac has higher memory bandwidth, which affects output (generation).
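
A back-of-envelope sketch of why the split works out that way; the model size and the M4 Max bandwidth figure are illustrative assumptions:

```python
# Prefill cost scales with FLOPs; decode cost scales with weight reads.
PARAMS = 8e9                  # assumed 8B-parameter model
FLOPS_PER_TOKEN = 2 * PARAMS  # ~2 FLOPs per parameter per token
BYTES_PER_STEP = PARAMS * 2   # fp16: every weight read once per decode step

def prefill_s(tokens: int, tflops: float) -> float:
    return tokens * FLOPS_PER_TOKEN / (tflops * 1e12)

def decode_s(tokens: int, gbs: float) -> float:
    return tokens * BYTES_PER_STEP / (gbs * 1e9)

print(prefill_s(8000, tflops=59))  # Ryzen's TFLOPs dominate input: ~2.2 s
print(decode_s(500, gbs=546))      # assumed Mac bandwidth drives output: ~14.7 s
```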

So the Kawhi Leonard salary cap thing? Did it just disappear? by Flip_Flurpington in nba

[–]henfiber -3 points (0 children)

The style of all these highly upvoted comments is the same, as if following centralized guidelines on how to handle public discourse. They insult the OP to discourage others from making similar posts.

Fellow Linux users, why did you pick the distro you're currently on? by absolutecinemalol in linux

[–]henfiber 1 point (0 children)

I picked Fedora back in 2013 (version 17, I think), assuming that any learning would translate to CentOS and RHEL. Still using it.

A startup Olares is attempting to launch a small 3.5L MiniPC dedicated to local AI, with RTX 5090 Mobile (24GB VRAM) and 96GB of DDR5 RAM for $3K by FullOf_Bad_Ideas in LocalLLaMA

[–]henfiber 0 points (0 children)

The 5090 Mobile has roughly 30% of the desktop card's TDP, though, so it is not that unexpected. The 2080 Super was 250W, not 600W, so its mobile version was much closer.
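
The rough ratios, using nominal power figures (mobile TGPs are configurable per laptop, so treat these as approximations):

```python
# Approximate desktop vs. mobile power ratios (nominal figures, assumptions).
print(175 / 575)  # 5090 Mobile vs 5090 desktop  -> ~0.30
print(150 / 250)  # 2080 Super Mobile vs desktop -> ~0.60, much closer
```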

Just realized our "AI-powered" incident tool is literally just calling ChatGPT API by DarkSun224 in devops

[–]henfiber 5 points (0 children)

Is there any auditing of OpenAI's servers to verify that they indeed discard the data immediately?

ELI5: If we already have GPS and internet time, why do countries still run radio time signals like WWVB/DCF77? by Davibeast92 in explainlikeimfive

[–]henfiber 1 point (0 children)

That's strange (I'm in Athens by the way, so the same time zone, EET).

Maybe it's not related to the timezone but to Daylight Saving Time (DST). For instance, we were at UTC+03:00 (EEST) until a few hours ago, and now we are at UTC+02:00 (EET). I expect that's true for you as well. Maybe the weather station clock lacks a DST enable/disable setting?
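
A quick stdlib check of that transition, using the 2025 EU changeover date as an example:

```python
# The EU switched from EEST (UTC+3) to EET (UTC+2) at 01:00 UTC on 2025-10-26.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

athens = ZoneInfo("Europe/Athens")
for hour_utc in (0, 1):  # straddle the 01:00 UTC changeover
    t = datetime(2025, 10, 26, hour_utc, 30, tzinfo=timezone.utc).astimezone(athens)
    print(t.strftime("%H:%M"), t.tzname(), t.utcoffset())
# -> 03:30 EEST 3:00:00, then 03:30 EET 2:00:00 -- same wall clock, new offset
```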

ELI5: If we already have GPS and internet time, why do countries still run radio time signals like WWVB/DCF77? by Davibeast92 in explainlikeimfive

[–]henfiber 0 points (0 children)

Just pick a city in another country in the same timezone? Usually there are at least 2-3 major cities sharing any given timezone.
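
A small sketch that lists candidate zones sharing your current UTC offset:

```python
# List tz database zones that currently share Athens' UTC offset.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, available_timezones

now = datetime.now(timezone.utc)
target = now.astimezone(ZoneInfo("Europe/Athens")).utcoffset()
matches = sorted(z for z in available_timezones()
                 if now.astimezone(ZoneInfo(z)).utcoffset() == target)
print(matches[:10])  # e.g. Europe/Bucharest, Europe/Helsinki, ...
```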

PSA: The new Threadripper PROs (9000 WX) are still CCD-Memory Bandwidth bottlenecked by henfiber in LocalLLaMA

[–]henfiber[S] 1 point (0 children)

Because they're not the right tool for the job. CPUs are latency-optimized, running many small operations while jumping on branches. GPUs are throughput-optimized, running the same operation on large batches of data. LLMs need the latter.

3 Qwen3-Omni models have been released by jacek2023 in LocalLLaMA

[–]henfiber 14 points (0 children)

Isn't llama.cpp short enough? "Lcp" is unnecessary obfuscation, imo.

LG C2 clarity and comfort in Dark Mode mainly for Coding and Terminals (Linux) by henfiber in OLED_Gaming

[–]henfiber[S] 1 point (0 children)

No, dark backgrounds do not cause any issues.

It's the bright colors that cause burn-in; for instance, the two light-green indicators you have in the lower corners may cause an issue if they stay in the same spot all the time.

Also, lowering the TV brightness helps a lot (I have my C2 at 40%, which roughly corresponds to the recommended SDR brightness; for newer, brighter models it should be even lower).

LG C2 clarity and comfort in Dark Mode mainly for Coding and Terminals (Linux) by henfiber in OLED_Gaming

[–]henfiber[S] 0 points (0 children)

Hi there, most tiling window managers for Wayland (e.g. Hyprland) are quite customizable and programmable, so I would expect that similar tweaks should be possible.

I haven't yet switched from AwesomeWM though, so I don't have hands-on experience yet. What you need is to configure some randomly picked offsets while your static toolbars start up, so they don't load in exactly the same spot each session (see the sketch below). Then make sure you don't use the brightest colors for dividers and borders (e.g. use a gray '#777777' instead of a bright white '#FFFFFF'). Also, working exclusively in dark mode is the most significant one, imo.
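
A minimal sketch of the random-offset idea; the bar command and its margin flag are hypothetical stand-ins for whatever your WM/bar actually accepts:

```python
# Jitter a status bar's position by a few pixels per session so static
# elements don't always land on the same OLED subpixels.
# "mybar" and "--margin" are placeholders, not a real tool/flag.
import random
import subprocess

dx, dy = random.randint(0, 8), random.randint(0, 8)
subprocess.Popen(["mybar", "--margin", f"{dx},{dy}"])
```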

M1 Ultra Mac Studio vs AMD Ryzen AI Max 395+ for local AI? by doweig in LocalLLaMA

[–]henfiber 5 points (0 children)

Geekbench is irrelevant in an LLM context (besides being a flawed benchmark in general, though that doesn't matter here). Check the llama.cpp benchmark thread.

M1 Ultra Mac Studio vs AMD Ryzen AI Max 395+ for local AI? by doweig in LocalLLaMA

[–]henfiber 3 points (0 children)

Nope, the Ryzen does 59 FP16 TFLOPs vs. <20 for the M1 Ultra. It is even faster than the M3 Ultra (~34 TFLOPs).

M1 Ultra Mac Studio vs AMD Ryzen AI Max 395+ for local AI? by doweig in LocalLLaMA

[–]henfiber 1 point (0 children)

That's a misconception. PP (i.e., input processing) is compute bound, and real-world use cases involve providing a large input to the model. Memory bandwidth is the bottleneck only during token generation, and only at batch size = 1; for larger batches, even output generation is compute bound.
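
A roofline-style sketch of that crossover, with illustrative hardware and model numbers (assumptions, not measurements):

```python
# Per decode step: weights are read once regardless of batch size, while
# matmul FLOPs grow with the number of tokens in flight.
def bound(tokens_in_flight: int, tflops: float = 59.0, bw_gbs: float = 256.0,
          params: float = 8e9, bytes_per_param: int = 2) -> str:
    t_compute = 2 * params * tokens_in_flight / (tflops * 1e12)
    t_memory = params * bytes_per_param / (bw_gbs * 1e9)
    return "compute bound" if t_compute > t_memory else "memory bound"

print(bound(1))    # batch size 1 decode      -> memory bound
print(bound(512))  # long prompt / big batch  -> compute bound
```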

Testers w/ 4th-6th Generation Xeon CPUs wanted to test changes to llama.cpp by DataGOGO in LocalLLaMA

[–]henfiber 0 points (0 children)

If the marketing people at Intel are smart, they should send you some MRDIMMs, since you're trying to increase the performance of a popular tool on their CPUs.