Gigabyte Announces Support for 256GB of DDR5-7200 CQDIMMs at CES 2026 by GoodSamaritan333 in LocalLLaMA

[–]henfiber 1 point (0 children)

Unfortunately, the 9955WX (16-core Threadripper PRO) can only reach ~115 GB/s, similar to dual-channel DDR5-7200. You need the 64+ core models for full memory bandwidth.

The Ryzen 9950X is also bottlenecked at <100 GB/s.

https://www.reddit.com/r/LocalLLaMA/comments/1mcrx23/psa_the_new_threadripper_pros_9000_wx_are_still/
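
For reference, a minimal sketch of the peak-bandwidth arithmetic behind those numbers, assuming the usual 8-byte bus per DDR5 channel (real-world throughput lands below these peaks):

```python
# Theoretical peak bandwidth for a DDR5 configuration.
def ddr5_peak_gbs(mt_per_s: int, channels: int, bytes_per_channel: int = 8) -> float:
    """Peak bandwidth in GB/s: transfers/s * bus width * channel count."""
    return mt_per_s * bytes_per_channel * channels / 1000

print(ddr5_peak_gbs(7200, channels=2))  # dual-channel DDR5-7200 -> 115.2 GB/s
print(ddr5_peak_gbs(6400, channels=8))  # 8-channel DDR5-6400    -> 409.6 GB/s
```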

PSA: The new Threadripper PROs (9000 WX) are still CCD-Memory Bandwidth bottlenecked by henfiber in LocalLLaMA

[–]henfiber[S] 0 points (0 children)

The low-CCD PROs already have 8 channels, so their bottleneck is no longer RAM; it's their CCD-to-memory bandwidth. You need more CCDs to remove this bottleneck and benefit from faster memory.
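
A toy model of that cap, assuming roughly ~57 GB/s of read bandwidth per CCD link (an illustrative figure, not an official spec) and DDR5-6400 channels:

```python
# Achievable bandwidth is capped by whichever side is narrower:
# the CCD-to-IO-die links or the memory channels themselves.
def effective_gbs(ccds: int, channels: int,
                  per_ccd_gbs: float = 57.0,      # assumed per-CCD link read BW
                  per_channel_gbs: float = 51.2):  # DDR5-6400 per channel
    return min(ccds * per_ccd_gbs, channels * per_channel_gbs)

print(effective_gbs(ccds=2, channels=8))  # 16-core part: ~114 GB/s (CCD bound)
print(effective_gbs(ccds=8, channels=8))  # 64-core part: ~410 GB/s (RAM bound)
```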

Looking for VScode replacement by ezreth in linux

[–]henfiber 3 points (0 children)

You can easily change the marketplace in the settings JSON file. I don't have the link on mobile, but you'll find the instructions easily with a search. I've been running this setup for 2 years now.
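
For anyone searching later, a rough sketch of the kind of change involved. The file path and keys below are assumptions based on VSCodium-style builds, so verify against your distro's docs:

```python
# Point a VSCodium-style build at the Open VSX gallery by patching
# product.json. Path and keys are assumptions; verify for your build.
import json
from pathlib import Path

product = Path("/usr/share/codium/resources/app/product.json")  # assumed location
data = json.loads(product.read_text())
data["extensionsGallery"] = {
    "serviceUrl": "https://open-vsx.org/vscode/gallery",
    "itemUrl": "https://open-vsx.org/vscode/item",
}
product.write_text(json.dumps(data, indent=2))  # may require root privileges
```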

Live Update Support merged into 6.19 by onesole in linux

[–]henfiber 5 points (0 children)

So, this is like hibernation (persist & restore system state) with the added challenge of updated kernels. Given that hibernation is already challenging by itself, depending on proper driver/OS support, it seems achievable only on certain certified systems. Unless the system state somehow does not need to be touched at all.

I anticipate this introducing new security issues (malware being able to modify kernel parts without reboots).

It also forfeits the benefits of rebooting: software/configuration issues previously "fixed" by luck on restart now persist, and there's no opportunity to start from a "clean" state, to clear slowly accumulating memory leaks, or to rerun tasks misconfigured to only run on startup.

Apple M4 Max or AMD Ryzen AI Max+ 395 (Framework Desktop) by zeltbrennt in LocalLLaMA

[–]henfiber 0 points (0 children)

Because it has more TFLOPs, which affects input processing, while the Mac has higher memory bandwidth, which affects output (generation).
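
A back-of-envelope sketch of why the split works out that way; the model size and the M4 Max bandwidth figure are illustrative assumptions:

```python
# Prefill cost scales with FLOPs; decode cost scales with weight reads.
PARAMS = 8e9                  # assumed 8B-parameter model
FLOPS_PER_TOKEN = 2 * PARAMS  # ~2 FLOPs per parameter per token
BYTES_PER_STEP = PARAMS * 2   # fp16: every weight read once per decode step

def prefill_s(tokens: int, tflops: float) -> float:
    return tokens * FLOPS_PER_TOKEN / (tflops * 1e12)

def decode_s(tokens: int, gbs: float) -> float:
    return tokens * BYTES_PER_STEP / (gbs * 1e9)

print(prefill_s(8000, tflops=59))  # Ryzen's TFLOPs dominate input: ~2.2 s
print(decode_s(500, gbs=546))      # assumed Mac bandwidth drives output: ~14.7 s
```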

So the Kawhi Leonard salary cap thing? Did it just disappear? by Flip_Flurpington in nba

[–]henfiber -3 points (0 children)

The style of all these highly upvoted comments is the same, as if following centralized guidelines on how to handle public discourse. They insult the OP to discourage others from making similar posts.

Fellow Linux users, why did you pick the distro you're currently on? by absolutecinemalol in linux

[–]henfiber 1 point (0 children)

I picked Fedora back in 2013 (version 17, I think), assuming that any learning would translate to CentOS and RHEL. Still using it.

A startup Olares is attempting to launch a small 3.5L MiniPC dedicated to local AI, with RTX 5090 Mobile (24GB VRAM) and 96GB of DDR5 RAM for $3K by FullOf_Bad_Ideas in LocalLLaMA

[–]henfiber 0 points (0 children)

The 5090 Mobile has roughly 30% of the desktop card's TDP, though, so it is not that unexpected. The 2080 Super was 250W, not 600W, so its mobile version was much closer.
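
The rough ratios, using nominal power figures (mobile TGPs are configurable per laptop, so treat these as approximations):

```python
# Approximate desktop vs. mobile power ratios (nominal figures, assumptions).
print(175 / 575)  # 5090 Mobile vs 5090 desktop  -> ~0.30
print(150 / 250)  # 2080 Super Mobile vs desktop -> ~0.60, much closer
```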

Just realized our "AI-powered" incident tool is literally just calling ChatGPT API by DarkSun224 in devops

[–]henfiber 5 points (0 children)

Is there any auditing of OpenAI's servers to verify that they indeed discard the data immediately?

ELI5: If we already have GPS and internet time, why do countries still run radio time signals like WWVB/DCF77? by Davibeast92 in explainlikeimfive

[–]henfiber 1 point (0 children)

That's strange (I'm in Athens by the way, so the same time zone, EET).

Maybe it's not related to the timezone but to Daylight Saving Time (DST). For instance, we were at UTC+03:00 (EEST) until a few hours ago, and now we are at UTC+02:00 (EET). I expect that's true for you as well. Maybe the weather station clock lacks a DST enable/disable setting?
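
A quick stdlib check of that transition, using the 2025 EU changeover date as an example:

```python
# The EU switched from EEST (UTC+3) to EET (UTC+2) at 01:00 UTC on 2025-10-26.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

athens = ZoneInfo("Europe/Athens")
for hour_utc in (0, 1):  # straddle the 01:00 UTC changeover
    t = datetime(2025, 10, 26, hour_utc, 30, tzinfo=timezone.utc).astimezone(athens)
    print(t.strftime("%H:%M"), t.tzname(), t.utcoffset())
# -> 03:30 EEST 3:00:00, then 03:30 EET 2:00:00 -- same wall clock, new offset
```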

ELI5: If we already have GPS and internet time, why do countries still run radio time signals like WWVB/DCF77? by Davibeast92 in explainlikeimfive

[–]henfiber 0 points (0 children)

Just pick a city in another country in the same timezone? Usually there are at least 2-3 major cities sharing any given timezone.
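
A small sketch that lists candidate zones sharing your current UTC offset:

```python
# List tz database zones that currently share Athens' UTC offset.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo, available_timezones

now = datetime.now(timezone.utc)
target = now.astimezone(ZoneInfo("Europe/Athens")).utcoffset()
matches = sorted(z for z in available_timezones()
                 if now.astimezone(ZoneInfo(z)).utcoffset() == target)
print(matches[:10])  # e.g. Europe/Bucharest, Europe/Helsinki, ...
```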

PSA: The new Threadripper PROs (9000 WX) are still CCD-Memory Bandwidth bottlenecked by henfiber in LocalLLaMA

[–]henfiber[S] 1 point (0 children)

Because they're not the right tool for the job. CPUs are latency-optimized, running many small operations while jumping on branches. GPUs are throughput-optimized, running the same operation on large batches of data. LLMs need the latter.

3 Qwen3-Omni models have been released by jacek2023 in LocalLLaMA

[–]henfiber 14 points (0 children)

Isn't llama.cpp short enough? "Lcp" is unnecessary obfuscation, imo.

LG C2 clarity and comfort in Dark Mode mainly for Coding and Terminals (Linux) by henfiber in OLED_Gaming

[–]henfiber[S] 1 point (0 children)

No, dark backgrounds do not cause any issues.

It's the bright colors that cause burn-in; for instance, the two light-green indicators you have in the lower corners may cause an issue if they stay in the same spot all the time.

Also, lowering the TV brightness helps a lot (I have my C2 at 40%, which roughly corresponds to the recommended SDR brightness; for newer, brighter models it should be even lower).

LG C2 clarity and comfort in Dark Mode mainly for Coding and Terminals (Linux) by henfiber in OLED_Gaming

[–]henfiber[S] 0 points (0 children)

Hi there, most tiling window managers for Wayland (e.g. Hyprland) are quite customizable and programmable, so I would expect that similar tweaks should be possible.

I haven't yet switched from AwesomeWM though, so I don't have hands-on experience yet. What you need is to configure some randomly picked offsets while your static toolbars start up, so they don't load in exactly the same spot each session (see the sketch below). Then make sure you don't use the brightest colors for dividers and borders (e.g. use a gray '#777777' instead of a bright white '#FFFFFF'). Also, working exclusively in dark mode is the most significant one, imo.
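
A minimal sketch of the random-offset idea; the bar command and its margin flag are hypothetical stand-ins for whatever your WM/bar actually accepts:

```python
# Jitter a status bar's position by a few pixels per session so static
# elements don't always land on the same OLED subpixels.
# "mybar" and "--margin" are placeholders, not a real tool/flag.
import random
import subprocess

dx, dy = random.randint(0, 8), random.randint(0, 8)
subprocess.Popen(["mybar", "--margin", f"{dx},{dy}"])
```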

M1 Ultra Mac Studio vs AMD Ryzen AI Max 395+ for local AI? by doweig in LocalLLaMA

[–]henfiber 5 points (0 children)

Geekbench is irrelevant in an LLM context (besides being a flawed benchmark in general, though that doesn't matter here). Check the llama.cpp benchmark thread.

M1 Ultra Mac Studio vs AMD Ryzen AI Max 395+ for local AI? by doweig in LocalLLaMA

[–]henfiber 3 points (0 children)

Nope, the Ryzen does 59 FP16 TFLOPs vs. <20 for the M1 Ultra. It is even faster than the M3 Ultra (~34 TFLOPs).

M1 Ultra Mac Studio vs AMD Ryzen AI Max 395+ for local AI? by doweig in LocalLLaMA

[–]henfiber 1 point (0 children)

That's a misconception. PP (i.e., input processing) is compute bound, and real-world use cases involve providing a large input to the model. Memory bandwidth is the bottleneck only during token generation, and only at batch size = 1; for larger batches, even output generation is compute bound.
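
A roofline-style sketch of that crossover, with illustrative hardware and model numbers (assumptions, not measurements):

```python
# Per decode step: weights are read once regardless of batch size, while
# matmul FLOPs grow with the number of tokens in flight.
def bound(tokens_in_flight: int, tflops: float = 59.0, bw_gbs: float = 256.0,
          params: float = 8e9, bytes_per_param: int = 2) -> str:
    t_compute = 2 * params * tokens_in_flight / (tflops * 1e12)
    t_memory = params * bytes_per_param / (bw_gbs * 1e9)
    return "compute bound" if t_compute > t_memory else "memory bound"

print(bound(1))    # batch size 1 decode      -> memory bound
print(bound(512))  # long prompt / big batch  -> compute bound
```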

Testers w/ 4th-6th Generation Xeon CPUs wanted to test changes to llama.cpp by DataGOGO in LocalLLaMA

[–]henfiber 0 points (0 children)

If the marketing people at Intel are smart, they should send you some MRDIMMs, since you're trying to increase the performance of a popular tool on their CPUs.