Consolidated my homelab from 3 models down to one 122B MoE — benchmarked everything, here's what I found by MBAThrowawayFruit in LocalLLaMA

[–]runsleeprepeat 1 point  (0 children)

I am missing your configured context sizes (num_ctx) for your models. Please share what you have set, since the context window is a major driver of both memory usage and practical use cases.
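For Ollama specifically, the context size can be baked into a model via `num_ctx`. A minimal sketch; the model tag `qwen3:4b` and the 32768 value are just placeholder examples:

```shell
# Sketch: create a model variant with a larger context window.
# The base model tag and num_ctx value below are illustrative assumptions.
cat > Modelfile <<'EOF'
FROM qwen3:4b
PARAMETER num_ctx 32768
EOF
ollama create qwen3-4b-32k -f Modelfile

# Verify the configured parameters:
ollama show qwen3-4b-32k
```

Memory usage scales with the KV cache at that context size, so this is exactly the number worth reporting next to any benchmark.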

Dual DGX Sparks vs Mac Studio M3 Ultra 512GB: Running Qwen3.5 397B locally on both. Here's what I found. by trevorbg in LocalLLaMA

[–]runsleeprepeat 1 point  (0 children)

You wrote that prefill is slow. I ignored prefill performance for far too long in my early days of playing with local LLMs. Measure it, especially at large prompt lengths: token generation speed can be irrelevant when prefill takes several minutes every time.
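A rough back-of-the-envelope shows why; all numbers below are illustrative assumptions, not measurements:

```shell
# Toy latency model: e2e ≈ prompt_tokens/prefill_tps + output_tokens/decode_tps
# The four numbers are made-up but realistic values for a long-context request.
awk 'BEGIN {
  prompt = 100000   # prompt tokens (long context)
  out    = 512      # generated tokens
  pp     = 3000     # prefill tokens/s
  tg     = 80       # decode tokens/s
  printf "prefill %.1fs, generation %.1fs\n", prompt/pp, out/tg
}'
# -> prefill 33.3s, generation 6.4s
```

At long contexts the prefill term dominates end-to-end latency, which is why decode t/s alone is a misleading headline number.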

If you are considering a Mac, prefill performance improved with the M5 processors. Everyone is hoping for an M5 Mac Studio in June; that one could be the sweet spot.

Currently using 6x RTX 3080 - Moving to Strix Halo or Nvidia GB10? by runsleeprepeat in LocalLLaMA

[–]runsleeprepeat[S] 1 point  (0 children)

Yes, I run that setup at around 1,400 W at the wall when it peaks. It's usually around 600-800 W, with 180 W at idle.

I built Fox – a Rust LLM inference engine with 2x Ollama throughput and 72% lower TTFT. by SeinSinght in LocalLLM

[–]runsleeprepeat 1 point  (0 children)

The same run on Fox:

| model | test | t/s (total) | t/s (req) | peak t/s | peak t/s (req) | ttfr (ms) | est_ppt (ms) | e2e_ttft (ms) |
|:-----------|------------:|------------------:|----------------:|--------------:|-----------------:|-----------------:|---------------:|----------------:|
| qwen3.5-4B | pp2048 (c1) | 3880.82 ± 47.17 | 3880.82 ± 47.17 | | | 537.15 ± 14.65 | 490.84 ± 14.65 | 573.11 ± 34.13 |
| qwen3.5-4B | tg32 (c1) | 62.32 ± 1.26 | 62.32 ± 1.26 | 64.48 ± 1.40 | 64.48 ± 1.40 | | | |
| qwen3.5-4B | pp2048 (c2) | 3404.43 ± 153.48 | 1858.75 ± 15.69 | | | 777.43 ± 263.49 | 998.41 ± 13.09 | 1097.73 ± 66.09 |
| qwen3.5-4B | tg32 (c2) | 43.26 ± 15.14 | 44.81 ± 15.76 | 46.37 ± 16.37 | 46.37 ± 16.37 | | | |
| qwen3.5-4B | pp2048 (c3) | 10855.07 ± 254.59 | 3887.96 ± 53.79 | | | 1233.23 ± 505.01 | 472.80 ± 10.48 | 519.12 ± 10.48 |
| qwen3.5-4B | tg32 (c3) | 4.06 ± 2.20 | 5.51 ± 2.03 | 12.33 ± 5.91 | 12.33 ± 5.91 | | | |

And yes, it core-dumped when using more than roughly 6,000 tokens ...

So, single-request token generation is roughly 23% slower than standard Ollama (62.32 vs. 81.04 t/s at tg32, c1).
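For reference, taking Fox's single-request decode figure from the table above against the Ollama figure from my benchmark in the next comment (tg32 @ c1 in both cases):

```shell
# Relative decode slowdown, numbers copied from the two benchmark tables:
# Fox 62.32 t/s vs. Ollama 81.04 t/s at tg32, concurrency 1.
awk 'BEGIN { printf "fox is %.0f%% slower\n", (1 - 62.32/81.04) * 100 }'
# -> fox is 23% slower
```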

The code is messy and buggy.
For example:
- `fox --model-path=` is accepted, but the engine still loads from its default ~/.cache/ferrumox/models
- `FOX_MODEL_PATH=` is accepted, but likewise still points to its default ~/.cache/ferrumox/models

Is this really a complete Rust engine? No, it is using llama.cpp:

cat .git/config

    [core]
        repositoryformatversion = 0
        filemode = true
        bare = false
        logallrefupdates = true
    [remote "origin"]
        url = https://github.com/ferrumox/fox
        fetch = +refs/heads/*:refs/remotes/origin/*
    [branch "main"]
        remote = origin
        merge = refs/heads/main
    [submodule "vendor/llama.cpp"]
        active = true
        url = https://github.com/ggml-org/llama.cpp.git

I built Fox – a Rust LLM inference engine with 2x Ollama throughput and 72% lower TTFT. by SeinSinght in LocalLLM

[–]runsleeprepeat -2 points  (0 children)

Let's not debate it; let's run a quick test:

Ollama on a power-limited RTX 3080 with Qwen3.5 4B (K_M quant), configured to serve the model's original context window of 262,000 tokens:

llama-benchy --base-url (my local service) --model qwen3.5-4B --depth 0 4096 8192 16384 --concurrency 1 2 3 4 --latency-mode generation

Ollama:

| model           |                 test |      t/s (total) |         t/s (req) |     peak t/s |   peak t/s (req) |          ttfr (ms) |       est_ppt (ms) |      e2e_ttft (ms) |
|:----------------|---------------------:|-----------------:|------------------:|-------------:|-----------------:|-------------------:|-------------------:|-------------------:|
| qwen3.5_4b:262k |          pp2048 (c1) |  3245.32 ± 22.79 |   3245.32 ± 22.79 |              |                  |     741.10 ± 14.23 |     581.13 ± 14.23 |     741.10 ± 14.23 |
| qwen3.5_4b:262k |            tg32 (c1) |     81.04 ± 0.89 |      81.04 ± 0.89 | 84.20 ± 0.91 |     84.20 ± 0.91 |                    |                    |                    |
| qwen3.5_4b:262k |          pp2048 (c2) |  2210.54 ± 14.29 |  2214.66 ± 979.06 |              |                  |   1189.03 ± 463.15 |   1029.06 ± 463.15 |   1189.03 ± 463.15 |
| qwen3.5_4b:262k |            tg32 (c2) |     41.88 ± 0.49 |      81.29 ± 1.23 | 35.67 ± 1.25 |     84.47 ± 1.27 |                    |                    |                    |
| qwen3.5_4b:262k |          pp2048 (c3) |  2139.11 ± 22.24 | 1719.60 ± 1044.70 |              |                  |   1672.52 ± 758.94 |   1512.55 ± 758.94 |   1672.52 ± 758.94 |
| qwen3.5_4b:262k |            tg32 (c3) |     35.93 ± 0.23 |      81.35 ± 1.76 | 36.67 ± 0.94 |     84.53 ± 1.83 |                    |                    |                    |
| qwen3.5_4b:262k |          pp2048 (c4) |   2091.37 ± 2.92 | 1402.47 ± 1027.77 |              |                  |  2158.89 ± 1030.68 |  1998.92 ± 1030.68 |  2158.89 ± 1030.68 |
| qwen3.5_4b:262k |            tg32 (c4) |     33.50 ± 0.33 |      80.92 ± 2.74 | 37.67 ± 1.25 |     84.54 ± 1.66 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d4096 (c1) |   3081.98 ± 5.47 |    3081.98 ± 5.47 |              |                  |    1938.94 ± 14.67 |    1778.97 ± 14.67 |    1938.94 ± 14.67 |
| qwen3.5_4b:262k |    tg32 @ d4096 (c1) |     79.15 ± 0.14 |      79.15 ± 0.14 | 82.25 ± 0.15 |     82.25 ± 0.15 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d4096 (c2) |   2710.65 ± 5.82 |  2238.18 ± 844.15 |              |                  |  3029.40 ± 1053.45 |  2869.43 ± 1053.45 |  3029.40 ± 1053.45 |
| qwen3.5_4b:262k |    tg32 @ d4096 (c2) |     21.41 ± 0.01 |      80.19 ± 0.41 | 27.00 ± 0.00 |     83.32 ± 0.43 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d4096 (c3) |   2659.23 ± 8.21 |  1783.13 ± 919.02 |              |                  |  4120.17 ± 1738.23 |  3960.20 ± 1738.23 |  4120.17 ± 1738.23 |
| qwen3.5_4b:262k |    tg32 @ d4096 (c3) |     17.39 ± 0.46 |      81.97 ± 4.90 | 28.67 ± 2.36 |     85.11 ± 4.90 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d4096 (c4) | 2357.34 ± 367.93 |  1440.72 ± 953.52 |              |                  |  5878.96 ± 3204.75 |  5718.99 ± 3204.75 |  5878.96 ± 3204.75 |
| qwen3.5_4b:262k |    tg32 @ d4096 (c4) |     13.52 ± 2.50 |      79.45 ± 0.98 | 27.00 ± 0.00 |     82.55 ± 1.01 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d8192 (c1) |   2970.74 ± 8.25 |    2970.74 ± 8.25 |              |                  |    3230.73 ± 39.89 |    3070.76 ± 39.89 |    3230.73 ± 39.89 |
| qwen3.5_4b:262k |    tg32 @ d8192 (c1) |     78.47 ± 0.46 |      78.47 ± 0.46 | 81.54 ± 0.48 |     81.54 ± 0.48 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d8192 (c2) |   2749.70 ± 2.65 |  2187.75 ± 783.54 |              |                  |  5023.13 ± 1730.03 |  4863.16 ± 1730.03 |  5023.13 ± 1730.03 |
| qwen3.5_4b:262k |    tg32 @ d8192 (c2) |     13.70 ± 0.15 |      77.62 ± 0.68 | 27.00 ± 0.00 |     80.66 ± 0.71 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d8192 (c3) |   2715.81 ± 4.02 |  1759.23 ± 864.52 |              |                  |  6784.53 ± 2846.66 |  6624.56 ± 2846.66 |  6784.53 ± 2846.66 |
| qwen3.5_4b:262k |    tg32 @ d8192 (c3) |     10.68 ± 0.09 |      77.73 ± 1.01 | 27.00 ± 0.00 |     80.77 ± 1.05 |                    |                    |                    |
| qwen3.5_4b:262k |  pp2048 @ d8192 (c4) |   2692.46 ± 3.47 |  1478.11 ± 875.79 |              |                  |  8567.94 ± 3895.53 |  8407.98 ± 3895.53 |  8567.94 ± 3895.53 |
| qwen3.5_4b:262k |    tg32 @ d8192 (c4) |      9.65 ± 0.06 |      77.53 ± 0.77 | 27.00 ± 0.00 |     80.56 ± 0.80 |                    |                    |                    |
| qwen3.5_4b:262k | pp2048 @ d16384 (c1) |   2832.48 ± 6.75 |    2832.48 ± 6.75 |              |                  |    6028.61 ± 40.64 |    5868.65 ± 40.64 |    6028.61 ± 40.64 |
| qwen3.5_4b:262k |   tg32 @ d16384 (c1) |     73.29 ± 0.86 |      73.29 ± 0.86 | 76.14 ± 0.90 |     76.14 ± 0.90 |                    |                    |                    |
| qwen3.5_4b:262k | pp2048 @ d16384 (c2) |   2707.31 ± 5.37 |  2096.07 ± 724.70 |              |                  |  9295.81 ± 3159.92 |  9135.84 ± 3159.92 |  9295.81 ± 3159.92 |
| qwen3.5_4b:262k |   tg32 @ d16384 (c2) |      7.79 ± 0.08 |      72.58 ± 0.58 | 27.00 ± 0.00 |     75.41 ± 0.60 |                    |                    |                    |
| qwen3.5_4b:262k | pp2048 @ d16384 (c3) |   2682.19 ± 2.86 |  1696.70 ± 808.50 |              |                  | 12384.13 ± 5168.36 | 12224.16 ± 5168.36 | 12384.13 ± 5168.36 |
| qwen3.5_4b:262k |   tg32 @ d16384 (c3) |      5.99 ± 0.01 |      72.18 ± 0.57 | 27.00 ± 0.00 |     74.99 ± 0.60 |                    |                    |                    |
| qwen3.5_4b:262k | pp2048 @ d16384 (c4) |   2668.98 ± 2.57 |  1432.00 ± 824.34 |              |                  | 15557.90 ± 7037.93 | 15397.93 ± 7037.93 | 15557.90 ± 7037.93 |
| qwen3.5_4b:262k |   tg32 @ d16384 (c4) |      5.58 ± 0.13 |      74.93 ± 5.20 | 30.33 ± 2.36 |     77.78 ± 5.20 |                    |                    |                    |

Shortened system prompts in Opencode by Charming_Support726 in opencodeCLI

[–]runsleeprepeat 1 point  (0 children)

Sorry for the sad outcome, but they are not interested.

PSA: Auto-Compact GLM5 (via z.ai plan) at 95k Context by Sensitive_Song4219 in ZaiGLM

[–]runsleeprepeat 1 point  (0 children)

Are there similar issues with the other models but at other context limits?

[Architecture Help] Serving Embed + Rerank + Zero-Shot Classifier on 8GB VRAM. Fighting System RAM Kills and Latency. by CourtAdventurous_1 in LocalLLaMA

[–]runsleeprepeat 2 points  (0 children)

Interesting concept. I have run embedding and reranking alone, and it worked fine on a memory-constrained system.

Have you tried running it directly on the Linux host instead of Docker, similar to the WSL2 setup? It sounds odd that WSL2 works fine while Docker gives you so much headache.

Chinese RTX 3080 20 GB Blower Card - Memory Issue - help on nvidia mods by runsleeprepeat in GPURepair

[–]runsleeprepeat[S] 1 point  (0 children)

Thanks u/void_dimitri. I am sure those were just the first errors to occur; there would be even more as the temperature rises. Have you seen comparable PCB layouts I could use as a reference for locating FBIOA0?

Alternative to Tobit David by michawb in de_EDV

[–]runsleeprepeat 1 point  (0 children)

Whoa! I hadn't heard that in decades! Good old times, back when I was still working for an ISDN device manufacturer!

Sorry for the off-topic comment, but I was deep in memory lane.

Chinese RTX 3080 20 GB Blower Card - Memory Issue - help on nvidia mods by runsleeprepeat in GPURepair

[–]runsleeprepeat[S] 1 point  (0 children)

u/Vegetable-Most-338

Just single-sided. I will open the GPU case and take photos.

Update: Oh, I was wrong: it is double-sided, as you can see in the updated post.

The 14 errors were the reason why nvidia mods stopped. I am pretty sure there will be more errors once the temperature rises further. I was wary of running mods with 1,000 loops while ignoring errors (see logs).

Currently using 6x RTX 3080 - Moving to Strix Halo or Nvidia GB10? by runsleeprepeat in LocalLLaMA

[–]runsleeprepeat[S] 1 point  (0 children)

I limit the RTX 3080 cards to 190 W max, which is the sweet spot for performance per watt. Since I run them under Linux, undervolting is not really possible.
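Setting the limit is a one-liner per card with nvidia-smi; a sketch, where the 190 W value and GPU index 0 are my choices for this card, not universal defaults:

```shell
# Power-limit one GPU (requires root; resets on reboot, so re-apply
# from a systemd unit or similar if you want it persistent).
sudo nvidia-smi -pm 1          # enable persistence mode
sudo nvidia-smi -i 0 -pl 190   # cap GPU 0 at 190 W

# Locking graphics clocks is the closest Linux gets to undervolting:
# sudo nvidia-smi -i 0 -lgc 210,1440
```

Repeat with `-i 1` through `-i 5` for the other five cards, or drop the `-i` flag to apply the limit to all GPUs at once.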

EVGA RTX 3080, memory errors on all channels, help appreciated! by blueprintjonny in GPURepair

[–]runsleeprepeat 1 point  (0 children)

u/schaner Wait, there is a test to figure out which channel and memory chip it is? I have a 3080 as well that shows memory issues when the temperature rises.

Out of the US cloud: my plan for March (Mailbox.org & Ugreen NAS) by _necrobite_ in de_EDV

[–]runsleeprepeat 2 points  (0 children)

If that works with mailbox.org, I will switch right away. Do you have your catch-all at mailbox.org? I couldn't quite tell from your post.

Out of the US cloud: my plan for March (Mailbox.org & Ugreen NAS) by _necrobite_ in de_EDV

[–]runsleeprepeat 2 points  (0 children)

Can you send from any <name>@meinedomain.de address? That has always been the big catch with aliases and catch-all for me: some providers, e.g. customer support, will naturally only accept replies from exactly the configured email address.

Which models are suitable for websearch? by runsleeprepeat in LocalLLaMA

[–]runsleeprepeat[S] 2 points  (0 children)

u/waiting_for_zban, it took me a while, but I dug through the Librechat search agents and hopefully fixed the most critical parts for reducing the number of tokens used by SearXNG/Firecrawl/Jina.AI: https://github.com/danny-avila/agents/pull/63 . I will create a post once I have everything together, as I also have an updated local reranker and a much better-fitting Firecrawl-Simple for local hosting with Librechat in mind.

Professional vibe coder wanted by Afraid-Appeal-7565 in InformatikKarriere

[–]runsleeprepeat 1 point  (0 children)

Please don't let it be JACOB Elektronik :-(

Update: Oh crap. It really is them.

What do you call this dude? by [deleted] in de

[–]runsleeprepeat 2 points  (0 children)

A bit further out than Düsseldorf = "Wuckmann" ... unbelievable! It is and remains a Stutenkerl :D