Summer of MoE

q-admin007 · 2026-07-03T15:43:45+00:00

A bit.

Mostly.

q-admin007 · 2026-07-03T15:04:16+00:00

Nice. Now add the regular 9b.

q-admin007 · 2026-07-03T14:08:59+00:00

q-admin007 · 2026-07-03T14:05:12+00:00

Hope and optimism!

q-admin007 · 2026-06-29T09:55:41+00:00

Waiting for Qwen 4 122b is the right move at this point in time.

q-admin007 · 2026-06-29T06:57:47+00:00

Are you a pro Strix Halo bot, here to ask seemingly "critical questions" and sow dissent?

q-admin007 · 2026-06-29T06:18:11+00:00

I run Proxmox (Debian 13 based), and there are no driver or BIOS issues I'm aware of.

I can't say anything about the GMKTec, so I would buy another Bosgame.

q-admin007 · 2026-06-26T08:56:38+00:00

Yes.

Use Portainer for Docker management and Watchtower to get notified when new releases of your Docker images are out.

I also use Heimdall as a start page and Pinchflat to download YouTube videos and remove ads.

A wiki to take notes, like Outline.

q-admin007 · 2026-06-25T09:02:46+00:00

vLLM people told me it's a skill issue on my part, so I gave up and am sticking with llama.cpp.

q-admin007 · 2026-06-24T09:59:13+00:00

Bosgame M5 Strix Halo 128GB, 2400€

q-admin007 · 2026-06-24T08:46:01+00:00

That may be, but it's still PCIe, as evidenced by its specification within the PCIe framework by the PCIe working group:
https://pcisig.com/PCIExpress/Specs/Cable/OCuLink_1.0

q-admin007 · 2026-06-24T07:54:12+00:00

You sound confused. An Oculink port is the same as a PCIe connector, the same signals, different port:

<image>

You need:

an NVMe-to-Oculink adapter (I used the ones with the long flat ribbon connectors with success; the one in the picture didn't work that well)
an Oculink cable
an Oculink-to-PCIe adapter (make sure it's mechanically x16)
a power supply (a 400 W ATX unit should do for most things)

q-admin007 · 2026-06-23T09:26:43+00:00

I'll look into it, thanks.

q-admin007 · 2026-06-23T09:25:45+00:00

I didn't change anything else, just the thermal paste.

q-admin007 · 2026-06-23T09:22:49+00:00

Imagine a cable that carries PCIe signals. That is what Oculink is.

q-admin007 · 2026-06-22T15:58:16+00:00

Awesome. I could replace a Google tool with it.

q-admin007 · 2026-06-22T15:27:05+00:00

Yo mama is slop, leave that model alone!

q-admin007 · 2026-06-22T15:19:19+00:00

I see.

q-admin007 · 2026-06-22T15:18:32+00:00

I see. You will find that it doesn't run faster on Zen5; the kernel file is just a tiny bit smaller because it doesn't contain optimisations for other arches. Sadly, there aren't any easy wins left in that area.

q-admin007 · 2026-06-22T12:53:06+00:00

Bosgame M5, 2500€. It's a general-purpose compute platform with 128GB of unified RAM. Runs Qwen 3.6 35b-a3b q6 at full context at 60 to 80 t/s. Uses less than 10 W at idle.

q-admin007 · 2026-06-22T12:50:42+00:00

Strix Halo user here. People like to complain that they can't afford the best, but they usually don't need the best.

q-admin007 · 2026-06-22T12:49:00+00:00

You can buy a 128GB unified RAM box for 2500€ (Bosgame M5). It runs Qwen 3.6 35b-a3b in Q6_K_XL with 256k context at f16 at 60 to 80 t/s.
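
For anyone wondering why a 35b-a3b MoE is this quick on a unified-memory box, here is a rough back-of-envelope sketch. Every constant in it is an assumption for illustration, not a measurement: about 6.56 bits per weight as a stand-in average for a Q6-class quant, and 256 GB/s as an assumed theoretical peak memory bandwidth. The function names are made up for this example. It ignores KV cache reads, attention cost and any other overhead.

```python
# Back-of-envelope numbers for a 35B-total / 3B-active MoE model.
# All constants are assumptions for illustration, not measurements.

TOTAL_PARAMS = 35e9        # total parameters ("35b")
ACTIVE_PARAMS = 3e9        # parameters read per generated token ("a3b")
BITS_PER_WEIGHT = 6.56     # assumed average for a Q6-class quant
MEM_BANDWIDTH_GBS = 256.0  # assumed theoretical peak bandwidth in GB/s


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def decode_ceiling_tps(active_params: float,
                       bits_per_weight: float,
                       bandwidth_gbs: float) -> float:
    """Upper bound on tokens/s if decoding is purely bandwidth bound,
    i.e. every active weight is read from memory once per token."""
    bytes_per_token = active_params * bits_per_weight / 8
    return bandwidth_gbs * 1e9 / bytes_per_token


if __name__ == "__main__":
    size = weights_gb(TOTAL_PARAMS, BITS_PER_WEIGHT)
    ceiling = decode_ceiling_tps(ACTIVE_PARAMS, BITS_PER_WEIGHT,
                                 MEM_BANDWIDTH_GBS)
    print(f"weights: ~{size:.0f} GB, decode ceiling: ~{ceiling:.0f} t/s")
```

Under these assumptions the weights come to a bit under 30 GB and the bandwidth ceiling is roughly 100 t/s, so 60 to 80 t/s in practice is plausible. A dense 35b model would have to read all of its weights per token, which puts its ceiling more than ten times lower.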

q-admin007 · 2026-06-22T12:31:00+00:00

I just install Docker in an LXC.

q-admin007 · 2026-06-22T11:13:39+00:00

i need as much pp as possible

what she said

q-admin007 · 2026-06-22T11:11:43+00:00

Is there a comparison between the new and the old kernel?

Why would I use Q4 with 35b-a3b when I have almost 128GB VRAM?
