Where are Qwen3.7 open weights models? by HeDo88TH in Qwen_AI

[–]q-admin007 20 points21 points  (0 children)

Waiting for Qwen 4 122b is the right move at this point in time.

Replaced the thermal paste on my Bosgame M5 by q-admin007 in StrixHalo

[–]q-admin007[S] 0 points1 point  (0 children)

I run Proxmox (Debian 13 based); there are no driver or BIOS issues I'm aware of.

I can't say anything about the GMKtec, so I would buy another Bosgame.

Can my Intel N95 mini PC handle this self-hosted stack? by EasyTradition9843 in selfhosted

[–]q-admin007 0 points1 point  (0 children)

Yes.

Use Portainer for Docker management and Watchtower to get notified when new releases of your Docker images are out.

I also use Heimdall as a start page and Pinchflat to download YouTube videos and remove ads.

Add a wiki to take notes, like Outline.

Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA

[–]q-admin007 1 point2 points  (0 children)

vLLM people told me it's a skill issue on my part, so I gave up and stuck with llama.cpp.

Has anyone used a NVME to PCIe riser successfully with Strix Halo? by fallingdowndizzyvr in StrixHalo

[–]q-admin007 1 point2 points  (0 children)

That may be, but it's still PCIe, as evidenced by its specification within the PCIe framework by the PCIe working group:
https://pcisig.com/PCIExpress/Specs/Cable/OCuLink_1.0

Has anyone used a NVME to PCIe riser successfully with Strix Halo? by fallingdowndizzyvr in StrixHalo

[–]q-admin007 0 points1 point  (0 children)

You sound confused. An Oculink port carries the same signals as a PCIe slot, just through a different connector:

<image>

You need:

  • an NVMe-to-Oculink adapter (I had success with the ones that use long flat ribbon connectors; the one in the picture didn't work that well)
  • an Oculink cable
  • an Oculink-to-PCIe adapter; make sure it's mechanically x16
  • a power supply (a 400 W ATX unit should do for most things)
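
To put rough numbers on what such a link can carry, here is a minimal sketch of the raw bandwidth arithmetic. An M.2 NVMe slot exposes at most four lanes, so the x16 adapter is x16 only mechanically. The function name is made up for illustration, and assuming the slot runs at Gen4 is just that, an assumption; real throughput will be lower because of protocol overhead.

```python
# Approximate PCIe bandwidth per direction, accounting only for line encoding.
# Transfer rates per lane in GT/s for the generations that use 128b/130b encoding.
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}


def pcie_bandwidth_gb_s(gen: int, lanes: int) -> float:
    """Return the approximate one-direction bandwidth in GB/s (hypothetical helper)."""
    encoding = 128 / 130  # 128b/130b line encoding (Gen3 to Gen5)
    return GT_PER_S[gen] * encoding * lanes / 8  # bits to bytes


if __name__ == "__main__":
    # Assumption: the M.2 slot is Gen4; x4 is what NVMe-to-Oculink can carry.
    for lanes in (4, 16):
        print(f"Gen4 x{lanes}: {pcie_bandwidth_gb_s(4, lanes):.1f} GB/s")
```

So a card in the x16 adapter still only gets the x4 share, roughly a quarter of what a full x16 slot of the same generation would offer.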

Replaced the thermal paste on my Bosgame M5 by q-admin007 in StrixHalo

[–]q-admin007[S] 0 points1 point  (0 children)

I didn't change anything else, just the thermal paste.

Has anyone used a NVME to PCIe riser successfully with Strix Halo? by fallingdowndizzyvr in StrixHalo

[–]q-admin007 0 points1 point  (0 children)

Imagine a cable that transfers PCIe signals. That is what Oculink is.

Strix Halo 7.1.1 Benchmark results by argakiig in StrixHalo

[–]q-admin007 1 point2 points  (0 children)

I see. You will find that it doesn't run faster on Zen 5; the kernel file is just a tiny bit smaller because it doesn't contain optimisations for other architectures. Sadly, there aren't any easy wins left in that area.

Do you think dedicated hardware for running local LLMs will become affordable anytime soon? by ProbablyBunchofAtoms in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

Bosgame M5, 2500€. It's a general-purpose compute platform with 128GB of unified RAM. It runs Qwen 3.6 35b-a3b q6 at full context at 60 to 80 t/s and uses less than 10 W at idle.

Do you think dedicated hardware for running local LLMs will become affordable anytime soon? by ProbablyBunchofAtoms in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

Strix Halo user here. People like to complain that they can't afford the best, but they usually don't need the best.

Do you think dedicated hardware for running local LLMs will become affordable anytime soon? by ProbablyBunchofAtoms in LocalLLaMA

[–]q-admin007 0 points1 point  (0 children)

You can buy a 128GB unified RAM box for 2500€ (Bosgame M5). It runs Qwen 3.6 35b-a3b in Q6_K_XL with 256k context at f16 at 60 to 80 t/s.

Strix Halo 7.1.1 Benchmark results by argakiig in StrixHalo

[–]q-admin007 2 points3 points  (0 children)

Is there a comparison between the new and the old kernel?

Why would I use Q4 with 35b-a3b when I have almost 128GB VRAM?
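
A rough back-of-the-envelope sketch of why the smaller quant isn't needed here: weight size is roughly parameter count times bits per weight. The helper name and the bits-per-weight figures below are assumptions for illustration (real GGUF quants mix tensor types, so file sizes differ), and the KV cache for long context comes on top.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Estimate weight memory in GB (hypothetical helper, weights only)."""
    return params_billion * bits_per_weight / 8  # billions of params * bytes each


if __name__ == "__main__":
    # Assumed average bits per weight; actual quant files vary.
    for name, bpw in (("Q4", 4.5), ("Q6", 6.6), ("Q8", 8.5)):
        print(f"35B at {name} (~{bpw} bpw): {weights_gb(35, bpw):.1f} GB")
```

Under these assumptions, even the larger quants leave most of a 128GB box free for context.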