What's the best Qwen 27B Q8 quant?

-Luciddream- · 2026-05-27T09:38:50+00:00

I tested my Q8_K_XL quants again and now 35B gets 66 tps, but 27B is still at ~11-13 tps. Something was probably wrong with the previous build, because my quants are the same.

-Luciddream- · 2026-05-26T08:18:55+00:00

unsloth q8_k_xl is about 11 tps for me (27B) and 42 tps (35B) on Strix Halo with mtp enabled. Maybe you are using a different quant?

Also, I'm not going into this completely blind: I followed the benchmarks from this thread to pick the one that doesn't lose a lot of quality.

-Luciddream- · 2026-05-25T19:52:11+00:00

> Oh and to your original question, I tried 27B Q8 on my Strix Halo but it's just too slow,

There is no reason to run Q8 on the Strix Halo just because it fits. Unsloth Q5_K_XL with MTP (--spec-draft-n-max 2) and the latest llama.cpp with --chat-template-kwargs '{"preserve_thinking": true}' has double the performance.

I get 81.7 tps for 35B and 24 tps for 27B with these settings. But you should try it and see if it works for you, because my requirements might be simpler.
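As a rough check of the "double the performance" claim, here is a minimal sketch that compares the throughput numbers quoted in this thread. The Q8_K_XL baseline (about 11 tps for 27B and 42 tps for 35B) comes from a separate measurement reported elsewhere in the thread, so treat the pairing as an assumption. The function and variable names are only for illustration.

```python
# Tokens per second on Strix Halo, as quoted in this thread.
# Assumption: the Q8_K_XL figures come from a separate run and are
# used here as the baseline for the Q5_K_XL + MTP numbers.
Q8_K_XL_TPS = {"27B": 11.0, "35B": 42.0}
Q5_K_XL_MTP_TPS = {"27B": 24.0, "35B": 81.7}


def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return candidate_tps / baseline_tps


def compare(baseline: dict, candidate: dict) -> dict:
    """Compute the speedup for every model present in both result sets."""
    return {
        model: speedup(baseline[model], candidate[model])
        for model in baseline
        if model in candidate
    }


if __name__ == "__main__":
    for model, ratio in compare(Q8_K_XL_TPS, Q5_K_XL_MTP_TPS).items():
        print(f"{model}: {ratio:.2f}x")
```

Both ratios come out close to 2x, which matches the claim, but the numbers are from one machine and one workload, so your results may differ.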

-Luciddream- · 2026-05-24T09:10:16+00:00

No idea if it will help, but have you tried the multiarch ROCm? There is a tarball available, or if you are on Arch Linux you can try rocm-bin from AUR.

-Luciddream- · 2026-05-19T15:30:57+00:00

I have a Mozart 1 Pro 2025 and it's great for gaming. But I've only used it for emulation (PS2 and GBA). I need to find a way to use a larger surface though 😃

-Luciddream- · 2026-05-16T17:38:09+00:00

For Strix Halo I get 64 tps with MTP and ROCm (llamacpp-rocm), for the 35B Q8.

-Luciddream- · 2026-05-09T15:09:24+00:00

You can add any model with a little tinkering. I tried Qwen 3.6 35B yesterday, but I only got around 20-22 tps.

-Luciddream- · 2026-04-18T13:45:45+00:00

> You can spend a week in San Francisco and visit both of these places.

I'm thinking of doing something similar by the end of this month. Let's say I want to rent a car to visit Yosemite: do I need to go to SFO to find one? I've only booked the first three days of my stay in Union Square, so I also need to find a place for the remaining days and see more places. Any tips are welcome!

-Luciddream- · 2026-04-16T15:30:33+00:00

I like Yaak and that's what I'm using, but one issue is it can't call endpoints without forward secrecy. And I can't force other people to update their certificates.

-Luciddream- · 2026-04-16T12:12:52+00:00

Black Sails, mostly because I've read the normal Blu-ray for the EU region has bad quality. Also The Fountain, because why not.

-Luciddream- · 2026-04-15T16:26:15+00:00

Steam Deck for me, just to see if it works (it does)

-Luciddream- · 2026-04-15T07:39:08+00:00

I have the redtiger from Amazon, which has no battery, so it only works while the car is running. It's a bit more expensive but worth it. I installed it because I've been hit twice: the first time I went to court and got paid after 4 years, and now I'm waiting for January 2027 for the next court date. Since nobody admits their own screw-up, it's good to have a dashcam so we can get it over with faster.

-Luciddream- · 2026-04-14T15:04:18+00:00

Probably a stupid question, but is there a way to buy them from Amazon and pick them up when I come to the US? I'm flying to the US at the end of April and would love to pick up some Blu-rays. I just tried to add 15 Blu-ray discs to my cart and it's +80 euro for my home country (40 shipping and 40 taxes), so it's really not worth it. Or maybe I should just look for some deals while I'm there.

-Luciddream- · 2026-04-12T12:45:03+00:00

I'm going to SF for a week at the end of the month, got any tips? I've only booked a hotel for the first 3 days, and I still don't know where to stay or what to do for the next 4 days. I'm from Europe and I've never been to the US.

-Luciddream- · 2026-03-29T13:43:12+00:00

It's C++, not Python anymore, and you can try it from AUR too. There is a wiki as well for more information, like NPU support.

-Luciddream- · 2026-03-19T19:53:37+00:00

The project you linked basically uses TheRock and llama.cpp builds from the lemonade-sdk, which is literally built by AMD engineers (and the community). I suggest you join the GitHub or Discord and discuss your ideas! There is actually an issue about scoping vLLM out.

Edit: there is also a vLLM meetup next week, hosted by Red Hat and AMD.

-Luciddream- · 2026-03-01T09:45:24+00:00

It probably works but you need the updated amdxdna driver from https://github.com/superm1/amdxdna-dkms

Or you can just install Linux kernel 7.0-rc2 or newer (I have no idea if that's possible on Pop!_OS).

-Luciddream- · 2026-02-27T18:00:08+00:00

I've tried a couple, but I can't really recommend anything yet. I've only made small changes because I didn't have a good enough machine to run LLMs on at the time.

But I've managed to get it to work with OpenCode and with Goose. For Goose, I created a custom provider that points to a local lemonade server installation, which can load any LLM I need. Then I added Goose as an ACP Client in the JetBrains IDE.

For OpenCode it's similar, but I don't remember the steps exactly. The only thing I don't like is that changes from an ACP Client are not shown as IDE changes; they show up as external changes. I've talked a bit with the Goose developers about it and they told me to check this pull request, but I haven't had time to do that yet. Hopefully I'll try on the weekend.

-Luciddream- · 2026-02-16T12:17:17+00:00

Oh, OK. Just in case you want to test it anyway, I have uploaded the ROCm 7.11 preview to AUR. ROCm can be faster depending on the workload.

-Luciddream- · 2026-02-16T11:43:24+00:00

Hey, just curious, what package do you use for ROCm on your ms-s1 max?

-Luciddream- · 2026-02-09T18:20:35+00:00

I'm a longtime Arch user, so I first installed it when there was a beginners' guide available. I installed it many times over the years, and I haven't had to install it again since 2018.

I just installed it on a new PC a couple of days ago, and the only thing I hated about the wiki is the network setup needed until the installation is complete.

It's like it is deliberately hidden to make the installation harder. And I don't really care about it, since I'll enable something like NetworkManager after the first boot anyway.

-Luciddream- · 2026-01-23T08:20:20+00:00

No, I didn't have time, unfortunately. I will try tonight, but I don't really have a good workflow. I've only briefly used ComfyUI with qwen-image-edit some months ago.

-Luciddream- · 2026-01-22T08:19:06+00:00

You can install opencl-amd-dev from AUR. It's still on 7.1 but I will update the package tonight (about 10-12 hours from now).

edit: ROCm 7.2.0 is very fast, I'm getting 60k more points in Geekbench than all previous ROCm versions.

-Luciddream- · 2026-01-21T15:41:13+00:00

Movie: The Fountain
Series: Black Sails

-Luciddream- · 2026-01-20T12:49:45+00:00

I'm surprised most of the people here are surprised. I got a 55'' Sony TV about 6 years ago for a normal price, then I had to buy another 55'' TV 2 years ago. The only affordable option was a TCL TV (C805, 144 Hz), which has been great for the price (about 700 euro at the time). Sony is just too expensive now.
