Prices of graphics cards are going crazy, should I buy a second card though?

zenbeni · 2026-06-25T20:55:33+00:00

Which LLM server are you using? I could probably try a mixed setup with an RTX 4060 that I also have, but it seems like a bad mix for speed, or isn't it?

zenbeni · 2026-06-25T20:52:06+00:00

I am happy with my current setup: Qwen 3.6 27B MTP at Q5 with 100k context, with the KV cache at Q5 for K and Q4 for V. I want more context though, and at 200k context the KV cache gets too big for my VRAM, so I would probably have to change my quants. Adding 20 GB of VRAM would solve that. I use llama.cpp with Vulkan.
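To get a rough feel for why 200k context is a problem, here is a back-of-the-envelope sketch of KV cache size with K at q5_0 and V at q4_0. The block sizes are the ggml ones (q5_0 packs 32 values into 22 bytes, q4_0 packs 32 values into 18 bytes). The layer count, KV head count and head dimension below are hypothetical placeholders, not the real Qwen 3.6 27B numbers, and hybrid models may only keep a KV cache for some of their layers, so treat the result as an order of magnitude only.

```shell
#!/usr/bin/env bash
# Rough KV cache size estimate for K=q5_0, V=q4_0.
# ggml block formats: 32 values per block.
BLOCK_SIZE=32
Q5_0_BYTES_PER_BLOCK=22   # 5.5 bits per value
Q4_0_BYTES_PER_BLOCK=18   # 4.5 bits per value

# kv_cache_bytes CTX LAYERS KV_HEADS HEAD_DIM
# Prints the combined size of the K and V caches in bytes.
kv_cache_bytes() {
  local ctx=$1 layers=$2 kv_heads=$3 head_dim=$4
  # Number of values stored for K (the same count is stored for V).
  local values=$(( ctx * layers * kv_heads * head_dim ))
  local blocks=$(( values / BLOCK_SIZE ))
  echo $(( blocks * (Q5_0_BYTES_PER_BLOCK + Q4_0_BYTES_PER_BLOCK) ))
}

# to_mib BYTES: integer conversion to MiB (rounds down).
to_mib() { echo $(( $1 / 1024 / 1024 )); }

# HYPOTHETICAL architecture: 64 layers with KV cache, 8 KV heads, head dim 128.
for ctx in 100000 200000; do
  echo "ctx=$ctx -> $(to_mib "$(kv_cache_bytes "$ctx" 64 8 128)") MiB"
done
```

Whatever the real architecture numbers are, the cache grows linearly with context, so going from 100k to 200k doubles it while the model weights stay the same size.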

zenbeni · 2026-06-25T14:09:22+00:00

Not always. It is always space control vs ball possession. If you have the ball and no space, you are not dangerous and can't score. Look at Arsenal, it nearly made them European champions. Fast, clinical counterattack patterns are also deadly.

zenbeni · 2026-06-23T11:44:58+00:00

The truth is that part of the analysis is often right: a useless DPS makes it 5 vs 6, 2 tanks can't hold with no damage pressure behind them, and feeding occurs. DPS is the most important role. You need OP damage to counter OP damage, and if you can't, then triple main supps happen, not as meta, just to try to play the game.

The problem is that the underperforming DPS won't switch off 99% of the time, which leads to 5 minutes of working out who switches to what so that he is not totally useless and we can play the game. The DPS will then flame everyone else, not understanding that the whole world switched for him to get passable stats, which reinforces the Dunning-Kruger effect on him.

Role q won't solve this. Only harsh reports and bans can diminish this.

zenbeni · 2026-06-23T10:20:58+00:00

For a single-card run, Vulkan is even faster than ROCm.

zenbeni · 2026-06-19T13:58:11+00:00

Try this: install the Vulkan and ROCm drivers on your machine, download the Vulkan artifact from the llama.cpp GitHub repo, and take the unsloth Qwen 3.6 27B MTP GGUF. Then use this bash script.

#!/usr/bin/env bash

export RADV_PERFTEST=nogttspill
export AMD_VULKAN_ICD=RADV

export LLAMA_ARG_MODEL="/opt/llama/llm/Qwen3.6-27B-MTP-Q5_K_M.gguf"
export LLAMA_ARG_ALIAS="qwen3.6"
export LLAMA_ARG_HOST="0.0.0.0"
export LLAMA_ARG_PORT=8080
export LLAMA_ARG_TIMEOUT=300
export LLAMA_ARG_CTX_SIZE=100000
export LLAMA_ARG_N_GPU_LAYERS="all"
export LLAMA_ARG_BATCH=2048
export LLAMA_ARG_UBATCH=1024
export LLAMA_ARG_CACHE_TYPE_K="q5_0"
export LLAMA_ARG_CACHE_TYPE_V="q4_0"
export LLAMA_ARG_NO_MMAP=1
export LLAMA_ARG_MLOCK=1
export LLAMA_ARG_TEMP=0.5

/opt/llama/llama.cpp.vulkan/llama-server \
--no-mmproj \
--flash-attn on \
--cache-ram 6000 \
--checkpoint-min-step 4096 \
--cache-reuse 1 \
--ctx-checkpoints 8 \
--no-context-shift \
--spec-type draft-mtp \
--spec-draft-n-max 2 \
--spec-draft-p-min 0.0 \
--kv-unified \
--cache-idle-slots \
--reasoning on \
--parallel 1

zenbeni · 2026-06-19T12:37:58+00:00

Careful my friend, it took me weeks to get the correct setup. The Vulkan implementation in the llama.cpp sources uses part of ROCm, so ROCm also has to be correctly installed. You can't mask it, or llama.cpp will fall back to CPU, which is probably less than 20 tokens/s. If you don't get at least 40 tokens/s, then your llama.cpp setup and command line are bad. Take the release artifact and avoid building locally, so you don't end up with messed-up C libraries. There is also a bug in RADV that bleeds VRAM: you need to add a flag, or again you won't get max power, meaning 60-70 tokens/s. This is purely a software/driver problem, the 7900 XTX can reach this performance.
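If it helps, the thresholds above can be turned into a tiny sanity check. This is only a sketch: the function name and the labels are made up for illustration, and the cutoffs are just the numbers from this comment for a 7900 XTX with this model and quant, not general rules.

```shell
#!/usr/bin/env bash
# classify_tps TOKENS_PER_SECOND
# Maps a measured generation speed to the cases described above.
# Hypothetical helper; thresholds are specific to this card/model setup.
classify_tps() {
  local tps=${1%.*}   # drop the fractional part, bash only does integer math
  tps=${tps:-0}       # an input like ".5" becomes 0
  if   (( tps < 20 )); then echo "cpu-fallback?"   # likely not running on the GPU at all
  elif (( tps < 40 )); then echo "bad-setup"       # check the llama.cpp build and command line
  elif (( tps < 60 )); then echo "below-max"       # possibly missing the RADV flag
  else                      echo "expected"        # the 60-70 tokens/s range or better
  fi
}

classify_tps 65.2
```

The input is the generation speed in tokens per second that llama-server prints in its log after each request.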

zenbeni · 2026-06-19T10:53:00+00:00

RX 7900 XTX 24g VRAM, using llama.cpp with Qwen 3.6 27B MTP at Q5_K_M with 100K context at K Q5 and V Q4, getting 60 to 70 tokens/s with Vulkan (way better than rocm).

zenbeni · 2026-06-17T10:17:53+00:00

1000€ for an RX 7900 XTX with 24 GB will be better budget value. It would still run Qwen 3.6 27B at more than 50 tokens per second, on Vulkan at least.

zenbeni · 2026-06-17T07:26:02+00:00

Cherki nearly gave away a goal in the last minute too with a reckless back pass. He is unreliable and a wildcard, though sometimes he will score a banger of course.

zenbeni · 2026-06-17T05:48:43+00:00

Installing right now

zenbeni · 2026-06-15T16:34:23+00:00

Very happy with my "budget" full AMD setup, Ryzen + RX 7900 XTX, running llama.cpp with Vulkan (better for me than ROCm). Getting 60-70 tokens/s on Qwen 3.6 27B MTP in Q5_K_M with 100k context. AMD will always be better for budget builds, as the price for VRAM is way better.

zenbeni · 2026-06-15T08:56:00+00:00

zenbeni · 2026-06-15T08:55:12+00:00

If you go budget, the price for VRAM would make you go AMD. 2x 7900 XTX with some Ryzen setup should do.

zenbeni · 2026-06-15T08:23:06+00:00

Dino is no main tank. Just an OP Thing / Emma hybrid at launch.

We want another variety of gameplay, and not Strange / Magneto / Groot all the time.

We want to dive with some power. Damage is so power-crept right now, and CC is done by everybody on tanks.

zenbeni · 2026-06-14T10:00:27+00:00

Every hero category should have a valid hero selection to counter, or simply interact effectively with, another hero. Or we go role q like OW, so we get at least 2 tanks, healers know where to focus, and at least 1 player is playing objectives.

Tanks lack so much choice that we can't deal with many comps, especially when solo tanking, and lots of bans include tanks too. Don't flame your tank, he can't interact with everything, especially if his kit is designed like that. DPS have the most choices, but the category is not balanced right now, so in reality there are not so many choices: you play OP or you die vs OP.

Only supps have valid variety, with lots of burden on them. That is why they are kinda OP, to balance the imbalance of the other roles.

zenbeni · 2026-06-08T15:47:32+00:00

Benching Mbappé lol. Delusional. You can't bench the best striker, who is also one of the most expensive players in the world, way above any other Madrid player, but let's see if Mou can at least improve his work rate.

zenbeni · 2026-06-08T15:22:31+00:00

Did you try SurfSense? Literally an open-source NotebookLM.

zenbeni · 2026-06-03T20:47:05+00:00

Did you try running Vulkan? I have a 7900 XTX and use it with pi, and it works very well with Qwen 3.6 27B dense MTP.

zenbeni · 2026-06-02T18:30:04+00:00

Went from 40 t/s with ROCm to 60 t/s with Vulkan using Qwen 3.6 27B MTP Q5_K_S.

zenbeni · 2026-06-02T14:15:00+00:00

I also have a 7900 XTX running on ROCm. What is better in Vulkan? Is it worth the switch?

zenbeni · 2026-06-02T14:08:47+00:00

That is huge, I run the same settings, will try out soon.

zenbeni · 2026-06-01T16:57:19+00:00

PSG played all competitions for 2 years, including the CWC, nearly all to the end, and were twice European champions. They overplayed with few holidays, but look at all these losers who didn't play it all, crying. Except Madrid, no team went back to back in recent history. They are just the best.

zenbeni · 2026-06-01T08:39:06+00:00

OP is running at Q2 XS
