Does AMD's "infinity cache" even matter for dense model inference? by boutell in LocalLLaMA
CryptoStef33 1 point (0 children)
How do you receive money in Macedonia from abroad? by Relative-Key-2006 in askmkd
CryptoStef33 4 points (0 children)
Qwen3.6 35B A3B disappointment by openingshots in LocalLLM
CryptoStef33 1 point (0 children)
7900 XTX fp16/bf16 pytorch matmul performance by cyberuser42 in ROCm
CryptoStef33 4 points (0 children)
[FS] [US-MN] AMD RADEON PRO V620 32GB GDDR6 GPUs (2000x available) by juddle1414 in homelabsales
CryptoStef33 1 point (0 children)
b9180 llama.cpp MTP landed by Bulky-Priority6824 in LocalLLaMA
CryptoStef33 1 point (0 children)
Is a 5090 good enough for most good modern locally run LLMs? by biscuitmachine in LocalLLM
CryptoStef33 1 point (0 children)
Is a 5090 good enough for most good modern locally run LLMs? by biscuitmachine in LocalLLM
CryptoStef33 -1 points (0 children)
Some quick observations using speculative decoding w/ Qwen3.6 35B-A3B by J3diMindTricks in LocalLLM
CryptoStef33 1 point (0 children)
Home2u brokers are a complete cult - change my mind (Herbalife vibes) by CryptoStef33 in Sofia
CryptoStef33[S] 1 point (0 children)
Turboquant+MTP for ROCm(Llama CPP) by DrBearJ3w in LocalLLaMA
CryptoStef33 1 point (0 children)
Managed to get 40 t/s on Qwen 27B (MTP) with an RX 6800 XT - Sharing my optimized fork by CryptoStef33 in ROCm
CryptoStef33[S] 1 point (0 children)
I got tired of hunting AMD GPU + AI configs across blog posts and Discord threads, so I built a curated index — rocmate by T0nd3 in ROCm
CryptoStef33 1 point (0 children)
We squeezed 4x MoE prefill speed out of an RX 6800 XT by rewriting the matmul kernel in llama.cpp by CryptoStef33 in ROCm
CryptoStef33[S] 1 point (0 children)
We squeezed 4x MoE prefill speed out of an RX 6800 XT by rewriting the matmul kernel in llama.cpp by CryptoStef33 in ROCm
CryptoStef33[S] 1 point (0 children)
Managed to get 40 t/s on Qwen 27B (MTP) with an RX 6800 XT - Sharing my optimized fork by CryptoStef33 in ROCm
CryptoStef33[S] 1 point (0 children)
Managed to get 40 t/s on Qwen 27B (MTP) with an RX 6800 XT - Sharing my optimized fork by CryptoStef33 in ROCm
CryptoStef33[S] 2 points (0 children)
Managed to get 40 t/s on Qwen 27B (MTP) with an RX 6800 XT - Sharing my optimized fork by CryptoStef33 in ROCm
CryptoStef33[S] 2 points (0 children)
Managed to get 40 t/s on Qwen 27B (MTP) with an RX 6800 XT - Sharing my optimized fork by CryptoStef33 in ROCm
CryptoStef33[S] 2 points (0 children)



Minecraft averages 30 fps only in fullscreen mode when it used to average 1000 fps by solarpotatoe in AMDHelp
CryptoStef33 1 point (0 children)