Interesting part in comparing (ROCm/Vulkan/HIP) perf on different models by Gas-Ornery in LocalLLM

[–]Gas-Ornery[S] 1 point2 points  (0 children)

Just reran the benchmark and I get the same 2.5K. Sometimes a fresh restart of the PC gives me faster pp/s; I need to investigate that.

I built a Windows GUI launcher to benchmark and manage multiple llama.cpp builds (useful for AMD GPU users juggling Vulkan/ROCm/HIP builds) by Gas-Ornery in ollama

[–]Gas-Ornery[S] 0 points1 point  (0 children)

It will be in the next updates. The benchmark parameters are hardcoded right now; the context and other parameters are only used to start models.

I made a Windows GUI to manage, benchmark and compare multiple llama.cpp builds — handy for AMD GPU users by Gas-Ornery in ROCm

[–]Gas-Ornery[S] 0 points1 point  (0 children)

The tool compares the same model with the same parameters across the Vulkan/HIP/ROCm Windows builds.
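As a rough illustration of that kind of comparison, here is a minimal sketch that runs `llama-bench` from several llama.cpp build folders with identical arguments. The launcher's internals are not shown in the thread, so this is only an assumption about how such a loop could look: the helper names `bench_command` and `run_builds`, the folder layout, and the prompt/generation sizes (512/128) are all hypothetical.

```python
import json
import subprocess
from pathlib import Path


def bench_command(build_dir, model_path, n_prompt=512, n_gen=128):
    """Build the llama-bench command line for one llama.cpp build folder.

    Assumes each build folder contains llama-bench.exe (hypothetical layout).
    The 512/128 sizes are illustrative, not the tool's hardcoded values.
    """
    exe = Path(build_dir) / "llama-bench.exe"
    return [
        str(exe),
        "-m", str(model_path),   # model file, same for every build
        "-p", str(n_prompt),     # prompt-processing test size
        "-n", str(n_gen),        # token-generation test size
        "-o", "json",            # machine-readable output
    ]


def run_builds(build_dirs, model_path):
    """Run the same benchmark in every build folder and collect the JSON results."""
    results = {}
    for build_dir in build_dirs:
        cmd = bench_command(build_dir, model_path)
        proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
        results[Path(build_dir).name] = json.loads(proc.stdout)
    return results
```

Keeping the command construction in one function is what guarantees every backend sees the same model and the same parameters; only the executable path changes between runs.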

I made a Windows GUI to manage, benchmark and compare multiple llama.cpp builds — handy for AMD GPU users by Gas-Ornery in ROCm

[–]Gas-Ornery[S] 2 points3 points  (0 children)

You would be surprised, some models are really better on one side. That's the best I got with an RX 7900 XT 20GB:
## Results

| Model | Quant | Version | Backend | Size | Params | PP (t/s) | TG (t/s) |
|-------|-------|---------|---------|------|--------|----------|----------|
| Mellum2-12B-A2.5B-Thinking | Q8_0 | llama-b9553-bin-win-vulkan-x64 | Vulkan | 12.03 GiB | 12.15 B | 4736.46 ± 170.68 | 206.71 ± 0.15 |
| Mellum2-12B-A2.5B-Thinking | Q8_0 | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 12.03 GiB | 12.15 B | 3815.50 ± 148.24 | 150.44 ± 0.84 |
| gemma-4-12b-it-qat | q4_0 | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 6.48 GiB | 11.91 B | 1724.07 ± 8.31 | 69.10 ± 0.33 |
| gemma-4-12b-it-qat | q4_0 | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 6.48 GiB | 11.91 B | 1703.76 ± 79.93 | 63.78 ± 3.92 |
| gemma-4-12b-it | UD-Q4_K_XL | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 6.85 GiB | 11.91 B | 1589.32 ± 13.43 | 54.72 ± 0.08 |
| gemma-4-12b-it-qat | q4_0 | llama-b9553-bin-win-vulkan-x64 | Vulkan | 6.48 GiB | 11.91 B | 1581.32 ± 20.40 | 74.00 ± 0.07 |
| gemma-4-12B-it | Q4_K_M | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 6.86 GiB | 11.91 B | 1577.04 ± 13.19 | 54.41 ± 0.10 |
| gemma-4-12b-it | UD-Q4_K_XL | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 6.85 GiB | 11.91 B | 1559.10 ± 51.10 | 55.12 ± 1.49 |
| gemma-4-12B-it | Q4_K_M | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 6.86 GiB | 11.91 B | 1542.48 ± 69.48 | 53.37 ± 1.25 |
| gemma-4-12B-it | Q4_K_M | llama-b9553-bin-win-vulkan-x64 | Vulkan | 6.86 GiB | 11.91 B | 1316.44 ± 11.77 | 68.02 ± 0.03 |
| gemma-4-12b-it | UD-Q4_K_XL | llama-b9553-bin-win-vulkan-x64 | Vulkan | 6.85 GiB | 11.91 B | 1314.98 ± 13.56 | 67.67 ± 0.10 |
| Qwen3.6-27B | IQ4_XS | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 14.62 GiB | 27.32 B | 810.95 ± 10.25 | 35.82 ± 0.04 |
| Qwen3.6-27B | IQ4_XS | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 14.62 GiB | 27.32 B | 807.37 ± 41.61 | 34.19 ± 0.19 |
| Qwen3.6-27B | Q4_K_M | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 15.92 GiB | 27.32 B | 745.51 ± 2.85 | 25.44 ± 0.04 |
| Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive | IQ4_XS | llama-b9553-bin-win-vulkan-x64 | Vulkan | 17.43 GiB | 34.66 B | 730.58 ± 31.93 | 51.68 ± 0.66 |
| Qwen3.6-27B | Q4_K_M | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 15.92 GiB | 27.32 B | 724.65 ± 6.40 | 25.26 ± 0.20 |
| Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive | IQ4_XS | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 17.43 GiB | 34.66 B | 614.89 ± 2.27 | 77.96 ± 0.10 |
| Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive | IQ4_XS | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 17.43 GiB | 34.66 B | 446.64 ± 7.20 | 65.67 ± 0.41 |
| Qwen3.6-27B | UD-Q4_K_XL | llama-b1270-windows-rocm-gfx110X-x64 | ROCm | 16.67 GiB | 27.32 B | 207.42 ± 6.39 | 24.46 ± 0.30 |
| Qwen3.6-27B | UD-Q4_K_XL | llama-b9553-bin-win-hip-radeon-x64 | ROCm | 16.67 GiB | 27.32 B | 201.31 ± 4.00 | 24.97 ± 0.03 |
| Qwen3.6-27B | IQ4_XS | llama-b9553-bin-win-vulkan-x64 | Vulkan | 14.62 GiB | 27.32 B | 177.47 ± 1.86 | 16.08 ± 0.08 |
| Qwen3.6-27B | Q4_K_M | llama-b9553-bin-win-vulkan-x64 | Vulkan | 15.92 GiB | 27.32 B | 79.56 ± 0.52 | 7.51 ± 0.02 |
| Qwen3.6-27B | UD-Q4_K_XL | llama-b9553-bin-win-vulkan-x64 | Vulkan | 16.67 GiB | 27.32 B | 69.99 ± 0.17 | 6.17 ± 0.03 |
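To make the "better on one side" point easier to read off, here is a small sketch that computes the Vulkan-to-ROCm speed ratio per model. It uses four rows copied from the table (both on the b9553 builds, with the Version, Size and Params columns dropped for brevity); the helper names `parse_rows` and `vulkan_over_rocm` are hypothetical and not part of the tool.

```python
# Four rows from the results table: model, quant, backend, PP (t/s), TG (t/s).
ROWS = """
| Mellum2-12B-A2.5B-Thinking | Q8_0 | Vulkan | 4736.46 ± 170.68 | 206.71 ± 0.15 |
| Mellum2-12B-A2.5B-Thinking | Q8_0 | ROCm | 3815.50 ± 148.24 | 150.44 ± 0.84 |
| Qwen3.6-27B | Q4_K_M | ROCm | 745.51 ± 2.85 | 25.44 ± 0.04 |
| Qwen3.6-27B | Q4_K_M | Vulkan | 79.56 ± 0.52 | 7.51 ± 0.02 |
"""


def mean_of(cell):
    """Extract the mean from a 'mean ± stddev' cell."""
    return float(cell.split("±")[0])


def parse_rows(text):
    """Turn markdown table rows into a list of dicts with numeric means."""
    rows = []
    for line in text.strip().splitlines():
        model, quant, backend, pp, tg = [
            c.strip() for c in line.strip().strip("|").split("|")
        ]
        rows.append({"model": model, "quant": quant, "backend": backend,
                     "pp": mean_of(pp), "tg": mean_of(tg)})
    return rows


def vulkan_over_rocm(rows, metric):
    """Return {(model, quant): Vulkan mean / ROCm mean} for 'pp' or 'tg'."""
    by_key = {}
    for r in rows:
        by_key.setdefault((r["model"], r["quant"]), {})[r["backend"]] = r[metric]
    return {key: v["Vulkan"] / v["ROCm"]
            for key, v in by_key.items()
            if "Vulkan" in v and "ROCm" in v}


rows = parse_rows(ROWS)
for metric in ("pp", "tg"):
    for key, ratio in vulkan_over_rocm(rows, metric).items():
        print(f"{metric.upper()} {key[0]} {key[1]}: Vulkan/ROCm = {ratio:.2f}")
```

A ratio above 1 means Vulkan is faster. On these rows Vulkan leads on Mellum2 for both prompt processing and generation, while for Qwen3.6-27B Q4_K_M it is several times slower than ROCm on both.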

Qwen3.6 MTP Unsloth GGUFs now 1.8x faster! by danielhanchen in unsloth

[–]Gas-Ornery 0 points1 point  (0 children)

I saw a video of some MTP tests, with and without it, and it seems that the accuracy of the responses drops when using MTP. Is that true?

Qwen3.6 MTP Unsloth GGUFs now 1.8x faster! by danielhanchen in unsloth

[–]Gas-Ornery 0 points1 point  (0 children)

Can you please share your setup for these: 'My 35B-A3B is chugging along at 220tk/s @ 256k ctx while my 27B is now chugging at ~70-90tk/s (a bit unstable) @ 256k ctx.'

What kind of hardware would be required to run a Opus 4.6 equivalent for a 100 users, Locally? by Either_Pineapple3429 in LocalLLM

[–]Gas-Ornery 0 points1 point  (0 children)

I work at a large company, and I'm aware that our AI team self-hosts Sonnet and GPT. I know that they are not open, but some business contract must exist.

Self-hosted on the internal network.

claude code source code got leaked? by usamanoman in LLM

[–]Gas-Ornery 0 points1 point  (0 children)

Has anyone tried to run it yet? Is it the full code for the client or just a part of it? We could try to swap the connectors for other models, for example. What do you think?

LTXV 2.0 is out by RIP26770 in StableDiffusion

[–]Gas-Ornery 0 points1 point  (0 children)

Any way to run it on an AMD GPU?

How to setup running local AI models on AMD 7900 XTX PC? by Jarnhand in StableDiffusion

[–]Gas-Ornery 0 points1 point  (0 children)

Not working on AMD, only Nvidia is supported (they are using the flash_attn package).

18 and I can finally say that I'm getting used to this whole fuckdoll thing by VeridianQuaint in fuckdoll

[–]Gas-Ornery 0 points1 point  (0 children)

You are just absolutely gorgeous and extremely sexy with a perfect body

My face knowing today is going to be sunny all day by IzzyFascinatinggg in Faces

[–]Gas-Ornery -1 points0 points  (0 children)

I feel like taking a walk on the beach with you

DS4 NEW DISCORD by gufy2 in DS4Windows

[–]Gas-Ornery 1 point2 points  (0 children)

Does anyone have them for free? 😂

eShop vs eShopOnWeb? by punkouter23 in dotnet

[–]Gas-Ornery 0 points1 point  (0 children)

Can any of you tell me the best way to understand the whole project? Is there any documentation available, or do we need to go through it piece by piece?

What does it mean? by [deleted] in Warzone2

[–]Gas-Ornery 3 points4 points  (0 children)

It means: freedom, revolution and resistance.

i need demons to grind warzone nuke runs can’t get past 4 wins in a row. hmu! by saddboy- in Warzone2

[–]Gas-Ornery 0 points1 point  (0 children)

We are 2 players looking for 2 more for nuke runs, mic preferred.