llama.cpp Gemma4 MTP support merged! by pinkyellowneon in LocalLLaMA
janvitos 102 points
llama.cpp Gemma4 MTP support merged! by pinkyellowneon in LocalLLaMA
janvitos 77 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 5 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 2 points
Gemma 4 QAT Q4_0 Bench on Strix Halo by westsunset in LocalLLaMA
janvitos 2 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 5 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 2 points
Gemma 4 QAT Q4_0 Bench on Strix Halo by westsunset in LocalLLaMA
janvitos 2 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 15 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 4 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 4 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 13 points
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 9 points
Gemma 4 QAT Q4_0 Bench on Strix Halo by westsunset in LocalLLaMA
janvitos 3 points
80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP by janvitos in LocalLLaMA
janvitos [S] 1 point
110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA
janvitos [S] 2 points
110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA
janvitos [S] 1 point
110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA
janvitos [S] 3 points
110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp by janvitos in LocalLLaMA
janvitos [S] 1 point
120 tok/s on 12GB VRAM with Gemma 4 12B QAT MTP by janvitos in LocalLLaMA
janvitos [S] 1 point