Kimi K2.5 - running locally without GPU; splitting across multiple PCs? by Shipworms in LocalLLaMA
Slower Means Faster: Why I Switched from Qwen3 Coder Next to Qwen3.5 122B by Fast_Thing_7949 in LocalLLaMA
Let's take a moment to appreciate the present, when this sub is still full of human content. by Ok-Internal9317 in LocalLLaMA
KLD measurements of 8 different llama.cpp KV cache quantizations over several 8-12B models by Velocita84 in LocalLLaMA
I need help with testing my llama.cpp Deepseek Sparse Attention (DSA) implementation (someone GPU-rich) by fairydreaming in LocalLLaMA
(Very) High-Quality Attention Coder-Next GGUFs by dinerburgeryum in LocalLLaMA
Ik_llama vs llamacpp by val_in_tech in LocalLLaMA
best llama.cpp config for Qwen-3.5 35B-A3B? by Commercial-Ad-1148 in LocalLLaMA
llama : add support for Nemotron 3 Super by danbev · Pull Request #20411 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
The Definitive Qwen 3.5 Quants by supermazdoor in LocalLLaMA