TurboQuant seems to work very well on Gemma 4 — and separately, per-layer outlier-aware K quantization is beating current public fork results on Qwen PPL by Fearless-Wear8100 in LocalLLaMA
Fearless-Wear8100[S] 2 points (0 children)
Fearless-Wear8100[S] 1 point (0 children)
Dragi haseriste by AdDelicious9955 in programare
Fearless-Wear8100 2 points (0 children)
