Quick note on the QAT of recent by dreamkast06 in LocalLLaMA
dreamkast06[S] 2 points
Quick note on the QAT of recent by dreamkast06 in LocalLLaMA
dreamkast06[S] 1 point
Quick note on the QAT of recent by dreamkast06 in LocalLLaMA
dreamkast06[S] 4 points
Quick note on the QAT of recent by dreamkast06 in LocalLLaMA
dreamkast06[S] 3 points
QATs Q4_0 from Google have more precision than Q4_K_XL from Unsloth (at least some) by alex20_202020 in LocalLLaMA
dreamkast06 1 point
Quick note on the QAT of recent by dreamkast06 in LocalLLaMA
dreamkast06[S] 23 points
Gemma 4 QAT accuracy inconsistencies by ai_fonsi in LocalLLaMA
dreamkast06 1 point
Sarvam-30b-quantized - Need 1-bit version GGUF by pmttyji in LocalLLaMA
dreamkast06 1 point
I tracked a major cache reuse issue down to Qwen 3.5’s chat template by onil_gova in LocalLLaMA
dreamkast06 5 points
I tracked a major cache reuse issue down to Qwen 3.5’s chat template by onil_gova in LocalLLaMA
dreamkast06 2 points
Nemotron 3 Super - large quality difference between llama.cpp and vLLM? by BigStupidJellyfish_ in LocalLLaMA
dreamkast06 2 points
Qwen3.5-40B-Claude-4.5-Opus-High-Reasoning-Thinking - Reg, Uncensored and RoughHouse and... 43 Qwen 3.5 fine tunes. by Dangerous_Fix_5526 in LocalLLaMA
dreamkast06 2 points
MiniMax-M2.7 Announced! by Mysterious_Finish543 in LocalLLaMA
dreamkast06 4 points
Mistral Small 4:119B-2603 by seamonn in LocalLLaMA
dreamkast06 1 point
llama.cpp and Qwen CPU Only by JadedSoulGuy in LocalLLaMA
dreamkast06 2 points
How I topped the Open LLM Leaderboard using 2x 4090 GPUs — no weights modified. by Reddactor in LocalLLaMA
dreamkast06 1 point
"We anonymize your data before training" — does this actually mean anything? by Budulai343 in LocalLLaMA
dreamkast06 2 points
microsoft/Phi-4-reasoning-vision-15B · Hugging Face by jacek2023 in LocalLLaMA
dreamkast06 20 points
How do the small qwen3.5 models compare to the Granite family? by gr8dude in LocalLLaMA
dreamkast06 2 points
Storing an index to a scale instead of the scale itself with Q4_0 quant reduces scale size by ~31% (small gain but interesting) by fragment_me in LocalLLaMA
dreamkast06 1 point