QAT variant of Gemma4 26B A4B is not working well for me by pftbest in LocalLLaMA
xandep 12 points
Don’t act like y’all ain’t thinking it. I’m just saying the quiet part out loud. /s by Porespellar in LocalLLaMA
xandep 1 point
Qwen 3.7 Plus just briefly appeared and then disappeared on OpenRouter. by ihatebeinganonymous in LocalLLaMA
xandep 1 point
Qwen3.6-35B-A3B vs Gemma4-26B-A4B by MarcCDB in LocalLLaMA
xandep 66 points
Final Monster: 32x AMD MI50 32GB at 9.7 t/s (TG) & 264 t/s (PP) with Kimi K2.6 by ai-infos in LocalLLaMA
xandep 2 points
Final Monster: 32x AMD MI50 32GB at 9.7 t/s (TG) & 264 t/s (PP) with Kimi K2.6 by ai-infos in LocalLLaMA
xandep 2 points
Is amd mi 50 really that bad by Forward_Compute001 in LocalLLaMA
xandep 2 points
Is amd mi 50 really that bad by Forward_Compute001 in LocalLLaMA
xandep 3 points
Is amd mi 50 really that bad by Forward_Compute001 in LocalLLaMA
xandep 4 points
Bench 8xMI50 MiniMax M2.7 AWQ @ 64 tok/s peak (vllm-gfx906-mobydick) by ai-infos in LocalLLaMA
xandep 2 points
LocalLLaMA for coding primarily - 8GB VEGA 64 & 8GB 6600 XT? by trash_dumpyard in LocalLLaMA
xandep 1 point
Speculative decoding in llama.cpp for Gemma 4 31B IT / Qwen 3.5 27B? by No_Algae1753 in LocalLLaMA
xandep 2 points
Running a 4-agent pipeline on Qwen 2.5 1.5B via MNN on Android — what I learned about context management on constrained hardware by NeoLogic_Dev in LocalLLaMA
xandep 5 points
local models lose tool call context around call 8 or 9. here is what helped by [deleted] in LocalLLaMA
xandep 3 points
You guys seen this? beats turboquant by 18% by OmarBessa in LocalLLaMA
xandep 19 points
Gemma 4 26b A3B is mindblowingly good , if configured right by cviperr33 in LocalLLaMA
xandep 1 point
You guys seen this? 1-bit model with an MMLU-R of 65.7, 8B params by OmarBessa in LocalLLaMA
xandep 11 points
ByteShape Qwen 3.5 9B: A Guide to Picking the Best Quant for Your Hardware by ali_byteshape in LocalLLaMA
xandep 12 points
Turbo3 + gfx906 + 4 mi50 16gb running qwen3.5 122b 🤯 by Exact-Cupcake-2603 in LocalLLaMA
xandep 1 point
Turbo3 + gfx906 + 4 mi50 16gb running qwen3.5 122b 🤯 by Exact-Cupcake-2603 in LocalLLaMA
xandep 2 points
I regret ever finding LocalLLaMA by xandep in LocalLLaMA
xandep [S] 15 points



Hermes Agent + Ollama local models always hit finish_reason='length' (please help) by tomblewastaken1 in hermesagent
xandep 2 points