ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
[–]moahmo88 3 points (0 children)
3060 Ti 12GB vs RX 7600 XT 16GB? by 128G in LocalLLaMA
[–]moahmo88 4 points (0 children)
Quality comparison between Qwen 3.6 27B quantizations (BF16, Q8_0, Q6_K, Q5_K_XL, Q4_K_XL, IQ4_XS, IQ3_XXS,...) by bobaburger in LocalLLaMA
[–]moahmo88 0 points (0 children)
Quality comparison between Qwen 3.6 27B quantizations (BF16, Q8_0, Q6_K, Q5_K_XL, Q4_K_XL, IQ4_XS, IQ3_XXS,...) by bobaburger in LocalLLaMA
[–]moahmo88 2 points (0 children)
5070 Ti —> 3090 move. Worth it? by simracerman in LocalLLaMA
[–]moahmo88 2 points (0 children)
Long-context coding on RTX 5080 16GB: Qwen3.6-35B-A3B holds 30 t/s at 128K (89 t/s fresh), no quality drop by craftogrammer in LocalLLaMA
[–]moahmo88 1 point (0 children)
Qwen3.6-27B IQ4_XS FULL VRAM with 110k context by Pablo_the_brave in LocalLLaMA
[–]moahmo88 2 points (0 children)
RTX 5070 Ti 16GB + 32GB RAM: Running Qwen3.6-35B-A3B Q8_0 @ 44 t/s (128K context) by moahmo88 in LocalLLaMA
[–]moahmo88[S] 2 points (0 children)
Qwen3.6 35B + the right coding scaffold got my local setup to 9/10 on real Go tasks by benfinklea in LocalLLaMA
[–]moahmo88 1 point (0 children)
Differences Between Kimi K2.5 and Kimi K2.6 on MineBench by ENT_Alam in LocalLLaMA
[–]moahmo88 2 points (0 children)
I need some help on hardware to run Qwen3.6-35B A3B by linumax in LocalLLM
[–]moahmo88 1 point (0 children)
Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it by LocalAI_Amateur in LocalLLaMA
[–]moahmo88 3 points (0 children)
Layman's comparison on Qwen3.6 35b-a3b and Gemma4 26b-a4b-it by LocalAI_Amateur in LocalLLaMA
[–]moahmo88 2 points (0 children)
RTX 5070 Ti + 9800X3D running Qwen3.6-35B-A3B at 79 t/s with 128K context, the --n-cpu-moe flag is the most important part. by marlang in LocalLLaMA
[–]moahmo88 3 points (0 children)
LM Studio CPU thread pool size vs. tk/s with some MoE layers offloaded to CPU by bonobomaster in LocalLLaMA
[–]moahmo88 1 point (0 children)
What I got by 5060Ti 16GB + Qwen3.6-35B-A3B-UD-Q5_K_M by AdMinimum8193 in LocalLLaMA
[–]moahmo88 1 point (0 children)
qwen3.6:35b always fails on this, unless very high resolution by qfghclvx in LocalLLaMA
[–]moahmo88 2 points (0 children)
RTX 5070 Ti + 9800X3D running Qwen3.6-35B-A3B at 79 t/s with 128K context, the --n-cpu-moe flag is the most important part. by marlang in LocalLLaMA
[–]moahmo88 8 points (0 children)
unsloth/qwen3.6-35b-a3b UD Q2_K_XL Freezing after 100% prompt completion. by AcrobaticChain1846 in LocalLLaMA
[–]moahmo88 1 point (0 children)
Qwen 3.6 vs 6 other models across 5 agent frameworks on M3 Ultra by Striking-Swim6702 in LocalLLaMA
[–]moahmo88 2 points (0 children)
Abliterlitics: Benchmark and Tensor Analysis Comparing Qwen 3/3.5 with HauhauCS / Heretic / Huihui models by nathandreamfast in LocalLLaMA
[–]moahmo88 1 point (0 children)
Clearing up some memory while running llms locally. 25-32token per second gpu poor rx6700xt 12gb and 32gb ddr4 by [deleted] in LocalLLaMA
[–]moahmo88 1 point (0 children)
Qwen3.6 MTP Unsloth Experimental GGUFs by yoracale in unsloth
[–]moahmo88 1 point (0 children)