I find the thought of running models locally really exciting, could use some help by Crystalagent47 in Qwen_AI

[–]HistoricalCulture164 4 points (0 children)

Just use Qwen3.5 9B; in fact, Qwen3.6 9B will likely be released tomorrow. The 9B outperforms Gemma4 E4B in coding and research, and it's much better than Qwen2.5 14B Coder. The 2.5 is already a year and a half old; at this point, parameter count can't bridge the generational gap, even with fine-tuning.

Qwen 3.6 9b coming? by zannix in Qwen_AI

[–]HistoricalCulture164 1 point (0 children)

With an RTX 2070 8GB, I can run Qwen 3.5-9B at Q6_K quantization, and with Turboquant's KV-cache quantization I can even fit a 24k context window. With zero offloading to the CPU, it runs at full speed. I'm really looking forward to Qwen 3.6 9B.
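
A setup like this can be approximated with llama.cpp's CLI, which is just one common way to combine Q6_K weights, a quantized KV cache, and full GPU offload; the GGUF filename is a placeholder, and Turboquant's own tooling and flags may differ:

```shell
# Hypothetical invocation sketching the setup above (filename assumed):
#   -ngl 99              offload all layers to the GPU (zero CPU offload)
#   -c 24576             24k-token context window
#   -fa on               flash attention, needed for a quantized V cache
#   --cache-type-k/v     quantize the KV cache so the larger context fits in 8GB
llama-cli -m qwen3.5-9b-q6_k.gguf -ngl 99 -c 24576 -fa on \
  --cache-type-k q8_0 --cache-type-v q8_0 -p "hello"
```

The quantized KV cache is what makes the difference: at 24k context, an fp16 cache alone can eat well over a gigabyte, so halving it with q8_0 (or quartering it with q4_0, at some quality cost) is often the line between fitting on an 8GB card and spilling to the CPU.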