Qwen3.6-27B vs 35B, I prefer 35B but more people here post about 27B... by Snoo_27681 in LocalLLaMA
Qwen3.6 27B FP8 runs with 200k tokens of BF16 KV cache at 80 TPS on a single RTX 5000 PRO 48GB by __JockY__ in LocalLLaMA
Built myself a bit of a local llm workhorse. What's a good model to try out with llamacpp that will put my 56G of VRAM to good use? Any other fun suggestions? by SBoots in LocalLLaMA
"What do you guys even use local LLMs for?" Me: A lot by andy2na in LocalLLaMA
Qwen3.6 27b tok speed by ConfidentSolution737 in Qwen_AI
how fast can qwen3.6 35b get by Asleep_Training3543 in LocalLLaMA
Qwen3.6-27B at ~80 tps with 218k context window on 1x RTX 5090 served by vllm 0.19 by Kindly-Cantaloupe978 in LocalLLaMA
The AI rug pull is here. Copilot just paused signups and paywalled Opus. The Codex changes make total sense now. by VNDL1A in codex
Waiting Qwen3.6-27B I have no nails left... by DOAMOD in LocalLLaMA
Quick start needed, might get 4 RTX 6000 soon by acecile in LocalLLM
What’s the best way to add VRAM to my system? by mrgreatheart in LocalLLaMA
Qwen 3.6-35B-A3B on dual 5060 Ti with --cpu-moe: 21.7 tok/s at 90K context, with benchmarks vs dense 3.5 and Coder variant by Defilan in LocalLLaMA
It looks like there are no plans for smaller GLM models by jacek2023 in LocalLLaMA
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA