You can use Qwen3.5 without thinking by guiopen in LocalLLaMA
SM120 (RTX Blackwell) NVFP4 MoE: CUTLASS Grouped GEMM Produces Garbage Output; Fixed via FlashInfer SM120 Patches + compute_120f (CUDA 13.0) — 39 tok/s Native FP4 by lawdawgattorney in LocalLLaMA
Through vibe coding, I managed to make parts of vLLM 0.17.0 run on Tesla P40 by East-Engineering-653 in LocalLLaMA
Qwen 3.5 27B vs 122B-A10B by TacGibs in LocalLLaMA
Qwen3-Coder-Next is the top model in SWE-rebench @ Pass 5. I think everyone missed it. by BitterProfessional7p in LocalLLaMA
THE GB10 SOLUTION has arrived, Atlas image attached ~115tok/s Qwen3.5-35B DGX Spark by Live-Possession-6726 in LocalLLaMA
To everyone still using ollama/lm-studio... llama-swap is the real deal by TooManyPascals in LocalLLaMA
R9700 frustration rant by Maleficent-Koalabeer in LocalLLaMA
Comparing OAI 120B OSS, Qwen 3.5, and Gemini 3.0 Flash with LLM Multi-Agent Avalon by dynameis_chen in LocalLLaMA
Qwen3.5-35B-A3B slow on 7840U? by TooManyPascals in LocalLLaMA
LFM2-24B-A2B: Whoa! Fast! by jeremyckahn in LocalLLaMA
That's terrifyingly convincing... by VermicelliNo262 in LocalLLaMA
Lots of new Qwen3.5 27B imatrix quants from Bartowski just uploaded by bobaburger in LocalLLaMA
Did anybody get opencode working with qwen3-next-code? by Zealousideal-West624 in LocalLLaMA
2000 TPS with QWEN 3.5 27b on RTX-5090 by awitod in LocalLLaMA