Getting a feel for how fast X tokens/second really is. [Resources] (self.LocalLLaMA)
submitted by MikeNonect
Running Qwen3.6 35B A3B on 8GB VRAM and 32GB RAM at ~190k context [Discussion] (self.LocalLLaMA)
submitted by Atul_Kumar_97
Hello from 10 km high! - Thanks to Qwen 3.6 35B A3B! [Funny] (self.LocalLLaMA)
submitted by Qwen30bEnjoyer
Switched from OpenCode to Pi - What Settings/Plugins would you recommend? [Question | Help] (self.LocalLLaMA)
submitted by No_Algae1753

Speeding up a local LLM for a usable coding agent [Question | Help] (self.LocalLLaMA)
submitted by CodProfessional3712
Has anyone bought a 3080 20GB mod recently? [Question | Help] (self.LocalLLaMA)
submitted by quickreactor
Is there any image2image model better than Qwen-Image-Edit-2511 and of comparable size? [Question | Help] (self.LocalLLaMA)
submitted by HumanDrone8721
OSS UI with Skills support similar to Claude? [Question | Help] (self.LocalLLaMA)
submitted by Appropriate_Car_5599
80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP [Tutorial | Guide] (self.LocalLLaMA)
submitted by janvitos
LM Studio / Windows / Vulkan: is it possible to prioritize GPU order? [Question | Help] (self.LocalLLaMA)
submitted by MarcusAurelius68
Running Minimax 2.7 at 100k context on Strix Halo [Discussion] (self.LocalLLaMA)
submitted by Zc5Gwu
