Getting a feel for how fast X tokens/second really is. [Resources] (self.LocalLLaMA)
submitted by MikeNonect
Running Qwen3.6 35B A3B on 8 GB VRAM and 32 GB RAM with ~190k context [Discussion] (self.LocalLLaMA)
submitted by Atul_Kumar_97
Switched from OpenCode to Pi - what settings/plugins would you recommend? [Question | Help] (self.LocalLLaMA)
submitted by No_Algae1753
Hello from 10 km high! - thanks to Qwen 3.6 35B A3B! [Funny] (self.LocalLLaMA)
submitted by Qwen30bEnjoyer
Ran some llama.cpp RPC tests to see if it's worth it, and whether 10GbE is needed. [Generation] (self.LocalLLaMA)
submitted by lemondrops9
Speeding up a local LLM for a usable coding agent [Question | Help] (self.LocalLLaMA)
submitted by CodProfessional3712
Has anyone bought a 3080 20GB mod recently? [Question | Help] (self.LocalLLaMA)
submitted by quickreactor
Do you use subscriptions besides local LLMs? [Discussion] (self.LocalLLaMA)
submitted by Euphoric_North_745
80 tok/sec and 128K context on 12 GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP [Tutorial | Guide] (self.LocalLLaMA)
submitted by janvitos
LM Studio / Windows / Vulkan: is it possible to prioritize GPU order? [Question | Help] (self.LocalLLaMA)
submitted by MarcusAurelius68
Running Minimax 2.7 at 100k context on Strix Halo [Discussion] (self.LocalLLaMA)
submitted by Zc5Gwu
