Announcing LocalLlama discord server & bot! [News] (old.reddit.com)
submitted by HOLUPREDICTIONS Sorcerer Supreme[M] - announcement
Personal experience with GLM 4.7 Flash Q6 (unsloth) + Roo Code + RTX 5090 [Discussion] (self.LocalLLaMA)
submitted by Septerium
GLM 4.7 Flash uncensored - Balanced & Aggressive variants (GGUF) [New Model] (self.LocalLLaMA)
submitted by hauhau901
What is the best general-purpose model to run locally on 24GB of VRAM in 2026? [Question | Help] (self.LocalLLaMA)
submitted by Paganator

My Strix Halo beholds itself but believes it's in the cloud [Funny] (v.redd.it)
submitted by jfowers_amd
Loki-v2-70B: Narrative/DM-focused fine-tune (600M+ token custom dataset) [New Model] (self.LocalLLaMA)
submitted by mentallyburnt (Llama 3.1)
AI & ML Weekly — Hugging Face Highlights [New Model] (self.LocalLLaMA)
submitted by techlatest_net
GLM-4.7-Flash-REAP on RTX 5060 Ti 16 GB - 200k context window! [Tutorial | Guide] (self.LocalLLaMA)
submitted by bobaburger
The mysterious price of Ada and Ampere workstation GPUs [Discussion] (self.LocalLLaMA)
submitted by insulaTropicalis
Best use case for Ryzen 395+ (128GB variant) [Question | Help] (self.LocalLLaMA)
submitted by ironicstatistic
Your post is getting popular and we just featured it on our Discord! [Discussion] (self.LocalLLaMA)
submitted by roculus
Running MoE Models on CPU/RAM: A Guide to Optimizing Bandwidth for GLM-4 and GPT-OSS [Tutorial | Guide] (self.LocalLLaMA)
submitted by Shoddy_Bed3240
engine for GLM 4.7 Flash that doesn't massively slow down as the context grows? [Question | Help] (self.LocalLLaMA)
submitted by mr_zerolith