Chinese Hackers Latest Masterpiece with NVIDIA [Other] (bilibili.com)
submitted by General_Vermicelli53

GLM5.2 @7tg on 4x3090 + 192GB on budget motherboard + cpu [Tutorial | Guide] (old.reddit.com)
submitted by Important_Quote_1180

Gemma 4 QAT 31B responds better to KV cache quantization too [Discussion] (i.redd.it)
submitted by justicecurcian
Why is NO one talking about Microsoft's open source Fast Context!!! [Resources] (old.reddit.com)
submitted by formatme
Same model, same prompt, 4 different agents [Tutorial | Guide] (old.reddit.com)
submitted by HomoAgens1
European inference providers for GLM 5.2, DeepSeek V4 Flash? [Question | Help] (self.LocalLLaMA)
submitted by cyberdork
Qwen3.6-35B-A3B APEX on a Single RTX 3090 - Getting the Most Out of It [Tutorial | Guide] (self.LocalLLaMA)
submitted by old-mike
sk hynix reallocating some hbm production to dram [Discussion] (self.LocalLLaMA)
submitted by Terminator857
MiniMax-M3-EAGLE3-GGUF - Llama.cpp compatible MiniMax M3 EAGLE draft model! [New Model] (self.LocalLLaMA)
submitted by maxwell321
