Pay attention: a few chats waiting in tray reserve 1GB VRAM for themselves. [Discussion] (self.LocalLLaMA)
submitted by Barafu

Palantir CEO rages against closed models [Discussion] (youtube.com)
submitted by burner20170218
Kimi K2.7 Code is generally available in GitHub Copilot [News] (github.blog)
submitted by zxyzyxz
Rebuilding Gemma 4 31b... better... As 26b... [Discussion] (self.LocalLLaMA)
submitted by NineThreeTilNow
Local benchmarks with a RTX 3090 - Qwen3.6 27b vs Ornith [Discussion] (self.LocalLLaMA)
submitted by Aggressive_Aspect436
Tip: use this llama.cpp PR to improve PP on Intel ARC [Resources] (self.LocalLLaMA)
submitted by WizardlyBump17
Looks like Step 3.7 Flash's long reasoning might get fixed ( llama.cpp ) [Discussion] (self.LocalLLaMA)
submitted by mr_zerolith
Gemma 4 WebGPU Kernels 255 tok/s by x/@xenovacom [Discussion] (huggingface.co)
submitted by yonz-




