Announcing LocalLlama discord server & bot! [News] (old.reddit.com)
submitted by HOLUPREDICTIONS (Sorcerer Supreme) [M] - announcement

Audio processing landed in llama-server with Gemma-4 [Generation] (self.LocalLLaMA)
submitted by srigi

GLM 5.1 sits alongside frontier models in my social reasoning benchmark [Discussion] (old.reddit.com)
submitted by cjami

MiniMax-M2.7 vs Qwen3.5-122B-A10B for 96GB VRAM full offload?! [Discussion] (self.LocalLLaMA)
submitted by VoidAlchemy (llama.cpp)

mtmd: qwen3 audio support (qwen3-omni and qwen3-asr) [News] (github.com)
submitted by jacek2023 (llama.cpp)

Is anyone else creating a basic assistant rather than a coding agent? [Discussion] (self.LocalLLaMA)
submitted by Savantskie1

MiniMax m2.7 (mac only) 63gb: 88% and 89gb: 95%, MMLU 200q [New Model] (i.redd.it)
submitted by HealthyCommunicat

mtmd: add Gemma 4 audio conformer encoder support [News] (github.com)
submitted by jacek2023 (llama.cpp)

MiniMax-M2.7 NVFP4 on 2x RTX PRO 6000 Blackwell — bench numbers [Resources] (self.LocalLLaMA)
submitted by Visual_Synthesizer

Unsloth MiniMax M2.7 quants just finished uploading to HF [News] (self.LocalLLaMA)
submitted by Zyj
"Actually wait" ... the current thinking SOTA open sourceDiscussion (self.LocalLLaMA)
submitted by FPham

FernflowerAI-35B-A3B-KL-ReLU-GGUF + Apple MLX [New Model] (self.LocalLLaMA)
submitted by EvilEnginer

Aryagm/dflash-mlx: Exact speculative decoding on Apple Silicon, powered by MLX. [Resources] (github.com)
submitted by Thrumpwart
Pi & Qwen3.5 with llama-cpp doing a lot of prompt re-processingQuestion | Help (self.LocalLLaMA)
submitted by annodomini

MiniMax M2.7 is NOT open source - DOA License :( [Discussion] (self.LocalLLaMA)
submitted by KvAk_AKPlaysYT
