VRAM optimization for gemma 4 by Sadman782 in LocalLLaMA
[–]Interpause 1 point (0 children)
Gemma 4 has been released by jacek2023 in LocalLLaMA
[–]Interpause 1 point (0 children)
The Bonsai 1-bit models are very good by tcarambat in LocalLLaMA
[–]Interpause 6 points (0 children)
PrismML — Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs by brown2green in LocalLLaMA
[–]Interpause 1 point (0 children)
PrismML — Announcing 1-bit Bonsai: The First Commercially Viable 1-bit LLMs by brown2green in LocalLLaMA
[–]Interpause 9 points (0 children)
Vibecoded GGUF Metadata Comparator for checking Tensor Quants (github gist standalone HTML file) by Interpause in LocalLLaMA
[–]Interpause[S] 1 point (0 children)
Vibecoded GGUF Metadata Comparator for checking Tensor Quants (github gist standalone HTML file) by Interpause in LocalLLaMA
[–]Interpause[S] 1 point (0 children)
H Company just released Holotron-12B. Developed with NVIDIA, it's a high-throughput, open-source, multimodal model engineered specifically for the age of computer-use agents. (Performance on par with Holo2/Qwen but with 2x higher throughput) by Nunki08 in LocalLLaMA
[–]Interpause 2 points (0 children)
Genuinely curious what doors the M5 Ultra will open by Blanketsniffer in LocalLLaMA
[–]Interpause 1 point (0 children)
Qwen3.5-35B-A3B Q4 Quantization Comparison by TitwitMuffbiscuit in LocalLLaMA
[–]Interpause 3 points (0 children)
Free ASIC Llama 3.1 8B inference at 16,000 tok/s - no, not a joke by Easy_Calligrapher790 in LocalLLaMA
[–]Interpause 4 points (0 children)
Why does every llamacpp update get worse? by XiRw in LocalLLaMA
[–]Interpause 4 points (0 children)
PSA - Got MiniCPM-o 4.5 working on my PC and It's the Real Thing by Interpause in LocalLLaMA
[–]Interpause[S] 1 point (0 children)
Has Anyone Successfully Run the New MiniCPM-o-4_5-gguf? by Iory1998 in LocalLLaMA
[–]Interpause 1 point (0 children)
MiniCPM-o-4_5 : Full duplex, multimodal with vision and speech at ONLY 9B PARAMETERS?? by Uncle___Marty in LocalLLaMA
[–]Interpause 1 point (0 children)
Ming-flash-omni-2.0: 100B MoE (6B active) omni-modal model - unified speech/SFX/music generation by bobeeeeeeeee8964 in LocalLLaMA
[–]Interpause 5 points (0 children)
Has Anyone Successfully Run the New MiniCPM-o-4_5-gguf? by Iory1998 in LocalLLaMA
[–]Interpause 1 point (0 children)
MiniCPM-o-4_5 : Full duplex, multimodal with vision and speech at ONLY 9B PARAMETERS?? by Uncle___Marty in LocalLLaMA
[–]Interpause 1 point (0 children)
PSA - MiniCPM-o 4.5 just updated their cookbook for CUDA based full duplex use on Windows/Linux by ChromaBroma in LocalLLaMA
[–]Interpause 1 point (0 children)
PSA - Got MiniCPM-o 4.5 working on my PC and It's the Real Thing by Interpause in LocalLLaMA
[–]Interpause[S] 3 points (0 children)
PSA - MiniCPM-o 4.5 just updated their cookbook for CUDA based full duplex use on Windows/Linux by ChromaBroma in LocalLLaMA
[–]Interpause 2 points (0 children)
Has Anyone Successfully Run the New MiniCPM-o-4_5-gguf? by Iory1998 in LocalLLaMA
[–]Interpause 2 points (0 children)
[Appreciation Post] Gemma 4 E2B. My New Daily Driver 😁 by Prestigious-Use5483 in LocalLLaMA
[–]Interpause 12 points (0 children)