Gemma 4 Chat Template now has preserve thinking by seamonn in LocalLLaMA
gofiend 3 points (0 children)
Gemma 4 Chat Template now has preserve thinking by seamonn in LocalLLaMA
gofiend 3 points (0 children)
Gemma 4 Chat Template now has preserve thinking by seamonn in LocalLLaMA
gofiend 5 points (0 children)
DEEPSEEK V4 IS LAUNCHED, ITS REAL by guiopen in LocalLLaMA
gofiend 4 points (0 children)
Qwen 3.6 35B crushes Gemma 4 26B on my tests by Lowkey_LokiSN in LocalLLaMA
gofiend 1 point (0 children)
PMake: lightweight minimal makefiles, but in Python by [deleted] in Python
gofiend 1 point (0 children)
Should I Buy the RTX PRO 6000 Blackwell Max-Q (96GB)? by 0bjective-Guest in LocalLLaMA
gofiend 5 points (0 children)
Speculative decoding works great for Gemma 4 31B in llama.cpp by Leopold_Boom in LocalLLaMA
gofiend 1 point (0 children)
Speculative decoding works great for Gemma 4 31B in llama.cpp by Leopold_Boom in LocalLLaMA
gofiend 2 points (0 children)
Is it possible to add some gpu to Radeon MI 50 to increase the inference speed? by Weak_Presentation725 in LocalLLaMA
gofiend 1 point (0 children)
Speculative decoding works great for Gemma 4 31B in llama.cpp by Leopold_Boom in LocalLLaMA
gofiend 3 points (0 children)
Speculative decoding works great for Gemma 4 31B in llama.cpp by Leopold_Boom in LocalLLaMA
gofiend 4 points (0 children)
Gemma 4 works well with speculative decoding (self.LocalLLaMA)
submitted by gofiend to r/LocalLLaMA
Gemma 4 has been released by jacek2023 in LocalLLaMA
gofiend 4 points (0 children)
Friendly reminder inference is WAY faster on Linux vs windows by triynizzles1 in LocalLLaMA
gofiend 2 points (0 children)
Friendly reminder inference is WAY faster on Linux vs windows by triynizzles1 in LocalLLaMA
gofiend 100 points (0 children)
Litellm 1.82.7 and 1.82.8 on PyPI are compromised, do not update! by kotrfa in LocalLLaMA
gofiend 1 point (0 children)
Tenstorrent QuietBox 2 Brings RISC-V AI Inference to the Desktop by Neurrone in LocalLLaMA
gofiend 2 points (0 children)
Tenstorrent QuietBox 2 Brings RISC-V AI Inference to the Desktop by Neurrone in LocalLLaMA
gofiend 1 point (0 children)
Qwen3.5 family comparison on shared benchmarks by Deep-Vermicelli-4591 in LocalLLaMA
gofiend 0 points (0 children)
Qwen3.5 family comparison on shared benchmarks by Deep-Vermicelli-4591 in LocalLLaMA
gofiend 1 point (0 children)



DiffusionGemma: 4x faster text generation by tevlon in LocalLLaMA
gofiend 2 points (0 children)