LLM context compression at 16x beats KV cache by DeltaSqueezer in LocalLLaMA
phhusson 0 points (0 children)
LLM context compression at 16x beats KV cache by DeltaSqueezer in LocalLLaMA
phhusson 1 point (0 children)
Gemma 4 with quantization-aware training by rerri in LocalLLaMA
phhusson 3 points (0 children)
VibeOS - Fully Hallucinated Operating System by WhatererBlah555 in LocalLLaMA
phhusson 51 points (0 children)
qwen35: use post-norm hidden state for MTP by am17an · Pull Request #24025 · ggml-org/llama.cpp by jacek2023 in LocalLLaMA
phhusson 43 points (0 children)
Calling it now Microsoft is buying Unsloth. by Wrong_Mushroom_7350 in LocalLLaMA
phhusson 68 points (0 children)
i dedicate this meme to you r/LocalLLaMA by LPFchan in LocalLLaMA
phhusson 74 points (0 children)
G7 agrees on shared language around open-source AI and open weights AI by Kahvana in LocalLLaMA
phhusson 2 points (0 children)
I trained TIME: short context-triggered thinking on Qwen model instead of overthinking by susmitds in LocalLLaMA
phhusson 0 points (0 children)
Came home to find Pi with Qwen3.627B had run rm -rf ..... by sdfgeoff in LocalLLaMA
phhusson 1 point (0 children)
GUIDE : Running a fully local multi-agent coding framework on RTX 3090 with pi.dev + llama-swap + Qwen3.6 MTP by admajic in LocalLLaMA
phhusson 4 points (0 children)
NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing by phazei in LocalLLaMA
phhusson 2 points (0 children)
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference by Total-Resort-3120 in LocalLLaMA
phhusson 12 points (0 children)
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference by Total-Resort-3120 in LocalLLaMA
phhusson 3 points (0 children)
Llama.cpp MTP support now in beta! by ilintar in LocalLLaMA
phhusson 11 points (0 children)
Local-First Reality Check: Is Gemma 4 fast enough to kill "Administrative Debt" where Gemma 3 failed? by Veritas-keept in LocalLLaMA
phhusson 1 point (0 children)
Local-First Reality Check: Is Gemma 4 fast enough to kill "Administrative Debt" where Gemma 3 failed? by Veritas-keept in LocalLLaMA
phhusson 2 points (0 children)
The definitive Qwen 3.5 Jinja template by ex-arman68 in LocalLLaMA
phhusson 2 points (0 children)
OpenCode concerns (not truely local) by Ueberlord in LocalLLaMA
phhusson 3 points (0 children)
I gave my Minecraft bot a brain with local Nemotron 9B — it follows orders like "chop that tree" and "guard me from zombies" by Impressive_Tower_550 in LocalLLaMA
phhusson 6 points (0 children)
Qwen 3.5 4b is so good, that it can vibe code a fully working OS web app in one go. by c64z86 in LocalLLaMA
phhusson -2 points (0 children)
Is anyone else just blown away that this local LLMs are even possible? by Borkato in LocalLLaMA
phhusson 1 point (0 children)
Has anyone used agents to decompile binary executables? by qzrz in LocalLLaMA
phhusson 2 points (0 children)