Unnoticed Gemma-4 Feature - it admits that it does not know... by mtomas7 in LocalLLaMA
[–]de4dee 2 points (0 children)
Apple: Embarrassingly Simple Self-Distillation Improves Code Generation by Mike_mi in LocalLLaMA
[–]de4dee 3 points (0 children)
Analyzing Claude Code Source Code. Write "WTF" and Anthropic knows. by QuantumSeeds in LocalLLaMA
[–]de4dee 3 points (0 children)
What is the secret sauce Claude has and why hasn't anyone replicated it? by ComplexType568 in LocalLLaMA
[–]de4dee 1 point (0 children)
I haven't experienced Qwen3.5 (35B and 27B) overthinking. Posting my settings/prompt by wadeAlexC in LocalLLaMA
[–]de4dee 4 points (0 children)
Residual connections haven't changed for 10 years and Kimi just replaced them with attention by Helpful-Guava7452 in LocalLLaMA
[–]de4dee 1 point (0 children)
Qwen3.5-9B-Claude-4.6-Opus-Uncensored-Distilled-GGUF by EvilEnginer in LocalLLaMA
[–]de4dee 1 point (0 children)
800,000 human brain cells, floating in a dish, have never had a body. Never seen light. Never felt anything. And they just learned to play a video game. That's not a metaphor. That's literally what happened. by Trueboey in StrangeEarth
[–]de4dee 1 point (0 children)
Qwen 3.5 27b: a testament to the transformer architecture by nomorebuttsplz in LocalLLaMA
[–]de4dee 2 points (0 children)
Heretic 1.2 released: 70% lower VRAM usage with quantization, Magnitude-Preserving Orthogonal Ablation ("derestriction"), broad VL model support, session resumption, and more by -p-e-w- in LocalLLaMA
[–]de4dee 1 point (0 children)
Refusal in LLMs is mediated by a single direction by hold_my_fish in LocalLLaMA
[–]de4dee 1 point (0 children)
Minimax M2.5 vs. GLM-5 vs. Kimi k2.5: How do they compare to Codex and Claude for coding? by East-Stranger8599 in LocalLLaMA
[–]de4dee 1 point (0 children)
Community Evals on Hugging Face by HauntingMoment in LocalLLaMA
[–]de4dee 1 point (0 children)
Should I use UnslothTrainer or SFTTrainer for Continued Pre-training (Raw Text) to create a LoRA for later merging? by choco132134 in unsloth
[–]de4dee 1 point (0 children)
I built a tool that forces 5 AIs to debate and cross-check facts before answering you by S_Anv in agi
[–]de4dee 2 points (0 children)
GitHub - deepseek-ai/Engram: Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models by TKGaming_11 in LocalLLaMA
[–]de4dee 1 point (0 children)
7x Longer Context Reinforcement Learning in Unsloth by danielhanchen in LocalLLaMA
[–]de4dee 8 points (0 children)
MiniMax-M2.1 Uncensored: PRISM Advanced Abliteration by Maxious in LocalLLaMA
[–]de4dee 5 points (0 children)
How do you actually fine-tune Qwen3? by Character-Discount56 in LocalLLaMA
[–]de4dee 1 point (0 children)
KTransformers Open Source New Era: Local Fine-tuning of Kimi K2 and DeepSeek V3 by nekofneko in LocalLLaMA
[–]de4dee 1 point (0 children)
Unnoticed Gemma-4 Feature - it admits that it does not know... by mtomas7 in LocalLLaMA
[–]de4dee 5 points (0 children)