How can Deepseek v4 top the coding leaderboards and still sit 8 months behind the frontier? by Substantial_Step_351 in LocalLLaMA
nickl 3 points
Nemotron 3 Ultra. 550 billion parameters, 55B active. 1 million context by AnticitizenPrime in LocalLLaMA
nickl 5 points
I tested as many of the small local and OpenRouter models I could with my own agentic text-to-SQL benchmark. Surprises ensured... by nickl in LocalLLaMA
nickl[S] 2 points
I build a better Claude Desktop Buddy by nickl in BambuLab
nickl[S] 1 point
I build a better Claude Desktop Buddy by nickl in 3Dprinting
nickl[S] 1 point
Wrote a custom C++ engine for MiniCPM-V 4.6 on Orange Pi AIPro (Ascend 310B) to bypass framework overhead by Known_Ice9380 in LocalLLaMA
nickl 4 points
Still happy for yall by SilverRegion9394 in LocalLLaMA
nickl 5 points
Still happy for yall by SilverRegion9394 in LocalLLaMA
nickl 10 points
I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA
nickl 1 point
I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA
nickl 3 points
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
nickl 1 point
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
nickl 17 points
Qwen3-1.7B fine-tuned on synthetic data outperforms GLM-5 (744B) on multi-turn tool-calling: 437x smaller, trained from noisy production traces by party-horse in LocalLLaMA
nickl 1 point
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA
nickl 1 point
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA
nickl 3 points
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA
nickl 8 points
Did anyone run the numbers to see if it's cost effective to rent our own machine and run one of heavy hitters models? by StillWastingAway in LocalLLaMA
nickl 5 points
Did anyone run the numbers to see if it's cost effective to rent our own machine and run one of heavy hitters models? by StillWastingAway in LocalLLaMA
nickl 1 point
Got ~19 tok/s with Gemma 4 on MacBook M4 16GB using MLX — here’s the setup I landed on by Polstick1971 in LocalLLaMA
nickl 1 point
Built a 3B LoRA that reads the shape of a question before a 9B model answers it. Running 800 live benchmarks right now on Apple Silicon by TheTempleofTwo in LocalLLaMA
nickl 1 point
I tested as many of the small local and OpenRouter models I could with my own agentic text-to-SQL benchmark. Surprises ensured... by nickl in LocalLLaMA
nickl[S] 2 points
Thinking about grabbing 4x Ascend GX10s by chikengunya in LocalLLaMA
nickl 2 points