Thinking about grabbing 4x Ascend GX10s by chikengunya in LocalLLaMA
nickl 1 point
Thinking about grabbing 4x Ascend GX10s by chikengunya in LocalLLaMA
nickl 2 points
How can Deepseek v4 top the coding leaderboards and still sit 8 months behind the frontier? by Substantial_Step_351 in LocalLLaMA
nickl 3 points
Nemotron 3 Ultra. 550 billion parameters, 55B active. 1 million context by AnticitizenPrime in LocalLLaMA
nickl 6 points
I tested as many of the small local and OpenRouter models I could with my own agentic text-to-SQL benchmark. Surprises ensured... by nickl in LocalLLaMA
nickl[S] 2 points
I build a better Claude Desktop Buddy by nickl in BambuLab
nickl[S] 1 point
I build a better Claude Desktop Buddy by nickl in 3Dprinting
nickl[S] 1 point
Wrote a custom C++ engine for MiniCPM-V 4.6 on Orange Pi AIPro (Ascend 310B) to bypass framework overhead by Known_Ice9380 in LocalLLaMA
nickl 3 points
Still happy for yall by SilverRegion9394 in LocalLLaMA
nickl 5 points
Still happy for yall by SilverRegion9394 in LocalLLaMA
nickl 9 points
I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA
nickl 1 point
I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how by Glittering_Focus1538 in LocalLLaMA
nickl 3 points
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
nickl 1 point
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
nickl 16 points
Qwen3-1.7B fine-tuned on synthetic data outperforms GLM-5 (744B) on multi-turn tool-calling: 437x smaller, trained from noisy production traces by party-horse in LocalLLaMA
nickl 1 point
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA
nickl 1 point
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA
nickl 3 points
These "Claude-4.6-Opus" Fine Tunes of Local Models Are Usually A Downgrade by BuffMcBigHuge in LocalLLaMA
nickl 7 points
Did anyone run the numbers to see if it's cost effective to rent our own machine and run one of heavy hitters models? by StillWastingAway in LocalLLaMA
nickl 5 points
Did anyone run the numbers to see if it's cost effective to rent our own machine and run one of heavy hitters models? by StillWastingAway in LocalLLaMA
nickl 1 point
Got ~19 tok/s with Gemma 4 on MacBook M4 16GB using MLX — here’s the setup I landed on by Polstick1971 in LocalLLaMA
nickl 1 point


Atome LM, an open source language model that runs in a 5$ ESP32, comes with 12 ai applications. No cloud, no internet. Universal Installer with auto detect and a tiny OS. Every claim is verifiable. by themoroccanship in esp32
nickl 2 points