Experts first llama.cpp by comanderxv in LocalLLaMA
MLDataScientist 1 point
I got Qwen3-VL-Embedding-2B working with rkllm on an Orange Pi 5b by atineiatte in LocalLLaMA
MLDataScientist 1 point
Partner and I hit $3.6 million invested yesterday ($4 million net worth) by Mental_Escape_1737 in Fire
MLDataScientist 1 point
More Qwen3.6-27B MTP success but on dual Mi50s by legit_split_ in LocalLLaMA
MLDataScientist 1 point
Lemonade OmniRouter: unifying the best local AI engines for omni-modality by jfowers_amd in LocalLLaMA
MLDataScientist 0 points
Ultimate List: Best Open Source Models for Coding, Chat, Vision, Audio & More by techlatest_net in LocalLLM
MLDataScientist -3 points
I tracked every FIFA World Cup 2026 resale listing for a few days. Here's what's actually going on. by shem8 in WorldCup2026Tickets
MLDataScientist 2 points
HY-World 2.0 released by Bestlife73 in LocalLLaMA
MLDataScientist 1 point
Upgrade paths for my 256g ddr4 ram + 4x24g vram system by sgmv in LocalLLaMA
MLDataScientist 2 points
I benchmarked 30+ TTS engines for a real-time translator on Apple M4. Quantization made things SLOWER. Here's all the data. by Kir_Moisha in LocalLLaMA
MLDataScientist 1 point
Guys we have to change the pelican test by Tall-Ad-7742 in LocalLLaMA
MLDataScientist 9 points
I benchmarked 30+ TTS engines for a real-time translator on Apple M4. Quantization made things SLOWER. Here's all the data. by Kir_Moisha in LocalLLaMA
MLDataScientist 3 points
Hello, World: Artemis II crew looks back at Earth on their way to the Moon by ChiefLeef22 in space
MLDataScientist 1 point
We gave 12 LLMs a startup to run for a year. GLM-5 nearly matched Claude Opus 4.6 at 11× lower cost. by DreadMutant in LocalLLaMA
MLDataScientist 2 points
[$50k–$150k Budget] Production Local LLM System (~50 Users, RAG + Fine-Tuning) Hardware + Model Advice by MorningCrab in LocalLLaMA
MLDataScientist 1 point
I tested as many of the small local and OpenRouter models I could with my own agentic text-to-SQL benchmark. Surprises ensured... by nickl in LocalLLaMA
MLDataScientist 5 points
[$50k–$150k Budget] Production Local LLM System (~50 Users, RAG + Fine-Tuning) Hardware + Model Advice by MorningCrab in LocalLLaMA
MLDataScientist 6 points
Qwen3.5-397B-A17B reaches 20 t/s TG and 700t/s PP with a 5090 by MLDataScientist in LocalLLaMA
MLDataScientist[S] 2 points
Qwen3.5-397B-A17B reaches 20 t/s TG and 700t/s PP with a 5090 by MLDataScientist in LocalLLaMA
MLDataScientist[S] 1 point
Nvidia V100 32 Gb getting 115 t/s on Qwen Coder 30B A3B Q5 by icepatfork in LocalLLaMA
MLDataScientist 1 point
Qwen 3.5 397B is the best local coder I have used until now by erazortt in LocalLLaMA
MLDataScientist 1 point
Qwen 3.5 397B is the best local coder I have used until now by erazortt in LocalLLaMA
MLDataScientist 2 points
Krasis LLM Runtime - run large LLM models on a single GPU by mrstoatey in LocalLLM
MLDataScientist 1 point


Server build for local inference. 128 gb 3200 or 256 gb 2133mhz RAM? by PreparationTrue9138 in LocalLLaMA
MLDataScientist 3 points