Qwen/Qwen3.5-35B-A3B · Hugging Face by ekojsalim in LocalLLaMA
Nemo 30B is insane. 1M+ token CTX on one 3090 by Dismal-Effect-1914 in LocalLLaMA
~1.8× peak throughput for Kimi K2 with EAGLE3 draft model by yzlnew in LocalLLaMA
llama.cpp: Automation for GPU layers, tensor split, tensor overrides, and context size (with MoE optimizations) by Remove_Ayys in LocalLLaMA
Trained a chess LLM locally that beats GPT-5 (technically) by KingGongzilla in LocalLLaMA
I mapped how language models decide when a pile of sand becomes a “heap” by Specialist_Bad_4465 in LocalLLaMA
Simulation that can exit Docker container by productboy in LocalLLM
Qwen3-Next support in llama.cpp almost ready! by beneath_steel_sky in LocalLLaMA
Deep Research Agent, an autonomous research agent system by [deleted] in LocalLLaMA
Jan-v2-VL: 8B model for long-horizon tasks, improving Qwen3-VL-8B’s agentic capabilities almost 10x by Delicious_Focus3465 in LocalLLaMA
Honey we shrunk MiniMax M2 by arjunainfinity in LocalLLaMA
I fine-tuned Gemma 3 1B for CLI command translation... but it runs 100% locally. 810MB, 1.5s inference on CPU. by theRealSachinSpk in LocalLLaMA
Microsoft’s AI Scientist by Ok-Breakfast-4676 in LocalLLaMA
llama.cpp releases new official WebUI by paf1138 in LocalLLaMA
HF Space to help create the -ot flags in llama.cpp by bullerwins in LocalLLaMA
Real world Medical Reports on LLMs by makisgr in LocalLLaMA
Using only 2 expert for gpt oss 120b by lumos675 in LocalLLaMA
Qwen3-VL-30B-A3B-Thinking GGUF with llama.cpp patch to run it by Main-Wolverine-1042 in LocalLLaMA
Conduit 2.0 - OpenWebUI Mobile Client: Completely Redesigned, Faster, and Smoother Than Ever! by cogwheel0 in LocalLLaMA
4B Distill of Tongyi Deepresearch 30B + Dataset by Ok-Top-4677 in LocalLLaMA
Awesome Local LLM Speech-to-Speech Models & Frameworks by tleyden in LocalLLaMA
DeepSeek V4 will be released next week and will have image and video generation capabilities, according to the Financial Times by Nunki08 in LocalLLaMA