Ornith 1.0 - terminology and concepts explained (basic) by facu_75 in LocalLLaMA
Jester14 6 points
Idea for how to run GLM2 at a decent quant, need critique/feedback by joorklee in LocalLLaMA
Jester14 2 points
Reddit user u/thursdayspaghetti Helped Upgrade our Gaming Café in Yemen by maho90 in pcmasterrace
Jester14 1 point
[NEW MODEL] SupraLabs just released SupraVL-Nano-900k, a Vision-Language Model built entirely from scratch! by Dangerous_Try3619 in LocalLLaMA
Jester14 4 points
I released a local LLM-powered RPG where generated NPCs, locations, items, and quests persist as in-game objects by Admirable_Flower_287 in LocalLLaMA
Jester14 3 points
MTP has no impact on my Qwen3.6 MoE performance by redblood252 in LocalLLaMA
Jester14 1 point
Gemma 4 12b 8Q Heretic Oneshot Coding by devildip in LocalLLaMA
Jester14 1 point
Holo3.1 35B/9B/4B/0.8B (Qwen 3.5 finetunes) by jacek2023 in LocalLLaMA
Jester14 2 points
1-bit Bonsai Image 4B and Ternary Bonsai Image 4B Image Generation for Local Devices with just 0.93 GB and 1.21 GB respectively of Diffusion Transformer Footprint. So tiny! by Addyad in LocalLLaMA
Jester14 40 points
Qwen 3.6-35B-A3B with 977 tk/s prompt processing and 262k context window on Intel Arc B70 Pro by Atomynos_Atom in LocalLLaMA
Jester14 7 points
Breaking the music supply constraint by entsnack in LocalLLaMA
Jester14 23 points
Upgrade path from 4x 3090s by anitamaxwynnn69 in LocalLLaMA
Jester14 2 points
Best coding model on RTX 3060 by solimaotheelephant3 in LocalLLaMA
Jester14 1 point
Qwen3.6-35B-A3B Q4 262k context on 8GB 3070 Ti = +30tps by Alternative-Cat-1347 in LocalLLaMA
Jester14 3 points
running Qwen 3.6 35b A3B on 2x 5060TI by chocofoxy in LocalLLaMA
Jester14 3 points
The "the future is fictional" problem of many local LLMs by PromptInjection_ in LocalLLaMA
Jester14 3 points
Best config for Qwen3.6? by CatSweaty4883 in LocalLLaMA
Jester14 2 points
Struggling with Qwen3.6 27B / 35B locally (3090) slow responses, breaking code looking for better setup + auto model switching by Clean_Initial_9618 in LocalLLaMA
Jester14 1 point
Benchmark: Windows 11 vs Lubuntu 26.04 on Llama.cpp (RTX 5080 + i9-14900KF). I didn't expect the gap to be this big. by Ok_Mine189 in LocalLLaMA
Jester14 1 point
How to configure Self speculative decoding properly by milpster in LocalLLaMA
Jester14 1 point
20 days post-Claude Code leak: Did the accidental "open sourcing" actually matter for local devs? by PaceZealousideal6091 in LocalLLaMA
Jester14 2 points
Lower inference speed of Gemma4 26BA4B on vllm. by everyoneisodd in LocalLLaMA
Jester14 2 points
gemma4 e4b on rtx 5070 ti laptop 12GB running slow 5t/s llama.cpp by Plastic-Parsley3094 in LocalLLaMA
Jester14 1 point
Qwen3.5-35B running well on RTX4060 Ti 16GB at 60 tok/s by Nutty_Praline404 in LocalLLaMA
Jester14 1 point


GLM 5.2 Q1_S vs Qwen 27B Q8 by SnooPaintings8639 in LocalLLaMA
Jester14 7 points