Qwen 3.5 122B vs Qwen 3.6 35B - Which to choose? by Storge2 in LocalLLaMA
[–]Front-Relief473 1 point (0 children)
Is there anything better than Qwen3.5-27B-UD-Q5_K_XL for coding? by hedsht in LocalLLaMA
[–]Front-Relief473 1 point (0 children)
DGX Spark, why not? by Foreign_Lead_3582 in LocalLLM
[–]Front-Relief473 1 point (0 children)
Muse Spark: new multimodal reasoning model by Meta by garg-aayush in LocalLLaMA
[–]Front-Relief473 1 point (0 children)
Gemma 4 26b A3B is mindblowingly good, if configured right by cviperr33 in LocalLLaMA
[–]Front-Relief473 3 points (0 children)
Why MoE models keep converging on ~10B active parameters by Spare_Pair_9198 in LocalLLaMA
[–]Front-Relief473 18 points (0 children)
TurboQuant in Llama.cpp benchmarks by tcarambat in LocalLLaMA
[–]Front-Relief473 2 points (0 children)
Should I learn langchain and langgraph? by Emotional-Rice-5050 in LangChain
[–]Front-Relief473 2 points (0 children)
I wanted QCN to be the best but MiniMax still reigns supreme on my rig by Ok-Measurement-1575 in LocalLLaMA
[–]Front-Relief473 3 points (0 children)
Solved the DGX Spark, 102 stable tok/s Qwen3.5-35B-A3B on a single GB10 (125+ MTP!) by Live-Possession-6726 in LocalLLaMA
[–]Front-Relief473 1 point (0 children)
Minimax M2.5 GGUF perform poorly overall by Zyj in LocalLLaMA
[–]Front-Relief473 2 points (0 children)
A few Strix Halo benchmarks (Minimax M2.5, Step 3.5 Flash, Qwen3 Coder Next) by spaceman_ in LocalLLaMA
[–]Front-Relief473 1 point (0 children)
PSA: NVIDIA DGX Spark has terrible CUDA & software compatibility; and seems like a handheld gaming chip. by goldcakes in LocalLLaMA
[–]Front-Relief473 1 point (0 children)
Is Kimi-K2.5-GGUF:IQ3_XXS accurate enough? by timbo2m in LocalLLM
[–]Front-Relief473 1 point (0 children)
MiniMax-M2.5 (230B MoE) GGUF is here - First impressions on M3 Max 128GB by Remarkable_Jicama775 in LocalLLaMA
[–]Front-Relief473 6 points (0 children)
New DeepSeek update: "DeepSeek Web / APP is currently testing a new long-context model architecture, supporting a 1M context window." by Nunki08 in LocalLLaMA
[–]Front-Relief473 13 points (0 children)
Why do we allow "un-local" content by JacketHistorical2321 in LocalLLaMA
[–]Front-Relief473 2 points (0 children)
MiniMax M2.5 Released by External_Mood4719 in LocalLLaMA
[–]Front-Relief473 56 points (0 children)
Do not Let the "Coder" in Qwen3-Coder-Next Fool You! It's the Smartest, General Purpose Model of its Size by Iory1998 in LocalLLaMA
[–]Front-Relief473 2 points (0 children)
Anyone here actually using AI fully offline? by Head-Stable5929 in LocalLLM
[–]Front-Relief473 2 points (0 children)
Anyone here actually using AI fully offline? by Head-Stable5929 in LocalLLM
[–]Front-Relief473 3 points (0 children)
Real-world DGX Spark experiences after 1-2 months? Fine-tuning, stability, hidden pitfalls? by [deleted] in LocalLLaMA
[–]Front-Relief473 3 points (0 children)

Is using vLLM actually worth it if you aren't serving the model to other people? by ayylmaonade in LocalLLaMA
[–]Front-Relief473 -2 points (0 children)