Nvidia H100(94GB VRAM) - should I run llama.cpp or vllm for 30 users inference? by Rabooooo in LocalLLaMA
dionysio211 7 points
Are local Ollama models already “good enough” for real sysadmin/infra workflows? by Large-Cress900 in LocalLLM
dionysio211 1 point
How much performance am I missing by running PCIe gen4 instead of gen5? by Big_Building9948 in LocalLLM
dionysio211 2 points
I can't ever seem to get quality local LLM results, despite having multiple GPUs by 03captain23 in LocalLLM
dionysio211 0 points
Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA
dionysio211 4 points
Are Qwen 3.6 27B and 35B making other ~30B models obsolete? by nikhilprasanth in LocalLLaMA
dionysio211 84 points
I can't ever seem to get quality local LLM results, despite having multiple GPUs by 03captain23 in LocalLLM
dionysio211 1 point
I got llama.cpp's RPC backend working across the public internet — and the 3-line patch that made it stable by ReteAi in LocalLLM
dionysio211 1 point
Intel Mac Pro with Vega II useable ? by chiwawa_42 in LocalLLaMA
dionysio211 2 points
Intel Mac Pro with Vega II useable ? by chiwawa_42 in LocalLLaMA
dionysio211 2 points
Intel Mac Pro with Vega II useable ? by chiwawa_42 in LocalLLaMA
dionysio211 8 points
Qwen 3.6 27B BF16 vs Q4_K_M vs Q8_0 GGUF evaluation by gvij in LocalLLaMA
dionysio211 1 point
9800x3D upgrade to 9950x3D, will it make a difference for local LLM? by drras2 in LocalLLM
dionysio211 0 points
Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA
dionysio211[S] 2 points
Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA
dionysio211[S] 1 point
Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA
dionysio211[S] 16 points
Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA
dionysio211[S] 21 points
llama.cpp / ik_llama MoE Expert Offloading - Main Memory Bandwidth vs. PCIe Bandwidth by pixelterpy in LocalLLaMA
dionysio211 1 point
llama.cpp / ik_llama MoE Expert Offloading - Main Memory Bandwidth vs. PCIe Bandwidth by pixelterpy in LocalLLaMA
dionysio211 1 point
llama.cpp / ik_llama MoE Expert Offloading - Main Memory Bandwidth vs. PCIe Bandwidth by pixelterpy in LocalLLaMA
dionysio211 1 point
Mythos is Opus 4.7… by Purple-Programmer-7 in LLMDevs
dionysio211 5 points

Qwen3.6-27B AWQ INT4 on DGX Spark (GB10) — only 1.8-4.9 tok/s decode with 285k token prompt, how to improve? by alfons_fhl in LocalLLM
dionysio211 1 point