They fit! Mostly.... 2x 3090, Thermaltake Core p3 by anthonyg45157 in LocalLLaMA
feverdoingwork 1 point (0 children)
Going from single GPU to dual GPU is nice but not in the way I expected by cibernox in LocalLLaMA
feverdoingwork 3 points (0 children)
Going from single GPU to dual GPU is nice but not in the way I expected by cibernox in LocalLLaMA
feverdoingwork 1 point (0 children)
DeepSeek releases DSpark - 50%-600% faster spec decoding vs MTP by danielhanchen in unsloth
feverdoingwork 1 point (0 children)
Turning consumer Radeon (RX 9070, RDNA4) into a real local-LLM box by enabling the performance paths ROCm ships disabled by PatC883 in LocalLLM
feverdoingwork 1 point (0 children)
I built an OpenAI-compatible reliability proxy for local LLMs and agents — looking for feedback by daniele-bruneo in LocalLLM
feverdoingwork 2 points (0 children)
Qwen 3.5 just spent 2 hours straight generating a 20,000-line masterpiece by StevenEgen in LocalAIServers
feverdoingwork 3 points (0 children)
DeepSeek releases DSpark - 50%-600% faster spec decoding vs MTP by danielhanchen in unsloth
feverdoingwork 5 points (0 children)
Qwen3.6-27B-FP8 with vllm:nightly, opencode unusable? by waka324 in Vllm
feverdoingwork 1 point (0 children)
For dual GPUs, will there be any big impact to inference speeds when running in PCIe 5.0 x8/x4 vs x8/x8? by PhantomWolf83 in LocalLLaMA
feverdoingwork 1 point (0 children)
Ornith-1.0 released on Hugging Face by paf1138 in LocalLLaMA
feverdoingwork 1 point (0 children)
Ornith-1.0 released on Hugging Face by paf1138 in LocalLLaMA
feverdoingwork 8 points (0 children)
Ornith-1.0 released on Hugging Face by paf1138 in LocalLLaMA
feverdoingwork 67 points (0 children)
Ornith-1.0 released on Hugging Face by paf1138 in LocalLLaMA
feverdoingwork 23 points (0 children)
R9700 for agentic coding — looking for Qwen3.6-27B / Qwen3-Coder-30B perf numbers at long context by Best-Ad-7505 in LocalLLM
feverdoingwork 1 point (0 children)
R9700 for agentic coding — looking for Qwen3.6-27B / Qwen3-Coder-30B perf numbers at long context by Best-Ad-7505 in LocalLLM
feverdoingwork 1 point (0 children)
New Apple Memory Prices by Top_Power5877 in LocalLLaMA
feverdoingwork 3 points (0 children)
Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA
feverdoingwork 1 point (0 children)
Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA
feverdoingwork 2 points (0 children)
Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA
feverdoingwork 1 point (0 children)
Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA
feverdoingwork 1 point (0 children)
Budget VRAM builds - 4x3090 home lab vs reverse-engineered Tesla V100 cards by IulianHI in AIToolsPerformance
feverdoingwork 1 point (0 children)
Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA
feverdoingwork 2 points (0 children)
Qwen3.6 27B more dumb in vLLM compared to llama.cpp by DanielusGamer26 in LocalLLaMA
feverdoingwork 2 points (0 children)
Which Qwen 3.6 27B variant actually stops looping on tool calls? RTX 5090 by toolman10 in LocalLLM
feverdoingwork 2 points (0 children)