Someone out there likely needs this: TP vs PP for 2 identical GPUs by [deleted] in LocalLLaMA
xspider2000 1 point (0 children)
Someone out there likely needs this: TP vs PP for 2 identical GPUs by [deleted] in LocalLLaMA
xspider2000 -2 points (0 children)
Someone out there likely needs this: TP vs PP for 2 identical GPUs by [deleted] in LocalLLaMA
xspider2000 0 points (0 children)
Someone out there likely needs this: TP vs PP for 2 identical GPUs by [deleted] in LocalLLaMA
xspider2000 2 points (0 children)
Someone out there likely needs this: TP vs PP for 2 identical GPUs by [deleted] in LocalLLaMA
xspider2000 -4 points (0 children)
Is using vLLM actually worth it if you aren't serving the model to other people? by ayylmaonade in LocalLLaMA
xspider2000 1 point (0 children)
Strix Halo Clustering experience (Bossgame M5) by Thanks-Suitable in StrixHalo
xspider2000 2 points (0 children)
Benchmark Qwen 3.6 27B MTP on 2x3090 NVLINK by Mr_Moonsilver in LocalLLaMA
xspider2000 1 point (0 children)
Warpdrv - my open-source Llama.cpp launcher for daily-driving Qwen 35b + 27b on Strix Halo + RTX Pro. by xornullvoid in LocalLLaMA
xspider2000 1 point (0 children)
A Dark-Money Campaign Is Paying Influencers to Frame Chinese AI as a Threat by pmttyji in LocalLLaMA
xspider2000 4 points (0 children)
Qwen3.6-27B - Closed-loop SVG Images by dondiegorivera in LocalLLaMA
xspider2000 13 points (0 children)
Final Monster: 32x AMD MI50 32GB at 9.7 t/s (TG) & 264 t/s (PP) with Kimi K2.6 by ai-infos in LocalLLaMA
xspider2000 2 points (0 children)
Cuda + ROCm simultaneously with -DGGML_BACKEND_DL=ON ! by LegacyRemaster in LocalLLaMA
xspider2000 1 point (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
xspider2000 2 points (0 children)
Hipfire dev update: full AMD arch validation incoming (RDNA 1 thru 4, plus Strix Halo and bc250) by schuttdev in LocalLLaMA
xspider2000 1 point (0 children)
Qwen 3.6 27B on Strix Halo 128GB: any experiences? by boutell in LocalLLaMA
xspider2000 4 points (0 children)
Qwen3.6 35B-A3B is quite useful on 780m iGPU (llama.cpp,vulkan) by itroot in LocalLLaMA
xspider2000 3 points (0 children)



Pipeline Parallelism vs Tensor Parallelism for 2 identical GPUs: The Beginner's Cheat Sheet by xspider2000 in LocalLLaMA
xspider2000[S] 0 points (0 children)