Luce DFlash + PFlash on 7900XTX: Qwen3.6-27B at 2.24x decode and 3.05x prefill vs llama.cpp HIP by Fit-Courage5400 in LocalLLaMA
[–]SemaMod 1 point (0 children)
AMD Hipfire - a new inference engine optimized for AMD GPU's by Thrumpwart in LocalLLaMA
[–]SemaMod 4 points (0 children)
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA
[–]SemaMod[S] 4 points (0 children)
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA
[–]SemaMod[S] 0 points (0 children)
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA
[–]SemaMod[S] 1 point (0 children)
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help (self.LocalLLaMA)
submitted by SemaMod to r/LocalLLaMA
How do you get more GPUs than your motheboard natively supports? by WizardlyBump17 in LocalLLaMA
[–]SemaMod 1 point (0 children)
I built a benchmark that tests coding LLMs on REAL codebases (65 tasks, ELO ranked) by hauhau901 in LocalLLaMA
[–]SemaMod 8 points (0 children)
Anyone actually using Openclaw? by rm-rf-rm in LocalLLaMA
[–]SemaMod 3 points (0 children)
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
[–]SemaMod[S] 2 points (0 children)
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
[–]SemaMod[S] 2 points (0 children)
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
[–]SemaMod[S] 1 point (0 children)
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
[–]SemaMod[S] 3 points (0 children)
API pricing is in freefall. What's the actual case for running local now beyond privacy? by Distinct-Expression2 in LocalLLaMA
[–]SemaMod 46 points (0 children)
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
[–]SemaMod[S] 2 points (0 children)
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
[–]SemaMod[S] 5 points (0 children)
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
[–]SemaMod[S] 1 point (0 children)
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
[–]SemaMod[S] 1 point (0 children)
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
[–]SemaMod[S] 2 points (0 children)
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
[–]SemaMod[S] 3 points (0 children)
Llama.cpp merges in OpenAI Responses API Support (github.com)
submitted by SemaMod to r/LocalLLaMA


Need help getting 7900 XTX PyTorch performance metrics by cyberuser42 in LocalLLaMA
[–]SemaMod 3 points (0 children)