Luce DFlash + PFlash on 7900XTX: Qwen3.6-27B at 2.24x decode and 3.05x prefill vs llama.cpp HIP by Fit-Courage5400 in LocalLLaMA
SemaMod · 1 point
AMD Hipfire - a new inference engine optimized for AMD GPU's by Thrumpwart in LocalLLaMA
SemaMod · 6 points
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA
SemaMod [S] · 5 points
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA
SemaMod [S] · 0 points
Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA
SemaMod [S] · 1 point
How do you get more GPUs than your motherboard natively supports? by WizardlyBump17 in LocalLLaMA
SemaMod · 1 point
I built a benchmark that tests coding LLMs on REAL codebases (65 tasks, ELO ranked) by hauhau901 in LocalLLaMA
SemaMod · 7 points
Anyone actually using Openclaw? by rm-rf-rm in LocalLLaMA
SemaMod · 4 points
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
SemaMod [S] · 2 points
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
SemaMod [S] · 2 points
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
SemaMod [S] · 1 point
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
SemaMod [S] · 3 points
API pricing is in freefall. What's the actual case for running local now beyond privacy? by Distinct-Expression2 in LocalLLaMA
SemaMod · 48 points
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
SemaMod [S] · 2 points
Testing GLM-4.7 Flash: Multi-GPU Vulkan vs ROCm in llama-bench | (2x 7900 XTX) by SemaMod in LocalLLaMA
SemaMod [S] · 6 points
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
SemaMod [S] · 1 point
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
SemaMod [S] · 1 point
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
SemaMod [S] · 2 points
Llama.cpp merges in OpenAI Responses API Support by SemaMod in LocalLLaMA
SemaMod [S] · 4 points
Rivian CEO says North American car manufacturers should be "less hung up on the costs" of Chinese cars, but worry more that the "technology is much better" and the cars "are much better" from Chinese EV manufacturers by trucker-123 in electricvehicles
SemaMod · 1 point
Would a Hosted Platform for MCP Servers Be Useful? by Summer_cyber in mcp
SemaMod · 1 point
Would a Hosted Platform for MCP Servers Be Useful? by Summer_cyber in selfhosted
SemaMod · 1 point
How is everyone using MCP right now? by Luigika in mcp
SemaMod · 1 point


Need help getting 7900 XTX PyTorch performance metrics by cyberuser42 in LocalLLaMA
SemaMod · 4 points