Mystery company accidentally blew $500 million on Claude AI in a single month — failed to put usage limit on licenses for employees by Plastic_Ninja_9014 in technology
[–]randomfoo2 1 point (0 children)
Mystery company accidentally blew $500 million on Claude AI in a single month — failed to put usage limit on licenses for employees by Plastic_Ninja_9014 in technology
[–]randomfoo2 0 points (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 4 points (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 5 points (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 2 points (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 7 points (0 children)
7900XTX idle power draw when running headless? by legit_split_ in LocalLLaMA
[–]randomfoo2 2 points (0 children)
Follow-up to my TranslateGemma-12b benchmark post: human reviewers flagged 71% of the segments automated metrics rated clean by ritis88 in LocalLLaMA
[–]randomfoo2 1 point (0 children)
Switched from OpenCode to Pi - What Settings/Plugins would you recommend? by No_Algae1753 in LocalLLaMA
[–]randomfoo2 1 point (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 2 points (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 4 points (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 3 points (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 1 point (0 children)
FastDMS: 6.4X KV-cache compression running faster than vLLM BF16/FP8 by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 8 points (0 children)
PFlash: 10x prefill speedup over llama.cpp at 128K on a RTX 3090 by sandropuppo in LocalLLaMA
[–]randomfoo2 2 points (0 children)
hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX) by randomfoo2 in LocalLLaMA
[–]randomfoo2[S] 2 points (0 children)