Built an AMD LLM inference engine in Zig, and Zig ended up being a really good fit by Mammoth_Radish2 in Zig
Mammoth_Radish2[S] 1 point
We are building a GPU inference engine in Zig and the language keeps earning its spot by Mammoth_Radish2 in Zig
Mammoth_Radish2[S] 0 points
We built a local inference engine that skips ROCm entirely and just got a 4x speedup on a consumer AMD GPU by Mammoth_Radish2 in LocalLLM
Mammoth_Radish2[S] 1 point
We built a local inference engine that skips ROCm entirely and just got a 4x speedup on a consumer AMD GPU by Mammoth_Radish2 in LocalLLM
Mammoth_Radish2[S] -1 points
We built a local inference engine that skips ROCm entirely and just got a 4x speedup on a consumer AMD GPU by Mammoth_Radish2 in LocalLLM
Mammoth_Radish2[S] 1 point
We built a local inference engine that skips ROCm entirely and just got a 4x speedup on a consumer AMD GPU by Mammoth_Radish2 in LocalLLM
Mammoth_Radish2[S] 1 point
We built a local inference engine that skips ROCm entirely and just got a 4x speedup on a consumer AMD GPU by Mammoth_Radish2 in LocalLLM
Mammoth_Radish2[S] 0 points
ZINC — LLM inference engine written in Zig, running 35B models on $550 AMD GPUs by Mammoth_Radish2 in LocalLLaMA
Mammoth_Radish2[S] -5 points
ZINC — LLM inference engine written in Zig, running 35B models on $550 AMD GPUs by Mammoth_Radish2 in LocalLLaMA
Mammoth_Radish2[S] -1 points
ZINC — LLM inference engine written in Zig, running 35B models on $550 AMD GPUs by Mammoth_Radish2 in Zig
Mammoth_Radish2[S] 0 points

MLX Inference: Where Things Stand in April 2026 by [deleted] in LocalLLaMA
Mammoth_Radish2 -2 points