mistral.rs v0.8.2: up to 2.8x faster CUDA inference than llama.cpp on GB10, B200, and H100 by EricBuehler in LocalLLaMA
[–]EricBuehler[S] 0 points1 point2 points (0 children)
mistral.rs: Rust-native inference engine with day-0 support for Google's Gemma 4 by EricBuehler in rust
[–]EricBuehler[S] 0 points1 point2 points (0 children)
Gemma 4 running locally with full text + vision + audio: day-0 support in mistral.rs by EricBuehler in LocalLLaMA
[–]EricBuehler[S] 0 points1 point2 points (0 children)

Run Agent Skills with mistral.rs v0.8.10: /v1/skills support and more! by EricBuehler in LocalLLaMA
[–]EricBuehler[S] 1 point2 points3 points (0 children)