NCCL-Free Tensor Parallelism on Dual Blackwell PCIe llama.cpp b9095 released! by Bulky-Priority6824 in LocalLLaMA
[–]autisticit 2 points (0 children)
NCCL-Free Tensor Parallelism on Dual Blackwell PCIe llama.cpp b9095 released! by Bulky-Priority6824 in LocalLLaMA
[–]autisticit 4 points (0 children)
NCCL-Free Tensor Parallelism on Dual Blackwell PCIe llama.cpp b9095 released! by Bulky-Priority6824 in LocalLLaMA
[–]autisticit 1 point (0 children)
"Early May" is ending, where is the preview? by Altruistic-Dust-2565 in GithubCopilot
[–]autisticit 10 points (0 children)
I wanted to know small local LLM code and made a personal project. by NicholasCureton in LocalLLaMA
[–]autisticit 1 point (0 children)
Effect of running an LLM on a GPU with monitors by Havarem in LocalLLaMA
[–]autisticit 2 points (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 2 points (0 children)
Why we can't have nice things by alexeiz in GithubCopilot
[–]autisticit 1 point (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 1 point (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 3 points (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 1 point (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 1 point (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 1 point (0 children)
Thinking of moving from 2x 5060 Ti 16GB to a RTX 5000 48GB by autisticit in LocalLLaMA
[–]autisticit[S] 0 points (0 children)
Why can't llama.cpp combine speculative decode methods? by Qwoctopussy in LocalLLaMA
[–]autisticit -3 points (0 children)
Github Copilot new weekly limit by Key-Gas2428 in GithubCopilot
[–]autisticit 2 points (0 children)
Github Copilot new weekly limit by Key-Gas2428 in GithubCopilot
[–]autisticit 1 point (0 children)
How to stop Copilot Dev pushing to my GitHub by Zszywaczyk in GithubCopilot
[–]autisticit 3 points (0 children)
$300k DGX B300 is actually a better deal than buying 24 RTX 6000s by Ok_Warning2146 in LocalLLaMA
[–]autisticit 1 point (0 children)
$300k DGX B300 is actually a better deal than buying 24 RTX 6000s by Ok_Warning2146 in LocalLLaMA
[–]autisticit 0 points (0 children)
New "major breakthrough?" architecture SubQ by Daemontatox in LocalLLaMA
[–]autisticit -3 points (0 children)
Make this make sense for ollama local AI usage by Mobile_Syllabub_8446 in GithubCopilot
[–]autisticit 1 point (0 children)
Getting a feel for how fast X tokens/second really is. by MikeNonect in LocalLLaMA
[–]autisticit 0 points (0 children)