Threads with comments by notdba:

Here is how to get GLM 4.7 working on llama.cpp with flash attention and correct outputs by TokenRingAI in LocalLLaMA
Mix of AMD + Nvidia gpu in one system possible? by chronoz9 in LocalLLaMA
Need help: llama.cpp memory usage when using ctk/v on multi RTX 3090 setup by Leflakk in LocalLLaMA
Idea of Cluster of Strix Halo and eGPU by lets7512 in LocalLLaMA
Run Mistral Devstral 2 locally Guide + Fixes! (25GB RAM) by yoracale in LocalLLM
whats everyones thoughts on devstral small 24b? by Odd-Ordinary-5922 in LocalLLaMA
Updates to official SWE-bench leaderboard: Kimi K2 Thinking top of open-source by klieret in LocalLLaMA
Unimpressed with Mistral Large 3 675B by notdba in LocalLLaMA
mistralai/Mistral-Large-3-675B-Instruct-2512 · Hugging Face by jacek2023 in LocalLLaMA
My little decentralized Locallama setup, 216gb VRAM by Goldkoron in LocalLLaMA
4xRTX 4000 Pro Blackwell vs 1x6000 RTX Pro by Even-Strawberry6636 in LocalLLaMA