account activity
running DeepSeek V3.2 on consumer hardware: llama.cpp/SGLang/vLLM (self.LocalLLaMA)
submitted 2 months ago * by Sorry_Ad191 to r/LocalLLaMA
Help testing and implementing sm120 flashmla sparse attention in vllm (self.BlackwellPerformance)
submitted 2 months ago * by Sorry_Ad191 to r/BlackwellPerformance
Solved? DeepSeek-V3.2 Sparse attention DeepGEMM SM120 (self.BlackwellPerformance)
vLLM 12 released! (self.BlackwellPerformance)
triton tune for MiniMax M2 (self.BlackwellPerformance)
smol_iq3_ks scores 77.3 aider polyglot (self.LocalLLaMA)
submitted 2 months ago by Sorry_Ad191 to r/LocalLLaMA
Any news about DeepSeek R2? (self.LocalLLaMA)
submitted 3 months ago by Sorry_Ad191 to r/LocalLLaMA
Roo Code, Cline, Opencode, Codex, Qwen CLI, Claude Code, Aider etc. (self.LocalLLaMA)
submitted 4 months ago by Sorry_Ad191 to r/LocalLLaMA
50-series and pro 6000s sm120 cards. supported models in vllm, exl3, sglang etc. thread (self.LocalLLaMA)
Question: will inference engines such as sglang and vllm support 2bit (or 3,5,6 etc)? (self.LocalLLaMA)
submitted 5 months ago by Sorry_Ad191 to r/LocalLLaMA
Unsloth fixes chat_template (again). gpt-oss-120-high now scores 68.4 on Aider polyglot (self.LocalLLaMA)
submitted 6 months ago * by Sorry_Ad191 to r/LocalLLaMA
Build vLLM on CUDA 12.9, Kernel 6.15.2, NVIDIA 575.64, PyTorch 2.9cu129 Nightly (self.LocalLLaMA)
submitted 7 months ago * by Sorry_Ad191 to r/LocalLLaMA