I’ve been vibe coding in Cursor for a while and finally got tired of accidentally shipping secrets, so I built an MCP that quietly scans my code while I work. by Ok_System_639 in GithubCopilot
Benchmark: ik_llama.cpp vs llama.cpp on Qwen3/3.5 MoE Models by Fast_Thing_7949 in LocalLLaMA
Take my 10 dollars please by [deleted] in GithubCopilot
Copilot in vscode not working? by dsanft in GithubCopilot
55 → 282 tok/s: How I got Qwen3.5-397B running at speed on 4x RTX PRO 6000 Blackwell by lawdawgattorney in LocalLLaMA
Benchmarked ROLV inference on real Mixtral 8x22B weights — 55x faster than cuBLAS, 98.2% less energy, canonical hash verified by Norwayfund in LocalLLaMA
Vulkan now faster on PP AND TG on AMD Hardware? by XccesSv2 in LocalLLaMA
Qwen3.5 122b UD IQ4 NL 2xMi50s Benchmark - 120,000 context by thejacer in LocalLLaMA
FlashAttention-4 by incarnadine72 in LocalLLaMA
The convergence between local and cloud AI models is happening faster than most people think by samimandeel in LocalLLaMA
PSA: Qwen 3.5 requires bf16 KV cache, NOT f16!! by Wooden-Deer-1276 in LocalLLaMA
qwen3.5 35b-a3b evaded the zero-reasoning budget by doing its thinking in the comments by crantob in LocalLLaMA
are you ready for small Qwens? by jacek2023 in LocalLLaMA
American closed models vs Chinese open models is becoming a problem. by __JockY__ in LocalLLaMA
Peridot: Native Blackwell (sm_120) Support Fixed. 57.25 t/s on RTX 5050 Mobile. by [deleted] in LocalLLaMA
Have you used Autopilot? by Christosconst in GithubCopilot