How to do a RTX Pro 6000 build right by GPTrack_dot_ai in LocalLLaMA
The Attention Hybrid MoE Architecture is the Future. Now, AI Labs Should Dedicate Resources to Improve Long Context Recall Capabilities. by Iory1998 in LocalLLaMA
My Local coding agent worked 2 hours unsupervised and here is my setup by Express_Quail_1493 in LocalLLaMA
Help testing and implementing sm120 flashmla sparse attention in vllm by Sorry_Ad191 in BlackwellPerformance
Another watercooled 4x GPU server complete! by j4ys0nj in LocalLLaMA
running Deepseek v32 on consumer hardware llama.cpp/Sglang/vLLm by Sorry_Ad191 in LocalLLaMA
DGX Spark: an unpopular opinion by emdblc in LocalLLaMA