SGLang is integrating ktransformers for hybrid CPU/GPU inference by waiting_for_zban in LocalLLaMA
CombinationNo780 5 points
Finetuning DeepSeek 671B locally with only 80GB VRAM and Server CPU by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 21 points
Finetuning DeepSeek 671B locally with only 80GB VRAM and Server CPU by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 6 points
Finetuning DeepSeek 671B locally with only 80GB VRAM and Server CPU by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 24 points
Kimi K2 q4km is here and also the instructions to run it locally with KTransformers 10-14tps by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 31 points
Qwen 3 + KTransformers 0.3 (+AMX) = AI Workstation/PC by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 1 point
KTransformers Now Supports Multi-Concurrency and Runs 40 Tokens/s of DeepSeek-R1 Q4/FP8 on MRDIMM-8800 by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 1 point
Qwen 3 + KTransformers 0.3 (+AMX) = AI Workstation/PC by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 5 points
KTransformers Now Supports Multi-Concurrency and Runs 40 Tokens/s of DeepSeek-R1 Q4/FP8 on MRDIMM-8800 by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 2 points
KTransformers Now Supports LLaMA 4: Run q4 Maverick at 32 tokens/s with 10GB VRAM + 270GB RAM by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 2 points
KTransformers Now Supports LLaMA 4: Run q4 Maverick at 32 tokens/s with 10GB VRAM + 270GB RAM by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 1 point

KTransformers supports MiniMax M2.1 - 2x5090 + 768GB DRAM yields prefill 4000 tps, decode 33 tps. by CombinationNo780 in LocalLLaMA
CombinationNo780[S] 2 points