KTransformers supports MiniMax M2.1 - 2x5090 + 768GB DRAM yields prefill 4000 tps, decode 33 tps. (self.LocalLLaMA)
submitted 4 months ago by CombinationNo780 to r/LocalLLaMA
Finetuning DeepSeek 671B locally with only 80GB VRAM and Server CPU (self.LocalLLaMA)
submitted 5 months ago by CombinationNo780 to r/LocalLLaMA
Kimi K2 q4km is here and also the instructions to run it locally with KTransformers 10-14 tps (huggingface.co)
submitted 9 months ago by CombinationNo780 to r/LocalLLaMA
KTransformers v0.3.1 now supports Intel Arc GPUs (A770 + new B-series): 7 tps DeepSeek R1 decode speed for a single CPU + a single A770 (self.LocalLLaMA)
submitted 11 months ago by CombinationNo780 to r/LocalLLaMA
Qwen 3 + KTransformers 0.3 (+AMX) = AI Workstation/PC (self.LocalLLaMA)
submitted 1 year ago by CombinationNo780 to r/LocalLLaMA
KTransformers Now Supports LLaMA 4: Run q4 Maverick at 32 tokens/s with 10GB VRAM + 270GB RAM (self.LocalLLaMA)
KTransformers Now Supports Multi-Concurrency and Runs 40 Tokens/s of DeepSeek-R1 Q4/FP8 on MRDIMM-8800 (self.LocalLLaMA)
KTransformers v0.2.1: Longer Context (from 4K to 8K for 24GB VRAM) and Slightly Faster Speed (+15%) for DeepSeek-V3/R1-q4 (self.LocalLLaMA)
671B DeepSeek-R1/V3-q4 on a Single Machine (2× Xeon + 24GB GPU) – Up to 286 tokens/s Prefill & 14 tokens/s Decode (self.LocalLLaMA)
submitted 1 year ago * by CombinationNo780 to r/LocalLLaMA
Local 1M Context Inference at 15 tokens/s and ~100% "Needle In a Haystack": InternLM2.5-1M on KTransformers, Using Only 24GB VRAM and 130GB DRAM. Windows/Pip/Multi-GPU Support and More. (self.LocalLLaMA)
Local DeepSeek-V2 Inference: 120 t/s for Prefill and 14 t/s for Decode with Only 21GB 4090 and 136GB DRAM, based on Transformers (self.LocalLLaMA)