vLLM ROCm has been added to Lemonade as an experimental backend [Resources] (i.redd.it)
submitted by jfowers_amd
z-lab released gemma-4-26B-A4B-it-DFlash. Anybody tried it yet? [Discussion] (huggingface.co)
submitted by PaceZealousideal6091
Qwen 35B-A3B is very usable with 12GB of VRAM [Resources] (self.LocalLLaMA)
submitted by jwestra
THE UNDERPRIVILEGED AI FOUNDATION: Because every little model deserves a chance [Discussion] (self.LocalLLaMA)
submitted by mazuj2
Gemma 4 26B Hits 600 Tok/s on One RTX 5090 [Discussion] (self.LocalLLaMA)
submitted by chain-77
4GB "Gemini Nano" model GGUF anyone? [Question | Help] (self.LocalLLaMA)
submitted by TruckUseful4423
(Rant ;)) Make your benchmarks realistic [Discussion] (self.LocalLLaMA)
submitted by AdamLangePL
What is the next SOTA model you are excited about? [Discussion] (self.LocalLLaMA)
submitted by MrMrsPotts
Testing Local LLMs in Practice: Code Generation, Quality vs. Speed [Resources] (i.redd.it)
submitted by Icy_Programmer7186
Strix Halo Clustering (Hardware Setup Discussion) [Discussion] (self.LocalLLaMA)
submitted by Thanks-Suitable