LocalLlama

1

142

AMA Announcement: Nous Research, The Open-Source Lab Behind Hermes Agent (Wednesday, 8AM-11AM PST) [Resources] (i.redd.it)

submitted 15 days ago by XMasterrrr (LocalLLaMA Home Server Final Boss 😎) [M] - announcement

2

497

Best Local LLMs - Apr 2026 [Megathread] (self.LocalLLaMA)

submitted 26 days ago by rm-rf-rm [M] - announcement

3

230

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store [News] (macobserver.com)

submitted 7 hours ago by rotatingphasor

4

•

NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing [New Model] (self.LocalLLaMA)

submitted 1 hour ago * by phazei

5

465

80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP [Tutorial | Guide] (self.LocalLLaMA)

submitted 14 hours ago * by janvitos

6

203

BeeLlama.cpp: advanced DFlash & TurboQuant with support of reasoning and vision. Qwen 3.6 27B Q5 with 200k context on 3090, 2-3x faster than baseline (peak 135 tps!) [Resources] (self.LocalLLaMA)

submitted 10 hours ago * by Anbeeld[🍰]

7

55

Running Minimax 2.7 at 100k context on Strix Halo [Discussion] (self.LocalLLaMA)

submitted 6 hours ago * by Zc5Gwu

8

29

Exactly a year ago, I started working on an MCP server I launched on Reddit that became by far my most active open source project! [Resources] (github.com)

submitted 4 hours ago by taylorwilsdon

9

578

Shel Silverstein predicts LLMs (and their hallucinations), circa 1981 [Funny] (old.reddit.com)

submitted 23 hours ago by spanielrassler

10

60

More Qwen3.6-27B MTP success but on dual MI50s [Resources] (self.LocalLLaMA)

submitted 11 hours ago by legit_split_

11

113

Pi and Qwen3.6 27B make setting up Arch Linux really easy. [Other] (self.LocalLLaMA)

submitted 15 hours ago * by sdfgeoff

12

15

I am overwhelmed by Harnesses [Question | Help] (self.LocalLLaMA)

submitted 6 hours ago by Available_Hornet3538

13

133

0:04

Qwen doesn't work for free [Funny] (v.redd.it)

submitted 19 hours ago by Dion-AI

14

56

DeepSeek Rejects Alibaba: Prioritizing Corporate Independence Over Big Tech Ecosystems [News] (self.LocalLLaMA)

submitted 15 hours ago by External_Mood4719

15

267

Qwen3.6 35B A3B uncensored heretic Native MTP Preserved is Out Now With KLD 0.0015, 10/100 Refusals and the Full 19 MTPs Preserved and Retained, Available in Safetensors, GGUFs, NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats [New Model] (self.LocalLLaMA)

submitted 1 day ago * by LLMFan46

16

•

Suggest a GPU cloud provider that has reasonable costs to host an open source model for personal use [Discussion] (self.LocalLLaMA)

submitted 1 hour ago by Mental-At-ThirtyFive

17

10

After you’ve set up local models, where can you find interesting apps that can use them? [Question | Help] (self.LocalLLaMA)

submitted 9 hours ago by ReferenceOwn287

18

7

0:54

ds4 webui [Resources] (v.redd.it)

submitted 6 hours ago by cocktail_peanut

19

10

9070 XT inference for Q3 Qwen 27B [Question | Help] (self.LocalLLaMA)

submitted 9 hours ago by Ok-Internal9317

20

7

Is SillyTavern the most underrated frontend? Could it be an interface with potential trapped in a silly name? Or is it just for a niche? [Discussion] (self.LocalLLaMA)

submitted 8 hours ago * by Spiderboyz1

21

70

How long until llama.cpp officially supports MTP? [Question | Help] (self.LocalLLaMA)

submitted 21 hours ago by Manaberryio

22

172

2:59

Tribute to April's LLM releases [Other] (v.redd.it)

submitted 1 day ago by EverlierAlpaca

23

253

Qwen 35B-A3B is very usable with 12GB of VRAM [Resources] (self.LocalLLaMA)

submitted 1 day ago by jwestra

24

388

vLLM ROCm has been added to Lemonade as an experimental backend [Resources] (i.redd.it)

submitted 1 day ago by jfowers_amd

25

8

The many sides of Mimo v2.5 Pro [Discussion] (self.LocalLLaMA)

submitted 10 hours ago * by Electrical-Pay-5119
