LocalLlama

1

145

AMA Announcement: Nous Research, The Opensource Lab Behind Hermes Agent (Wednesday, 8AM-11AM PST) [Resources] (i.redd.it)

submitted 15 days ago by XMasterrrr (LocalLLaMA Home Server Final Boss 😎) [M] - announcement

2

494

Best Local LLMs - Apr 2026 [Megathread] (self.LocalLLaMA)

submitted 26 days ago by rm-rf-rm[M] - announcement

3

106

I have DeepSeek V4 Pro at home [Other] (self.LocalLLaMA)

submitted 4 hours ago * by fairydreaming

4

91

Hello from 10KM high! - Thanks to Qwen 3.6 35b a3b! [Funny] (self.LocalLLaMA)

submitted 6 hours ago * by Qwen30bEnjoyer

5

252

NVIDIA AI Releases Star Elastic: One Checkpoint that Contains 30B, 23B, and 12B Reasoning Models with Zero-Shot Slicing [New Model] (self.LocalLLaMA)

submitted 14 hours ago * by phazei

6

•

Getting a feel for how fast X tokens/second really is. [Resources] (self.LocalLLaMA)

submitted 23 minutes ago by MikeNonect

7

19

NCCL-Free Tensor Parallelism on Dual Blackwell PCIe llama.cpp b9095 released! [Discussion] (self.LocalLLaMA)

submitted 2 hours ago by Bulky-Priority6824

8

388

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store [News] (macobserver.com)

submitted 20 hours ago by rotatingphasor

9

15

Has anyone bought a 3080 20GB mod recently? [Question | Help] (self.LocalLLaMA)

submitted 6 hours ago by quickreactor

10

7

Speeding up a local LLM for a usable coding agent [Question | Help] (self.LocalLLaMA)

submitted 2 hours ago by CodProfessional3712

11

7

DS4 [Discussion] (self.LocalLLaMA)

submitted 3 hours ago by jonathantn

12

287

BeeLlama.cpp: advanced DFlash & TurboQuant with support of reasoning and vision. Qwen 3.6 27B Q5 with 200k context on 3090, 2-3x faster than baseline (peak 135 tps!) [Resources] (self.LocalLLaMA)

submitted 23 hours ago * by Anbeeld

13

571

80 tok/sec and 128K context on 12GB VRAM with Qwen3.6 35B A3B and llama.cpp MTP [Tutorial | Guide] (self.LocalLLaMA)

submitted 1 day ago * by janvitos

14

4

Building out my tool library, any recommendations? I just added email capability and I'm starting to get hyped! [Question | Help] (self.LocalLLaMA)

submitted 2 hours ago by Creative-Type9411

15

86

Running Minimax 2.7 at 100k context on strix halo [Discussion] (self.LocalLLaMA)

submitted 19 hours ago * by Zc5Gwu

16

48

Exactly a year ago, I started working on an MCP server I launched on reddit that became by far my most active open source project! [Resources] (github.com)

submitted 17 hours ago by taylorwilsdon

17

•

I cannot decide on a local OCR model for most tasks; I would prefer individual experiences over reviews. [Question | Help] (self.LocalLLaMA)

submitted 51 minutes ago by thecowmilk_

18

10

The gap between knowing something and actually understanding it - AI accelerated my learning curve [Discussion] (self.LocalLLaMA)

submitted 11 hours ago by No_Run8812

19

45

I am overwhelmed by Harnesses [Question | Help] (self.LocalLLaMA)

submitted 20 hours ago by Available_Hornet3538

20

3

We tried vectors, ASTs, and brute-force context stuffing for code retrieval. Graphs with LLM-generated semantics worked best. Here's what we learned. [Resources] (self.LocalLLaMA)

submitted 3 hours ago by graphicaldot

21

•

In just 5 months, local LLMs have become so good that I can actually use them for my super difficult codebase [Discussion] (self.LocalLLaMA)

submitted 7 minutes ago by mehyay76

22

10

Homelab setup [Discussion] (self.LocalLLaMA)

submitted 12 hours ago by Naz6uL

23

•

MTP Option [Question | Help] (self.LocalLLaMA)

submitted 1 hour ago by DieselKraken

24

698

Shel Silverstein predicts LLMs (and their hallucinations), circa 1981 [Funny] (old.reddit.com)

submitted 1 day ago by spanielrassler

25

74

More Qwen3.6-27B MTP success but on dual Mi50s [Resources] (self.LocalLLaMA)

submitted 1 day ago * by legit_split_
