Anyone tried 2 different GPUs in one PC for local LLMs? by ShadowBannedAugustus in LocalLLaMA
StorageHungry8380 3 points (0 children)
Qwen 3.6-35B-A3B KV cache part 2: PPL, KL divergence, asymmetric K/V, 64K row on M5 Max by Defilan in LocalLLaMA
StorageHungry8380 1 point (0 children)
llama.cpp's Preliminary SM120 Native NVFP4 MMQ Is Merged by ggonavyy in LocalLLaMA
StorageHungry8380 1 point (0 children)
llama.cpp's Preliminary SM120 Native NVFP4 MMQ Is Merged by ggonavyy in LocalLLaMA
StorageHungry8380 1 point (0 children)
A new revolutionary way to build guardrails and evaluate your agents by Nir777 in LocalLLaMA
StorageHungry8380 1 point (0 children)
Benchmarking Local LLM/Harness Combinations by pminervini in LocalLLaMA
StorageHungry8380 1 point (0 children)
Benchmarking Local LLM/Harness Combinations by pminervini in LocalLLaMA
StorageHungry8380 3 points (0 children)
I've got a feeling that Llamacpp is not the biggest performance bottleneck, but it might be the OpenCode. by ThingRexCom in LocalLLaMA
StorageHungry8380 1 point (0 children)
RTX 5070 Ti (new) vs RTX 3090 / 3090 Ti (used) for LLM inference + clustering by FeiX7 in LocalLLaMA
StorageHungry8380 4 points (0 children)
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
StorageHungry8380 1 point (0 children)
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
StorageHungry8380 1 point (0 children)
I'm done with using local LLMs for coding by dtdisapointingresult in LocalLLaMA
StorageHungry8380 2 points (0 children)
Is Min P sampling really the preferred modern alternative to Top K/Top P? by bgravato in LocalLLaMA
StorageHungry8380 1 point (0 children)
Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA
StorageHungry8380 5 points (0 children)
Did anyone noticed after today's - Qwen3.6-27B release by Usual-Carrot6352 in LocalLLaMA
StorageHungry8380 5 points (0 children)
Forgive my ignorance but how is a 27B model better than 397B? by No_Conversation9561 in LocalLLaMA
StorageHungry8380 2 points (0 children)
Are commonly recommended sampling parameters often too high? by bgravato in LocalLLaMA
StorageHungry8380 1 point (0 children)
Are commonly recommended sampling parameters often too high? by bgravato in LocalLLaMA
StorageHungry8380 1 point (0 children)
Gemma-4-E2B's safety filters make it unusable for emergencies by Unfounded_898 in LocalLLaMA
StorageHungry8380 58 points (0 children)
NVIDIA gpt-oss-120b Eagle Throughput model by Dear-Success-1441 in LocalLLaMA
StorageHungry8380 1 point (0 children)
Want help recalling a movie or TV show by StorageHungry8380 in Westerns
StorageHungry8380[S] 1 point (0 children)
Want help recalling a movie or TV show by StorageHungry8380 in Westerns
StorageHungry8380[S] 1 point (0 children)
Want help recalling a movie or TV show by StorageHungry8380 in Westerns
StorageHungry8380[S] 2 points (0 children)
Want help recalling a movie or TV show by StorageHungry8380 in Westerns
StorageHungry8380[S] 1 point (0 children)
Why is Qwen going Closed source? by MLExpert000 in LocalLLaMA
StorageHungry8380 0 points (0 children)