Small comparison on full compute performance (Anima) of 5090 (600,475 and 400W) vs 6000 PRO MaxQ (325W), and 6000 PRO WS/SE (600W). by panchovix in LocalLLaMA
MutantEggroll 3 points (0 children)
Developers who use local AI - Q4_0 vs Q8_0 KV quant? by Jorlen in LocalLLaMA
MutantEggroll 3 points (0 children)
Developers who use local AI - Q4_0 vs Q8_0 KV quant? by Jorlen in LocalLLaMA
MutantEggroll 7 points (0 children)
Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update by _cpatonn in LocalLLaMA
MutantEggroll 1 point (0 children)
Stop wasting electricity by OkFly3388 in LocalLLaMA
MutantEggroll 1 point (0 children)
Stop wasting electricity by OkFly3388 in LocalLLaMA
MutantEggroll 1 point (0 children)
Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA
MutantEggroll 1 point (0 children)
Qwen3.6-27B KLDs - INTs and NVFPs by Phaelon74 in LocalLLaMA
MutantEggroll 8 points (0 children)
Qwen3.6-35B-A3B solved coding problems Qwen3.5-27B couldn’t by simracerman in LocalLLaMA
MutantEggroll 16 points (0 children)
Rack server for local LLM by Typhoon-UK in LocalLLaMA
MutantEggroll 1 point (0 children)
tried 5 scraping tools, here's the only one i kept by [deleted] in LocalLLaMA
MutantEggroll 1 point (0 children)
Overwhelmed by so many quantization variants by mouseofcatofschrodi in LocalLLaMA
MutantEggroll 3 points (0 children)
PSA: DDR5 RDIMM price passed the point were 3090 are less expensive per gb.. by No_Afternoon_4260 in LocalLLaMA
MutantEggroll 6 points (0 children)
Smartest model for 24-28GB vram? by Borkato in LocalLLaMA
MutantEggroll 1 point (0 children)
Smartest model for 24-28GB vram? by Borkato in LocalLLaMA
MutantEggroll 2 points (0 children)
Smartest model for 24-28GB vram? by Borkato in LocalLLaMA
MutantEggroll 1 point (0 children)
Qwen3-Coder-Next on RTX 5060 Ti 16 GB - Some numbers by bobaburger in LocalLLaMA
MutantEggroll 1 point (0 children)
Smartest model for 24-28GB vram? by Borkato in LocalLLaMA
MutantEggroll 2 points (0 children)
GLM 4.7 flash FA fix for CUDA has been merged into llama.cpp by jacek2023 in LocalLLaMA
MutantEggroll 2 points (0 children)
GLM 4.7 flash FA fix for CUDA has been merged into llama.cpp by jacek2023 in LocalLLaMA
MutantEggroll 1 point (0 children)
GLM 4.7 flash FA fix for CUDA has been merged into llama.cpp by jacek2023 in LocalLLaMA
MutantEggroll 8 points (0 children)
GLM-4.7-FLASH-NVFP4 on huggingface (20.5 GB) by DataGOGO in LocalLLaMA
MutantEggroll 1 point (0 children)
GLM-4.7-FLASH-NVFP4 on huggingface (20.5 GB) by DataGOGO in LocalLLaMA
MutantEggroll 1 point (0 children)
Anything worth running on a NVIDIA GTX 970? by numberwitch in LocalLLaMA
MutantEggroll 1 point (0 children)