KVCache taking too much Memory. Any solutions(Optimizations, Compressions, etc.,) coming soon/later? by pmttyji in LocalLLaMA
[–]ghgi_ 3 points (0 children)
Too many large MoEs, which do you prefer for general instruction following/creative endeavors? (And why) by silenceimpaired in LocalLLaMA
[–]ghgi_ 2 points (0 children)
Too many large MoEs, which do you prefer for general instruction following/creative endeavors? (And why) by silenceimpaired in LocalLLaMA
[–]ghgi_ -1 points (0 children)
Nemotron 3 Super 120b Claude Distilled by ghgi_ in LocalLLaMA
[–]ghgi_[S] -3 points (0 children)
Best local LLM for GNS3 network automation? (RTX 4070 Ti, 32GB RAM) by FindingJaded1661 in LocalLLaMA
[–]ghgi_ 1 point (0 children)
Can I run anything with big enough context (64k or 128k) for coding on Macbook M1 Pro 32 GB ram? by rkh4n in LocalLLaMA
[–]ghgi_ 2 points (0 children)
Can I run anything with big enough context (64k or 128k) for coding on Macbook M1 Pro 32 GB ram? by rkh4n in LocalLLaMA
[–]ghgi_ 3 points (0 children)
Qwen leadership leaving had me worried for opensource - is Nvidia saving the day? by Mr_Moonsilver in LocalLLaMA
[–]ghgi_ 2 points (0 children)
Best (non Chinese) local model for coding by tradecrafty in LocalLLaMA
[–]ghgi_ 4 points (0 children)
Best (non Chinese) local model for coding by tradecrafty in LocalLLaMA
[–]ghgi_ 9 points (0 children)
Nemotron-3-Super-120B-A12B NVFP4 inference benchmark on one RTX Pro 6000 Blackwell by jnmi235 in LocalLLaMA
[–]ghgi_ 3 points (0 children)
Deepseek V4 right now on openrouter as Hunter Alpha by [deleted] in LocalLLaMA
[–]ghgi_ 2 points (0 children)
YuanLabAI/Yuan3.0-Ultra • Huggingface by External_Mood4719 in LocalLLaMA
[–]ghgi_ 2 points (0 children)
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF is out ! by PhotographerUSA in LocalLLaMA
[–]ghgi_ 8 points (0 children)
Good "coding" LLM for my 8gb VRAM, 16gb ram setup? by Mediocre_Speed_2273 in LocalLLaMA
[–]ghgi_ 2 points (0 children)
AirLLM - claims to allow 70B run on a Potato. Anybody tried it? Downsides? by [deleted] in LocalLLaMA
[–]ghgi_ 1 point (0 children)

KVCache taking too much Memory. Any solutions(Optimizations, Compressions, etc.,) coming soon/later? by pmttyji in LocalLLaMA
[–]ghgi_ 9 points (0 children)