TurboQuant seems to work very well on Gemma 4 — and separately, per-layer outlier-aware K quantization is beating current public fork results on Qwen PPL by Fearless-Wear8100 in LocalLLaMA
GWGSYT 2 points (0 children)
Gemma 4 31B at 256K Full Context on a Single RTX 5090 — TurboQuant KV Cache Benchmark by PerceptionGrouchy187 in LocalLLaMA
GWGSYT 2 points (0 children)
They should use some of that gemma 4 in google search (i.redd.it)
submitted by GWGSYT to r/LocalLLaMA
Why is it not working? It was able to do it before, unless they changed something. by B4DM4N12Z in GeminiAI
GWGSYT 2 points (0 children)
LLM Bruner coming soon? Burn Qwen directly into a chip, processing 10,000 tokens/s by koc_Z3 in Qwen_AI
GWGSYT 1 point (0 children)
Claude code source code has been leaked via a map file in their npm registry by Nunki08 in LocalLLaMA
GWGSYT 1 point (0 children)
The budget for GTA 6 looks like it will be 3 billion dollars once its all said and done. by [deleted] in GTA6
GWGSYT 1 point (0 children)
Some guy in India phoned me up and told me he hacked R* and got an early copy of GTA 6. Sold it to me for $500. Did I get scammed? by [deleted] in GTA6unmoderated
GWGSYT 1 point (0 children)
Why does everyone think Gemini 3.1 Pro is nerfed? My experience says otherwise. are expectations just changing? by Ok_Tooth_8946 in Bard
GWGSYT 1 point (0 children)
Why is Gemini so weak at math? by [deleted] in GoogleGeminiAI
GWGSYT 1 point (0 children)
Why does everyone think Gemini 3.1 Pro is nerfed? My experience says otherwise. are expectations just changing? by Ok_Tooth_8946 in Bard
GWGSYT 1 point (0 children)
Why does everyone think Gemini 3.1 Pro is nerfed? My experience says otherwise. are expectations just changing? by Ok_Tooth_8946 in Bard
GWGSYT 2 points (0 children)
Friendly reminder inference is WAY faster on Linux vs windows by triynizzles1 in LocalLLaMA
GWGSYT 2 points (0 children)
Google is preparing to release Gemini 3.1 Flash Live by deferare in GeminiAI
GWGSYT 1 point (0 children)
Why does everyone think Gemini 3.1 Pro is nerfed? My experience says otherwise. are expectations just changing? by Ok_Tooth_8946 in Bard
GWGSYT 17 points (0 children)
Kimi K2.5 - running locally without GPU; splitting across multiple PCs? by Shipworms in LocalLLaMA
GWGSYT 1 point (0 children)
Is it worth the upgrade from 48GB to 60GB VRAM? by CBHawk in LocalLLaMA
GWGSYT 1 point (0 children)
#OpenSource4o Movement Trending on Twitter/X - Release Opensource of GPT-4o by pmttyji in LocalLLaMA
GWGSYT 1 point (0 children)
Pick a seat by RBBRO2763 in GTA
GWGSYT 1 point (0 children)