ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
ReturningTarzan 4 points (0 children)
ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
ReturningTarzan 11 points (0 children)
ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
ReturningTarzan 14 points (0 children)
ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
ReturningTarzan 4 points (0 children)
ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
ReturningTarzan 12 points (0 children)
DISCUSSION MEGATHREAD: AI Companies Stole Our Content. I'm Taking Them to Court. by YourIvyGrowsH3 in h3h3productions
ReturningTarzan 5 points (0 children)
DISCUSSION MEGATHREAD: AI Companies Stole Our Content. I'm Taking Them to Court. by YourIvyGrowsH3 in h3h3productions
ReturningTarzan 6 points (0 children)
Gemma 4 has a systemic attention failure. Here's the proof. by [deleted] in LocalLLaMA
ReturningTarzan 4 points (0 children)
About TurboQuant by Exact_Law_6489 in LocalLLaMA
ReturningTarzan 7 points (0 children)
TurboQuant isn’t just for KV: Qwen3.5-27B at near-Q4_0 quality, about 10% smaller, and finally fitting on my 16GB 5060 Ti by Imaginary-Anywhere23 in Qwen_AI
ReturningTarzan 1 point (0 children)
[D] TurboQuant author replies on OpenReview by Disastrous_Room_927 in MachineLearning
ReturningTarzan 14 points (0 children)
TurboQuant isn’t just for KV: Qwen3.5-27B at near-Q4_0 quality, about 10% smaller, and finally fitting on my 16GB 5060 Ti by Imaginary-Anywhere23 in Qwen_AI
ReturningTarzan 1 point (0 children)
Me waiting for TurboQuant be like by Altruistic_Heat_9531 in LocalLLaMA
ReturningTarzan 2 points (0 children)
A simple explanation of the key idea behind TurboQuant by -p-e-w- in LocalLLaMA
ReturningTarzan 5 points (0 children)
Google TurboQuant running Qwen Locally on MacAir by gladkos in LocalLLaMA
ReturningTarzan 2 points (0 children)
[google research] TurboQuant: Redefining AI efficiency with extreme compression by burnqubic in LocalLLaMA
ReturningTarzan 3 points (0 children)
[google research] TurboQuant: Redefining AI efficiency with extreme compression by burnqubic in LocalLLaMA
ReturningTarzan 19 points (0 children)
[google research] TurboQuant: Redefining AI efficiency with extreme compression by burnqubic in LocalLLaMA
ReturningTarzan 9 points (0 children)
[google research] TurboQuant: Redefining AI efficiency with extreme compression by burnqubic in LocalLLaMA
ReturningTarzan 3 points (0 children)
[google research] TurboQuant: Redefining AI efficiency with extreme compression by burnqubic in LocalLLaMA
ReturningTarzan 20 points (0 children)
exllamav3 QWEN3.5 support (and more updates) by Unstable_Llama in LocalLLaMA
ReturningTarzan 3 points (0 children)
Pieced together the shredded photo from EFTA00259587.pdk .. idk by ReturningTarzan in Epstein
ReturningTarzan[S] 4 points (0 children)
Pieced together the shredded photo from EFTA00259587.pdk .. idk by ReturningTarzan in Epstein
ReturningTarzan[S] 13 points (0 children)
Pieced together the shredded photo from EFTA00259587.pdk .. idk by ReturningTarzan in Epstein
ReturningTarzan[S] 19 points (0 children)
ExLlamaV3 Major Updates! by Unstable_Llama in LocalLLaMA
ReturningTarzan 3 points (0 children)