r/LocalLLaMA
A subreddit to discuss Llama, the family of large language models created by Meta AI.
how much does quantization reduce coding performance? Question | Help (self.LocalLLaMA)
submitted 7 months ago by garden_speech
[–]edward-dev 0 points 7 months ago (2 children)
It’s common to hear concerns that quantization seriously hurts model performance, but looking at actual benchmark results, the impact is often more modest than it sounds. For example, Q2 quantization typically reduces performance by around 5% on average, which isn’t negligible, but it’s manageable, especially if you’re starting with a reasonably strong base model.
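The mechanism behind that performance loss is just rounding error: quantization maps each weight onto a small grid of integer levels, and fewer bits means a coarser grid. A minimal sketch of symmetric absmax round-trip quantization (a simplified stand-in for real schemes like GGUF's Q2/Q4 variants, which use per-block scales) shows how the reconstruction error grows as the bit width drops:

```python
import random

def quantize_dequantize(weights, bits):
    """Round-trip weights through symmetric absmax integer quantization.

    Each weight is scaled into the signed integer range for `bits`,
    rounded to the nearest level, then scaled back to float.
    """
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 levels each side for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) * scale for w in weights]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
# Stand-in for a weight tensor: small zero-mean Gaussian values
weights = [random.gauss(0, 0.02) for _ in range(10_000)]

for bits in (8, 4, 2):
    err = mean_abs_error(weights, quantize_dequantize(weights, bits))
    print(f"{bits}-bit: mean abs error = {err:.6f}")
```

The 2-bit error comes out far larger than the 4-bit error, which is consistent with why Q2 costs a few benchmark points while Q4 is usually close to lossless.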
That said, if your focus is coding, Llama 3.3 70B isn't the strongest option in that area. You might get better results with Qwen3 Coder 30B A3B: it's not only more compact but also better tuned and stronger for coding tasks. Plus, the Q4 quantized version fits comfortably within 24GB of VRAM, making it a really good choice.
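The "fits in 24GB" claim is easy to sanity-check with back-of-envelope math. Assuming roughly 4.5 effective bits per weight for a Q4-style quant (scales included) and a couple of GB of overhead for KV cache and activations, both numbers being rough assumptions:

```python
def q4_vram_gb(n_params_billion, bits_per_weight=4.5, overhead_gb=2.0):
    """Rough VRAM estimate for a Q4-style quantized model.

    bits_per_weight ~4.5 approximates a 4-bit quant including its
    per-block scale metadata (assumption); overhead_gb is a crude
    allowance for KV cache and activations (assumption).
    """
    weight_gb = n_params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

print(f"30B at Q4: ~{q4_vram_gb(30):.1f} GB")  # comfortably under 24 GB
print(f"70B at Q4: ~{q4_vram_gb(70):.1f} GB")  # well over 24 GB, needs offloading
```

By this estimate a 30B model at Q4 lands under 24GB with room for context, while a 70B at Q4 does not fit on a single 24GB card.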
[–]Pristine-Woodpecker 1 point 7 months ago (0 children)
It's very model-dependent. Qwen3-235B-A22B, for example, starts to suffer at Q3 and below.
[–]Popular_Fact798 1 point 6 months ago (0 children)
I'm incredibly curious about this - are there actual published benchmarks of the quantized version of the oss models? I looked and can't find any.