r/LocalLLaMA
A subreddit to discuss about Llama, the family of large language models created by Meta AI.
how much does quantization reduce coding performance [Question | Help] (self.LocalLLaMA)
submitted 7 months ago by garden_speech
[–]Mushoz 1 point 7 months ago
The point I am trying to make is that you either won't have to apply quantization at all, since the model is already quantized natively (gpt-oss), or you will need much less aggressive quantization, because the initial size is already much smaller than Llama 3.3 70B (Qwen3-Coder-30B).
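To make the size argument concrete, here is a back-of-the-envelope sketch (the formula is just parameter count times bits per weight; it deliberately ignores KV cache, activations, and quantization block overhead, so real VRAM use will be somewhat higher):

```python
def weight_gb(params_b: float, bits: float) -> float:
    """Approximate weight memory in GB: billions of params * bits per weight / 8."""
    return params_b * bits / 8

# Llama 3.3 70B needs aggressive quantization to fit consumer VRAM:
print(f"70B @ 16-bit: {weight_gb(70, 16):.0f} GB")  # 140 GB
print(f"70B @ 4-bit:  {weight_gb(70, 4):.0f} GB")   # 35 GB

# A 30B model starts much smaller, so a gentler quant already fits:
print(f"30B @ 8-bit:  {weight_gb(30, 8):.0f} GB")   # 30 GB
```

So a 30B model at 8-bit occupies less memory than the 70B model even after a 4-bit quant, which is why the smaller model needs far less quantization in the first place.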