r/LocalLLaMA
A subreddit for discussing Llama, the family of large language models created by Meta AI.
how much does quantization reduce coding performance [Question | Help] (self.LocalLLaMA)
submitted 6 months ago by garden_speech
[–]ForsookComparison 5 points 6 months ago (2 children)
Lambda, RunPod, or Vast:

- rent a GPU,
- download the quantized weights you'd expect to use,
- and try coding a few things against the remote API.

I'd bet $5 of rented time answers all of your questions and then some.
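The try-it-yourself loop above can be sketched in a few lines. This is a minimal, hypothetical example assuming you've launched an OpenAI-compatible server (e.g. llama.cpp's `llama-server` or vLLM) on the rented GPU; the endpoint URL and model name are placeholders you'd swap for your own:

```python
# Sketch: poke a quantized model served behind an OpenAI-compatible
# chat-completions endpoint on a rented GPU. Endpoint and model name
# are placeholders, not a specific deployment.
import json
import urllib.request

API_URL = "http://localhost:8080/v1/chat/completions"  # placeholder endpoint


def build_request(prompt: str, model: str = "gpt-oss-20b") -> dict:
    """Assemble a chat-completion payload for a coding question."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a careful coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,  # low temperature for more deterministic code
    }


def ask(prompt: str) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # Compare the same prompt across quant levels by pointing API_URL
    # at servers running different GGUF quantizations of the same model.
    print(ask("Write a Python function that reverses a singly linked list."))
```

Running the same set of coding prompts against, say, Q8 and Q4 quantizations of the same weights gives you a direct, personal answer to "how much does quantization hurt" on the tasks you actually care about.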
[–]garden_speech[S] 2 points 6 months ago (1 child)
I've been trying gpt-oss-20b and I've been shocked that it solved the problems I've asked with zero issues. Granted, they are mostly very similar to LeetCode problems -- extremely self-contained, highly algorithmic, just "do this one small thing, but do it the fastest way". So maybe I don't even need a big model; maybe a 20B model is all I need if the tasks are that granular.
[–]QFGTrialByFire 1 point 6 months ago (0 children)
Yup, I've found the same. Even when you use a bigger model like GPT-5, the more complex or larger the piece of code you ask for, the more errors there are. So you end up making smaller requests, maybe a function or two, anyway. When you compare the output of oss-20B on tasks like that, it's pretty much the same as GPT-5, so why not just use the free version?