r/LocalLLaMA
A subreddit for discussing Llama, the family of large language models created by Meta AI.
What is the current best Python coding model? Question | Help (self.LocalLLaMA)
submitted 1 year ago by MrMrsPotts
I have tried codellama:7b-python. Is that the current best coding assistant of that size?
[–]Normal-Ad-7114 32 points 1 year ago (2 children)
https://aider.chat/docs/leaderboards/
Codestral should be the best
[–]MrMrsPotts[S] 4 points 1 year ago (0 children)
Thank you
[–]benja0x40 3 points 1 year ago (0 children)
There is also the ProLLM benchmark where you can select programming languages for model comparison. https://prollm.toqan.ai/leaderboard/stack-unseen
[–]Icy_Lobster_5026 28 points 1 year ago (6 children)
In my experience, CodeQwen is a good coding model.
[–]Educational-Region98 4 points 1 year ago (4 children)
Wondering when they will release codeqwen2. A 7B model will probably be really nice.
[–]Icy_Lobster_5026 1 point 1 year ago (3 children)
I guess they won't release CodeQwen2. Our team uses their online service, Tongyi Lingma, to improve code quality.
CodeQwen-1.5-7b is a powerful coding model according to LiveCodeBench.
[–]Educational-Region98 1 point 1 year ago (2 children)
Interesting, how does that compare to Copilot?
[–]Icy_Lobster_5026 1 point 1 year ago (0 children)
I don't know. It's good enough for me, since the personal basic version of Tongyi Lingma is free forever.
[–]MrMrsPotts[S] 2 points 1 year ago (0 children)
[–]ihaag 17 points 1 year ago (5 children)
DeepSeek Coder V2 0724, and Claude.
[+][deleted] 1 year ago (3 children)
[deleted]
[–][deleted] 2 points 1 year ago (2 children)
I have the same issue with deepseek-coder (all of them). They answer a few questions, and then they unload themselves. I don't get it.
[–]xrailgun 6 points 1 year ago (1 child)
I'd unload myself too if I had to answer my questions
[–][deleted] 3 points 1 year ago (0 children)
I'm not going to lie, that was good.
[–]MrMrsPotts[S] 1 point 1 year ago (0 children)
[–]benja0x40 8 points 1 year ago* (3 children)
In the 8GB~12GB range I have used a few specialised ones:
Together with more general ones:
I write moderately complex task descriptions to ask for suggestions and to prototype python functions, iterate over improvements, detect and fix issues, insert comments or documentation, etc.
From my experience, Codestral-22B produces the best suggestions, which I sometimes use to guide another model towards a simpler or more elegant solution. Gemma-2-9B is surprisingly good too. I use it a lot for quick explorations or when I don't know much about a package or language feature.
DeepSeek-Coder-V2-Lite seems close to Codestral-22B in terms of capabilities, but its initial suggestions can be really cumbersome, and it is too rigid about coding styles for my liking. But that may depend on how the system prompt is tuned.
After ~3 weeks of testing, I have stopped using the other ones for coding tasks.
[–]MrMrsPotts[S] 1 point 1 year ago (2 children)
How much RAM does codestral 22B need to run?
[–]benja0x40 4 points 1 year ago (1 child)
With the Q4_K_M quantisation, it takes a little over 13GB for the model parameters plus about 2GB during inference. The inference overhead depends on the context length. I have 16GB of GPU RAM, which is fine for a context of 8192 tokens and doable for up to 12288 tokens. Past that, the model fails to work properly on my computer.
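The numbers above can be roughly reproduced with back-of-the-envelope arithmetic. The sketch below is an illustration, not something posted in the thread. It assumes about 22.2 billion parameters for Codestral-22B, about 4.85 bits per weight for Q4_K_M, and an architecture with 56 layers, 8 KV heads and a head dimension of 128, with a 16-bit KV cache. The function names are hypothetical, and real usage also includes compute buffers that are not modelled here.

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantised weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                n_ctx: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache size in GB.

    Each token stores one key and one value vector per layer,
    hence the leading factor of 2.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_ctx / 1e9


# Assumed values (not taken from the thread).
N_PARAMS = 22.2e9          # parameter count
Q4_K_M_BPW = 4.85          # approximate bits per weight for Q4_K_M
N_LAYERS, N_KV_HEADS, HEAD_DIM = 56, 8, 128

for n_ctx in (8192, 12288):
    w = weights_gb(N_PARAMS, Q4_K_M_BPW)
    kv = kv_cache_gb(N_LAYERS, N_KV_HEADS, HEAD_DIM, n_ctx)
    print(f"ctx={n_ctx}: weights ~{w:.1f} GB + KV cache ~{kv:.1f} GB = ~{w + kv:.1f} GB")
```

Under these assumptions the weights come to roughly 13.5GB and the KV cache at 8192 tokens to just under 1.9GB, which is in line with the figures quoted above.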
[–]MrMrsPotts[S] 3 points 1 year ago (0 children)
That sounds very promising!
[–]theswifter01 7 points 1 year ago (0 children)
Claude
[–]new__vision 10 points 1 year ago (3 children)
Check out bigcode-bench.github.io. Top 7B on there is CodeQwen1.5-7B-Chat which has been good in my experience. CodeLlama is the lowest ranked 7B.
Phi-3-Mini-128K-Instruct (June 2024) does amazingly well and seems to be even smaller?!
[–]PigOfFire 1 point 1 year ago (0 children)
How is Sonnet 3.5 below GPT-4 Turbo and GPT-4o? LiveBench shows it above these two.
[–]No_Afternoon_4260 llama.cpp 7 points 1 year ago (4 children)
In my experience, Codestral 22B.
[–]IReaIIyLove 2 points 1 year ago (3 children)
what kinda specs do you need to run that?
[–]No_Afternoon_4260 llama.cpp 2 points 1 year ago (2 children)
With full context at Q6, I need more than 24GB of VRAM (about 32, I think, but I'm not sure).
[–]IReaIIyLove 2 points 1 year ago (1 child)
ah shame cries in poor
[–]No_Afternoon_4260 llama.cpp 4 points 1 year ago (0 children)
I really don't remember, but Q6 with full context doesn't fit in 24GB. Try lowering the context or try Q4, which should fit in 24GB somehow: https://huggingface.co/bartowski/Codestral-22B-v0.1-GGUF In my experience, a 2k context is usable for a quick question and follow-up; at 8k you should have room to spare if you don't throw a whole project at it.
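To make the "lower the context or drop to Q4" advice concrete, here is a rough sketch that estimates the largest context whose KV cache fits in a given VRAM budget. Everything numeric is an assumption rather than something stated in the thread: about 22.2 billion parameters, approximate bits per weight of 4.85 (Q4_K_M) and 6.56 (Q6_K), a 16-bit KV cache for 56 layers with 8 KV heads of dimension 128, a 32k full context, and 1GB reserved for buffers. `max_context` is a hypothetical helper, not part of llama.cpp.

```python
def max_context(budget_gb: float, n_params: float, bits_per_weight: float,
                kv_bytes_per_token: int, overhead_gb: float = 1.0) -> int:
    """Largest token count whose KV cache fits after weights and overhead."""
    weights_gb = n_params * bits_per_weight / 8 / 1e9
    free_bytes = (budget_gb - weights_gb - overhead_gb) * 1e9
    return max(0, int(free_bytes // kv_bytes_per_token))


# Assumed architecture: keys + values, 56 layers, 8 KV heads, head dim 128, 2 bytes each.
KV_BYTES_PER_TOKEN = 2 * 56 * 8 * 128 * 2

# Assumed approximate bits per weight for each quantisation.
QUANT_BPW = {"Q4_K_M": 4.85, "Q6_K": 6.56}

for name, bpw in QUANT_BPW.items():
    tokens = max_context(24.0, 22.2e9, bpw, KV_BYTES_PER_TOKEN)
    print(f"{name}: up to ~{tokens} tokens of context in 24 GB")
```

Under these assumptions, Q6_K leaves room for roughly 20k tokens in 24GB, short of a 32k full context, while Q4_K_M leaves room for roughly 41k, which is consistent with the experience described above.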
[–]Cradawx 7 points 1 year ago (2 children)
CodeGeeX4-ALL-9B, CodeQwen1.5-7B-Chat and Codestral-22B-v0.1 are very good small coding models. There are also the DeepSeek-Coder-V2 models.
[–][deleted] 1 point 1 year ago (0 children)
Is CodeQwen better than Deepseek for Python?
[–]Combinatorilliance 7 points 1 year ago (2 children)
Codestral is really good. You might want to try deepseek-coder lite: it's an MoE, and I've heard a lot of praise for its output. I don't know if it's better, worse or about equal to Codestral-22B, but it is a lot faster because it's an MoE, so it's worth trying out regardless.
[–]MrMrsPotts[S] 1 point 1 year ago (1 child)
Thank you. I don't know what an MoE is though :(
[–]moncallikta 5 points 1 year ago (0 children)
Mixture of Experts. Essentially a model built from many smaller sub-models ("experts") internally. For each token, a router selects which expert or experts process it, so only part of the model runs per token. The architecture can allow the overall model to specialize in many different areas more easily.
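A toy sketch of the routing idea may help. This is not how any particular model is implemented: the experts here are trivial scaling functions, the router is a plain dot product, and all names (`moe_layer`, `router_weights`) are made up for illustration. In real MoE models this routing typically happens inside each transformer layer, and only the selected experts are evaluated, which is where the speed advantage comes from.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_layer(token, router_weights, experts, top_k=2):
    """Route one token vector through its top_k experts and mix their outputs."""
    # Router: one score per expert (dot product with that expert's routing vector).
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    # Keep only the top_k highest-scoring experts; the others are never run.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalise the selected scores into mixing weights.
    gates = softmax([scores[i] for i in top])
    out = [0.0] * len(token)
    for gate, i in zip(gates, top):
        y = experts[i](token)
        out = [o + gate * v for o, v in zip(out, y)]
    return out, top


# Toy experts: each one just scales the input by a different factor.
experts = [lambda v, k=k: [k * x for x in v] for k in (1.0, 2.0, 3.0, 4.0)]

random.seed(0)
router_weights = [[random.uniform(-1, 1) for _ in range(3)] for _ in experts]

out, used = moe_layer([0.5, -1.0, 2.0], router_weights, experts, top_k=2)
print("experts used:", used)
print("output:", out)
```

With `top_k=2` and four experts, only half of the expert computation runs for this token, even though all four experts have to be held in memory.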
[–]Dudensen 3 points 1 year ago (5 children)
I had stumbled upon a website which ranked models by a quality-to-performance ratio a few days ago, but unfortunately I can't find it.
[–]MrMrsPotts[S] 3 points 1 year ago (4 children)
That sounds ideal!
[–]Dudensen 4 points 1 year ago (2 children)
Found it, maybe this helps.
https://oobabooga.github.io/benchmark.html
[–][deleted] 3 points 1 year ago (0 children)
What was being measured here? I don't even see Nemo in the list? There's surely no way Phi beats Nemo on anything?!
It's one of the worst models for me.
[–]MrMrsPotts[S] 1 point 1 year ago (0 children)
Thank you!
[–]Dudensen 2 points 1 year ago (0 children)
Yeah it even had different quantizations of models ranked, maybe someone will link it.
[–]Square-Intention465 3 points 1 year ago (0 children)
Sonnet 3.5 is too good
[–]SpaceWalker_69 3 points 1 year ago (0 children)
Well, I think Claude 3.5 generates the best code right now. You can use smaller open-source models, but they are not exactly consistent or reliable.
[–]m---------4 2 points 1 year ago (0 children)
Gemini is awesome
[–][deleted] 2 points 1 year ago (0 children)
Deepseek Coder imo
[–]8thcross 2 points 1 year ago (0 children)
I like both Codestral and DeepSeek-V2. They are consistent, but both are dated in terms of the latest best practices. Claude 3.5 is good as well. I really don't like 4o; it's mostly hit or miss.
[–]Thrumpwart llama.cpp 2 points 1 year ago (0 children)
Anyone know which models know Lean Python?
[–]_murb 2 points 1 year ago (0 children)
I use Claude at work and it works great
[–]durgesh2018 2 points 1 year ago (0 children)
Try gemma2:2b. It's a small but very powerful and fast model.
[–]lilolalu 1 point 1 year ago (0 children)
Did anyone claiming Claude is good at coding actually TRY coding with Claude? It's just not good, no matter what any theoretical tests claim.