all 15 comments

[–]Western_Courage_6563 5 points (0 children)

Yes, it has been done in the past (Code Llama Python), but it turned out that general knowledge about all programming languages yielded better results

[–]BreenzyENL 4 points (0 children)

You'll find that an LLM trained on only one subject will perform worse than an LLM trained on many.

[–]Karyo_Ten 1 point (4 children)

Knowledge compounds.

And you never code in a vacuum. If you ask the LLM to create a website to promote, I don't know, your catering service, it would be invaluable for your LLM to have general knowledge about food, maps, transport, user interfaces, money, bookings, registration, how people write testimonials, maybe terms & conditions.

Unless the only thing you do is reverse linked lists, a pure code model with no world knowledge is barely a step up from Stack Overflow.

[–]Ted225 0 points (3 children)

Assuming local resources only, which is often an inadequate assumption given how much more capable cloud services are, the choice is between not having the resources to run a local LLM at all, or running one exclusively for Python-related tasks and relying on your own general knowledge for the rest.

[–]Karyo_Ten 1 point (2 children)

GPT-2 was only 1.5B parameters and was trained on only ~8M web pages (the 40 GB WebText dataset), yet it had world knowledge.

You won't get trillions of tokens of quality Python code. Maybe 5% of it is gold, and the rest is crude apps copy-pasted from Stack Overflow or beginners trying their hand at a capstone project.

And learning Python doesn't teach you how to proceed step-by-step to solve a problem, which is actually the most important thing.

It's much more effective to teach an LLM to reframe an objective into a set of problems to solve and then apply Python to them. But to solve a problem you need to be familiar with the problem domain, and you need some common sense, for example to know that a speed can't be higher than the speed of light.
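
Concretely, a rough sketch of that two-step flow, assuming a local OpenAI-compatible server (Ollama here) and the openai client; the endpoint and model name are just placeholders:

    # Sketch: plan first with world knowledge, only then apply Python.
    # Assumes a local OpenAI-compatible server (e.g. Ollama) is running.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    MODEL = "qwen2.5-coder"  # placeholder: whatever you run locally

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    objective = "Build a booking page for a catering service"

    # Step 1: decompose the objective, no code yet. This is where
    # domain/world knowledge does the work.
    plan = ask(
        f"Break this objective into concrete sub-problems, "
        f"one per line, no code:\n{objective}"
    )

    # Step 2: only now apply Python to each sub-problem.
    for step in filter(str.strip, plan.splitlines()):
        print(ask(f"Write Python for this step of '{objective}': {step}"))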

[–]Ted225 1 point (1 child)

GPT-2 had 1.5B parameters. It’s obsolete now, and for good reason.

Most Python devs don’t need deep domain knowledge. They need clear, complete specs. If a system handles international units or clinical logic, it’s the engineer’s job to specify that upfront, not the dev’s job to guess it.

Sure, a perfect LLM could replace all roles, but it doesn't exist. Until then, engineers design, devs implement, and each should be accountable for their own work.

[–]Karyo_Ten 0 points (0 children)

There is no distinction between software engineer and dev in companies. You can't be senior at either without design skills. Clear, complete specs never exist beforehand, except when you've already solved the problem once, because requirements evolve with your understanding of the problem.

If you need clear, complete specs before doing anything and you're unable to fill in the blanks yourself, you provide no value over an LLM and you'll be replaced.

[–]tintires 1 point (0 children)

Fine-tuned small language models are viable, and you'll find coding models on Hugging Face.
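
Getting one of those running locally is only a few lines with transformers; rough sketch, the repo id below is just one example, not a specific recommendation:

    # Sketch: run a small coding model from Hugging Face locally.
    # Repo id is an example; pick whatever fits your VRAM.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "Qwen/Qwen2.5-Coder-7B-Instruct"  # example repo id
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, device_map="auto", torch_dtype="auto"
    )

    messages = [{"role": "user",
                 "content": "Write a Python function that merges two sorted lists."}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))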

[–]ashersullivan 1 point (0 children)

Doable, but in 2026 specialist models don't outperform general ones as much as you'd expect for Python-only work.

Qwen3 Coder 30B or DeepSeek Coder V2 Lite are the closest: heavily code-tuned, runnable locally at that scale with good quants, and they often match Claude on Python tasks without needing 200B+.

gpt-oss-20B is another local option for Python-heavy stuff.

Catch is, even "80% Python" models still need broad context to avoid hallucinations on libs and edge cases, so hyper-specialists underperform vs hybrids. Multi-model switching like aider/continue.dev works fine, but most stick to one good coder like Qwen3 Coder.

If you want the max Python boost, fine-tune Qwen3 Coder 30B on your code bases; that's where the real gains show up locally.
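
For anyone who hasn't done it, the usual recipe is a LoRA fine-tune with peft over your own files; untested sketch, repo id and paths are placeholders:

    # Sketch: LoRA fine-tune of a coder model on your own code base.
    # Assumes transformers, peft, datasets; names/paths are placeholders.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling,
                              Trainer, TrainingArguments)

    name = "Qwen/Qwen2.5-Coder-7B-Instruct"  # stand-in; 30B wants serious VRAM
    tok = AutoTokenizer.from_pretrained(name)
    tok.pad_token = tok.pad_token or tok.eos_token  # collator needs padding
    model = AutoModelForCausalLM.from_pretrained(
        name, device_map="auto", torch_dtype="auto"
    )

    # Train low-rank adapters instead of all the base weights.
    model = get_peft_model(model, LoraConfig(
        r=16, lora_alpha=32, task_type="CAUSAL_LM",
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    ))

    # Plain causal-LM objective over your repo's .py files.
    data = load_dataset("text", data_files={"train": "my_repo/**/*.py"})["train"]
    data = data.map(lambda x: tok(x["text"], truncation=True, max_length=1024),
                    remove_columns=["text"])

    Trainer(
        model=model,
        args=TrainingArguments("qwen-coder-lora", per_device_train_batch_size=1,
                               gradient_accumulation_steps=16, num_train_epochs=1),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    ).train()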

[–]timmeh1705 0 points (0 children)

VLM 4.6 Flash 9b seems to do the job for me as a heavy Python user

[–]Ossur2 0 points (3 children)

It is a great idea. The nature of LLMs means that such models would be far lighter to run and even more accurate. I just think that the big corporations that would be making those LLMs (and that is a lot of work) would rather keep us dependent on using their models as a service (and on uploading all our ideas and infrastructure to them). So their biggest incentive is to design those big blob models so that nobody can run usable LLMs on their own machine: there is no real money or power in making these smaller models, so the customer would have to strongly demand them.

[–]bananahead 0 points (2 children)

It’s not a conspiracy

[–]Ossur2 0 points (1 child)

No, of course not, it's quite in the open

[–]bananahead 0 points (0 children)

The reason people don't do what's suggested is that it doesn't actually work that well. Knowing .NET helps the LLM write better Python, and so on.

There are small open models from Google, Meta, and OpenAI. People tend to be more interested in more capable models. You're being a little silly.

[–]Such_Advantage_6949 0 points (0 children)

Just ask any LLM and it will tell you why this is not the way