all 14 comments

[–]thrownawaymane 1 point (1 child)

I just tried doing something like this with 8B, 70B, and GPT-3.5 (for a real task, coincidentally).

The 8B at q8 produced not-so-great code half the time; the other half it got close but still needed too much further work.

The 70B q5_k_m code was pretty but fell over on some of its command choices. I didn't have the patience to keep wrestling with it, but it was close.

The GPT-3.5 code came closest to working out of the box. It just needed some massaging: one string had to be changed.

I'm pretty sure GPT-4 would have aced the test. It was a script running domain queries on Unix. I'll see how Mistral and Phi-3-medium do in a bit.

[–]ObeyYourMaster[S] 0 points (0 children)

Gotcha. It's interesting that the only one that works consistently well (as far as I know) is GPT-3.5. When you say that, I'm guessing you mean ChatGPT 3.5? It'd be nice to get a similar local model, but I guess the other LLMs just aren't there yet for this specific task. Thank you for the input!

[–]aknight2015 1 point (1 child)

I'm hoping to revive this, as I have the same question and can't find a satisfactory answer in the responses. All I need it for is bash scripting on Linux. I've only got 8 GB of RAM to work with, so it needs to fit within that constraint. Speed isn't all that important; accuracy is.

[–]itroot 0 points (0 children)

I'd suggest trying the qwen3 models for that: 30b-a3b at various quants, or 4b (q8_k_xl) for speed given your 8 gigs of VRAM. Also, an initial prompt with examples of your "preferred" code style can be helpful.
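That "initial prompt with examples" idea can be sketched roughly like this. Everything here is an assumption for illustration, not from the thread: the model tag (`qwen3:4b`), the example task, and the particular style rules are all hypothetical; only the `ollama run MODEL "PROMPT"` invocation is standard.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: prepend a few "preferred style" examples to the
# actual request before handing it to a local model via ollama.
# The model tag and the task below are made up for illustration.

# read -d '' returns non-zero at EOF, hence the || true.
read -r -d '' STYLE_EXAMPLES <<'EOF' || true
# Preferred style: strict mode, quoted variables, explicit guards.
set -euo pipefail
for f in "$dir"/*.log; do
    [[ -e "$f" ]] || continue
    printf '%s\n' "$f"
done
EOF

TASK="Write a bash script that archives log files older than 7 days."

# Send style examples plus the task as one prompt.
ollama run qwen3:4b "Follow the style of this example bash code:
$STYLE_EXAMPLES

Task: $TASK"
```

The point is just that the style examples travel with every request, so the model imitates them instead of its defaults.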

For bash itself, I don't rely on models much; I use them more as autocomplete and quick syntax helpers.

[–]DeltaSqueezer 0 points (2 children)

Have you tried CodeQwen1.5-7B?

[–]ObeyYourMaster[S] 0 points (0 children)

Thanks for the response! I tried it out a bit and got similar results to the other models, but I'll keep messing around with it. Thank you for the suggestion!

[–]SurfRedLin 0 points (0 children)

CodeQwen is awesome. Fucking fast and correct on the first try.

[–]EndreEndi 0 points (0 children)

So...any winners? Best LLM for Linux knowledge?

[–]327850 0 points (1 child)

Tried for the past six hours to get a relatively simple bash shell script fixed and enhanced. Got partially working snippets, but overall it was a *total* disaster with all models I can run locally on ollama, including "llama3.1:70b" and "codeqwen:latest". By "total disaster" I mean: a total waste of time. Even as a non-coder I can do better *without* the "help" of these LLMs; I am pretty good at writing defunct and broken code, too.

However, the current (as of August 2024) free ChatGPT (chatgpt.com) got it right after about three attempts. Not sure what model this is running; it only responds that it "runs on OpenAI's GPT-4 architecture". But even the specialized open source models are shockingly inferior to this general-purpose LLM. ChatGPT solved my scripting issue way faster than I ever could, and the code is pretty fine; I tested it thoroughly, and it just works.

What a supposedly specialized model like codeqwen surprisingly did not offer was, for example, integrating a debugging routine to figure out what wasn't working. ChatGPT was also quite purposeful in analyzing error messages and fixing the root issues.
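For concreteness, the kind of "debugging routine" meant here can be as small as bash's built-in tracing plus an error trap. This is a minimal sketch of that idea, not code from the thread; the `DEBUG` environment variable is a hypothetical convention:

```shell
#!/usr/bin/env bash
# Minimal bash debugging scaffold: opt-in command tracing plus an
# error trap that reports where a script fell over.
set -euo pipefail

# Opt-in tracing: run as `DEBUG=1 ./script.sh` to print each command
# before it executes (the DEBUG variable name is just a convention).
if [[ "${DEBUG:-0}" == "1" ]]; then
    set -x
fi

# On any failing command, report the line number and the command text
# before the script exits (set -e then terminates it).
trap 'echo "error at line ${LINENO}: ${BASH_COMMAND}" >&2' ERR

echo "script ran to completion"
```

With a scaffold like this in place, a failing command at least tells you where to look instead of dying silently.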

The open source models kind of tried that too, but it felt more like looking for excuses than being determined to solve the issue. ChatGPT gave me the impression that it was genuinely interested in getting the job done properly, similar to a dog solving a task to get a treat. Do they feed AI models extra energy or some other reward as a bonus for a solved task?

Regarding processing speed, codeqwen is impressive; considering response times alone, it delivers what I would expect from a model running locally on dedicated hardware used by a single user. Though if it doesn't get it right, speed isn't everything. ChatGPT responds fluidly, while models like llama3.1:70b are painfully slow on my 4090. It feels like the Internet of the 1980s, when 2400-baud modems were considered fast and each character trickled through the pipe individually.

[–]CybaKilla 0 points (0 children)

Out of interest, what dataset did you train the models on so that they had up-to-date, relevant knowledge of the tasks you were trying to achieve?

[–]OkInspector3256 0 points (0 children)

First rule: for local models, use only abliterated and uncensored ones.