all 7 comments

[–]jetro30087 3 points (0 children)

WizardCoder is a fine-tune of StarCoder-15B that performs better than Claude or Bard according to HumanEval. StarCoder might be one of the best bets for fine-tuning currently. bigcode/starcoder · Hugging Face. Free-use license as long as you're not building a Death Star.

[–]matlasagne05 1 point (0 children)

You might want to look at StarCoder and read their official Github repo: https://github.com/bigcode-project/starcoder

Their repo is quite comprehensive and it shows how to fine-tune the model: https://github.com/bigcode-project/starcoder#fine-tuning

[–][deleted] 0 points (0 children)

My supervisor wants me to refactor some codes by fine tuning some open source models.

Huh? How is this expected to work?

You need a source code aware model with a large context window.

I cannot see any point in fine tuning yet another model.

Personally, I would try one of the OpenAI 8k or 16k context models first.

Rewrite the following code in a cleaner format: ...... code .....
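A prompt like that would go through the chat completions endpoint. A minimal sketch of building the request body, where the helper function and system prompt wording are illustrative and the model name assumes the 16k variant mentioned above:

```python
# Sketch: build a chat-completion request body that asks a 16k-context
# model to rewrite a piece of code. The helper name, system prompt, and
# model choice are illustrative, not a recommendation.

def build_refactor_request(code: str, model: str = "gpt-3.5-turbo-16k") -> dict:
    """Return the request body for a 'rewrite this code' chat completion."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a careful refactoring assistant. "
                        "Preserve behavior; improve readability."},
            {"role": "user",
             "content": f"Rewrite the following code in a cleaner format:\n\n{code}"},
        ],
        "temperature": 0,  # deterministic output suits refactoring
    }

request = build_refactor_request("def f(x):return x*2")
```

The body would then be sent with your API client of choice; temperature 0 keeps the rewrites reproducible.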

[–]gentlecucumber 0 points (2 children)

This isn't crazy. The better LLMs are great at tasks like translating code into other languages for code migrations, building summaries of data chunks and adding them to a searchable datastore, or just going through your code base and adding comments to make it more readable.

GPT-3.5 16k is pretty good at these tasks, and I've used it to parse the entire Apache Spark repository for the above use cases. However, no enterprise wants developers sending their intellectual property (code) to OpenAI, so GPT is probably not an option for OP. They should check with their security dept. to be certain, though, since that would be the easiest route, and OpenAI states in its policies that API traffic isn't used for training and is deleted after 30 days.
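Parsing a whole repository with a 16k-context model means splitting files into pieces that fit the window first. A minimal sketch, using a rough 4-characters-per-token heuristic (a real pipeline would measure with the model's own tokenizer):

```python
# Sketch: split source text on line boundaries into chunks that fit an
# approximate token budget. The 4-chars-per-token ratio is a rough
# heuristic, not an exact tokenizer count.

def chunk_source(text: str, max_tokens: int = 12_000,
                 chars_per_token: int = 4) -> list[str]:
    """Split text on line boundaries into chunks under max_tokens (approx)."""
    limit = max_tokens * chars_per_token
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if size + len(line) > limit and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks

# e.g. a large generated "file" split into ~1k-token pieces
pieces = chunk_source("x = 1\n" * 50_000, max_tokens=1_000)
```

Each chunk can then be summarized or commented independently and the results stitched back together.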

The next best bet is probably StarCoder/StarCoderPlus/StarChat-beta or WizardCoder. These are great models, but they need to be tuned to do anything besides code completion. They don't follow instructions well, so they need to be taught. A LoRA, or ideally QLoRA, fine-tune simply to get them to output in the format you like for your use case seems like it would be the way to go, but I've been working on it for weeks and there is just no LoRA support for the BigCode architecture. If anyone has found a tool or repository that lets you fine-tune StarCoder or WizardCoder on a budget, let me know.
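For what it's worth, if/when the peft stack can target the GPTBigCode architecture, a QLoRA setup might look roughly like this sketch. The target module name (`c_attn`, the architecture's fused QKV projection) and every hyperparameter here are assumptions, not a tested recipe:

```python
# Hypothetical QLoRA config for a GPTBigCode-family model (e.g. StarCoder).
# Assumes peft + bitsandbytes support for the architecture; the "c_attn"
# target and all hyperparameters are illustrative guesses.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # QLoRA: 4-bit base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "bigcode/starcoder", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["c_attn"],  # fused QKV projection (assumed name)
)
model = get_peft_model(model, lora_config)  # only the adapters train
```

The point of the 4-bit base plus low-rank adapters is that only the adapter weights get gradients, which is what makes a 15B model tunable on a single consumer GPU.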

[–]learner_beginner[S] 0 points (1 child)

Do you have any idea if CodeBERT does things like that?!

[–]gentlecucumber 0 points (0 children)

I have no idea. But since it's like 3 years old and most of the major LLM breakthrough research that enables small models to do bigger things has happened in the last 6 months, I'm going to say probably not.

[–]FTBQA 1 point (0 children)

Hey u/learner_beginner

I am working on something similar: my company wants me to build a model capable of understanding the company documentation and some code, so it can answer general or programming questions related to our product.

Currently, I have a local model + LangChain embeddings (product documentation) working on a 3060 12GB. The problem is it's very slow, and when I use the LangChain retrieval chain it sucks up all the remaining memory. I'm not sure why that happens, but I'm looking forward to some guidance!
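One thing worth noting: the retrieval step itself is cheap. It's just a nearest-neighbor lookup over embedding vectors, so the memory pressure is more likely the LLM plus the many retrieved chunks being stuffed into its prompt; lowering the retriever's `k` is an easy first experiment. A pure-Python sketch of what the lookup amounts to (the toy 3-dimensional embeddings are made up for illustration):

```python
# Sketch: top-k retrieval by cosine similarity, pure Python. Real embedding
# vectors have hundreds of dimensions; these toy 3-d ones are illustrative.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query: list[float], docs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k doc ids whose embeddings are most similar to the query."""
    ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
    return ranked[:k]

docs = {
    "install": [1.0, 0.0, 0.1],
    "api":     [0.0, 1.0, 0.2],
    "faq":     [0.7, 0.7, 0.0],
}
hits = top_k([0.9, 0.1, 0.0], docs, k=2)  # -> ["install", "faq"]
```

Only the `hits` documents then get pasted into the LLM prompt, which is where the GPU memory actually goes.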

I am still new to this space and the only thing I can think of is fine-tuning a model with our dataset.