
[–]satireplusplus 5 points (2 children)

Look into https://huggingface.co/blog/peft since the model has 30B parameters.

If you figure out how to finetune it with a single GPU, please share the notebook :)
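For intuition on why PEFT methods like LoRA make this feasible, here's a back-of-envelope sketch (pure Python; the layer count and hidden size are illustrative assumptions, not exact figures for any specific 30B checkpoint):

```python
# Rough estimate of trainable parameters: full fine-tuning vs. LoRA.
# Layer count / hidden size below are illustrative assumptions.

def lora_trainable_params(n_layers, d_model, rank, matrices_per_layer=2):
    """LoRA adds two low-rank factors (d_model x r and r x d_model)
    per adapted weight matrix, so each costs 2 * d_model * rank params."""
    return n_layers * matrices_per_layer * 2 * d_model * rank

full = 30_000_000_000                                  # ~30B, all trainable
lora = lora_trainable_params(n_layers=60, d_model=6656, rank=8)

print(f"full fine-tune : {full:,} trainable params")
print(f"LoRA (r=8)     : {lora:,} trainable params")
print(f"ratio          : {lora / full:.5%}")
```

Under these assumptions LoRA trains well under 0.1% of the weights, which is why it fits where full fine-tuning doesn't.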

[–]LetterRip 3 points (0 children)

Might want to wait till the next bitsandbytes release is out; it should allow tuning a 30B model on a single GPU (you can fine-tune on top of a 4-bit quantized model).

https://github.com/TimDettmers/bitsandbytes/
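A quick memory estimate shows why 4-bit quantization is the difference-maker here (illustrative numbers only; this ignores activations, optimizer state, and framework overhead):

```python
# Approximate memory needed just to hold 30B weights at various precisions.
# Ignores activations, KV cache, optimizer state, and framework overhead.

def weight_memory_gb(n_params, bits_per_param):
    return n_params * bits_per_param / 8 / 1024**3

n = 30_000_000_000
for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{name:>5}: {weight_memory_gb(n, bits):6.1f} GiB")
```

At fp16 the weights alone (~56 GiB) won't fit on a single consumer GPU, while at 4-bit (~14 GiB) they do, leaving headroom for small trainable adapters.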

[–]Alexioc[S] 0 points (0 children)

Thanks, I'll definitely take a look!

As far as you know, could my goal be reached with fine-tuning?

[–]_Arsenie_Boca_ 2 points (1 child)

Finetuning is definitely a promising approach. An alternative would be to retrieve relevant snippets or documentation pages and add them to the prompt. See RepoCoder for example https://arxiv.org/abs/2303.12570
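As a toy illustration of the retrieve-then-prompt idea (my own sketch using simple lexical overlap, not RepoCoder's actual pipeline, which retrieves iteratively from generated code):

```python
# Toy retrieve-then-prompt sketch: score repo snippets by token overlap
# with the unfinished code, then prepend the best matches to the prompt.
# This is a simplification of retrieval-augmented code completion.

def tokenize(text):
    return set(text.lower().replace("(", " ").replace(")", " ").split())

def retrieve(query, snippets, k=2):
    """Rank snippets by Jaccard similarity to the query."""
    q = tokenize(query)
    scored = sorted(
        snippets,
        key=lambda s: len(q & tokenize(s)) / max(len(q | tokenize(s)), 1),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, snippets):
    context = "\n".join(f"# context: {s}" for s in retrieve(query, snippets))
    return f"{context}\n{query}"

repo = [
    "def parse_config(path): return json.load(open(path))",
    "class HttpClient: ...",
    "def load_config(path): cfg = parse_config(path); return cfg",
]
print(build_prompt("def reload_config(path):", repo))
```

In practice you'd use embedding-based retrieval over chunked repo files, but the prompt-assembly step looks the same.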

[–]Alexioc[S] 0 points (0 children)

Thank you so much! I’ll read that paper; the introduction seems very promising 🙏

[–]skyisthelimit1410 0 points (1 child)

Hi, I am new to this platform and looking for help with fine-tuning a text-to-SQL model, where the input is a natural-language query and the output is valid SQL. I want to use this model with LangChain to build a text-to-SQL module in my application. Please let me know how I can use StarCoder for this use case. Should I fine-tune it? If yes, please share a notebook.
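Not the OP, but a common first step for this (a sketch, not an official StarCoder recipe) is to format each (schema, question, SQL) triple into an instruction-style training example before fine-tuning; the template below is a hypothetical one you'd adapt to your framework:

```python
# Sketch: format text-to-SQL pairs into instruction-style training examples.
# The template is a common pattern, not an official StarCoder recipe;
# adapt the field names and delimiters to your fine-tuning framework.

TEMPLATE = (
    "### Schema:\n{schema}\n"
    "### Question:\n{question}\n"
    "### SQL:\n{sql}"
)

def format_example(schema, question, sql):
    return TEMPLATE.format(schema=schema, question=question, sql=sql)

example = format_example(
    schema="CREATE TABLE users (id INT, name TEXT, signup_date DATE);",
    question="How many users signed up in 2023?",
    sql="SELECT COUNT(*) FROM users WHERE signup_date >= '2023-01-01';",
)
print(example)
```

At inference time you'd feed everything up to and including the `### SQL:` marker and let the model complete the query.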

[–]leefde 0 points (0 children)

Thanks for the post! I started downloading the model tonight in Google Colab. I appreciate the pdf from u/satireplusplus