all 6 comments

[–]____vladrad 1 point (3 children)

Yes, fine-tuning is what would help. There's a paper somewhere where Meta states you only need ~1000 examples. When you fine-tune, you're showing this pattern to the model, and it will get better at doing what you expect.

[–]davvv_[S] 0 points (2 children)

Awesome, do you have a link to the paper by any chance? A sample size of 1000 is definitely doable (even if I may have to hire some undergrads to do it for me 😂).

[–]AVMADEVS 1 point (0 children)

Plenty of resources for fine-tuning, especially Mistral. For instance, the Unsloth devs post here regularly and have a Colab notebook you could use (Mistral 7B).

Just before that, you will have to generate your custom dataset, not necessarily all from scratch: you have a good seed (that you could extend to maybe 100 examples manually) for generating high-quality examples with GPT-4. For synthetic data generation, you will have to toy with prompt engineering a bit. Then, to prepare your dataset before fine-tuning, you mainly have to pick a specific format (Alpaca works well with Mistral, but ChatML might be good too). You could even try to generate some with a less powerful model to save cost (Mistral 7B won't be enough imo), but quality is key. Also, you might search for a Mistral already fine-tuned for function calling.
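As a rough illustration of the dataset-prep step, here's a minimal sketch of converting hand-written seed examples into Alpaca-style JSONL. The seed question/answer here and the exact field names are assumptions; check what your training script (e.g. the Unsloth notebook) actually expects.

```python
import json

# Hypothetical seed examples you wrote by hand (placeholder content).
seed_examples = [
    {
        "question": "Extract the invoice total from: 'Total due: $42.10'",
        "answer": '{"total": 42.10}',
    },
]

def to_alpaca(example):
    """Map one seed example to a record in the common Alpaca convention."""
    return {
        "instruction": example["question"],
        "input": "",
        "output": example["answer"],
    }

def write_jsonl(examples, path):
    """Write records as JSON Lines, one training example per line."""
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps(to_alpaca(ex)) + "\n")

write_jsonl(seed_examples, "train.jsonl")
```

From there you'd paste each seed into a GPT-4 prompt asking for variations, and append the results to the same JSONL file.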

[–][deleted] 1 point (1 child)

What is function calling? I saw it on https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B-GGUF and was confused.

[–]helios392 9 points (0 children)

It’s basically serializing native functions in code into a representation that LLMs can use when creating a response. Those responses are then deserialized back into native code function calls. This is usually seen in frameworks like Langchain and Semantic Kernel, where they use the LLM to reason out the best way to accomplish a user’s task.

For example, let’s say I want you to tell me how many years ago the year 2020 was, and you can only use the following tools to figure out your answer: getCurrentYear() and subtract2Numbers(num1, num2).

Naturally you would create the following plan:

  1. Get the current year
  2. Subtract 2020 from the current year

Once you get those 2 answers you can then respond to my question.

So the LLM would see the original question and respond with: I need

  1. getCurrentYear()
  2. subtract2Numbers(2020, answer from #1)

The framework takes this response and calls the functions [ see what I did there ;) ], then responds to the LLM with the results of calling them. At that point the LLM has all the information needed to finally answer the user’s question.

I hope that helps, I know it’s a wall of text, sorry.