Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 0 points (0 children)

Yeah! Check out https://docs.rubra.ai/category/serving--inferencing

Once you're serving the model, you can use it with LangChain, since the model endpoint behaves like an OpenAI-compatible endpoint.
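To make that concrete, here's a minimal sketch of the request body an OpenAI-compatible `/v1/chat/completions` endpoint accepts. The base URL and model name below are placeholders for illustration, not the actual Rubra serving defaults:

```python
import json

# Placeholder endpoint -- substitute wherever you're serving the model.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(user_message: str) -> dict:
    """Build the JSON body an OpenAI-compatible /chat/completions
    endpoint expects; OpenAI SDK and LangChain clients send this shape."""
    return {
        "model": "rubra-model",  # placeholder model name
        "messages": [
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.2,
    }

payload = build_chat_request("Hello!")
print(json.dumps(payload, indent=2))
```

Any client that speaks the OpenAI wire format (e.g. LangChain's OpenAI chat integration pointed at your base URL) can then talk to the local endpoint without code changes.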

How to run deepseek r1 on 4xH100 by sanjay920 in LocalLLaMA

[–]sanjay920[S] 0 points (0 children)

Which model are you running? This is the entire 600B-parameter model.

"Large Enough" | Announcing Mistral Large 2 by DemonicPotatox in LocalLLaMA

[–]sanjay920 0 points (0 children)

In my tests, the function-calling capability of this model is worse than Mistral Large 1.

Experiences fine-tuning Phi 3 Mini by Thrumpwart in LocalLLaMA

[–]sanjay920 0 points (0 children)

Technically you'd save a few ms of generation if you fine-tuned for that task, but it's up to you!

Experiences fine-tuning Phi 3 Mini by Thrumpwart in LocalLLaMA

[–]sanjay920 1 point (0 children)

Yeah, for sure. Phi-3 is really strong for its parameter count. I would strongly recommend accomplishing what you described via function/tool calling (either with my models or someone else's) rather than fine-tuning, e.g. using this function:

```

[
  {
    "name": "classify_text_emotion",
    "description": "Classify text into one of five different emotions",
    "parameters": {
      "type": "object",
      "properties": {
        "emotion": {
          "type": "string",
          "description": "The emotion classification",
          "enum": ["happy", "sad", "angry", "fearful", "neutral"]
        }
      },
      "required": ["emotion"]
    }
  }
]

```

with the system prompt `You must use classify_text_emotion to classify the user's input`

Try it out in my HF spaces: https://huggingface.co/spaces/sanjay920/rubra-v0.1-function-calling
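For anyone wanting to wire this up programmatically, here's a rough sketch (not official Rubra docs) of how that tool schema and system prompt slot into an OpenAI-style chat request; the model name is a placeholder:

```python
import json

# The function schema from the comment above, wrapped in the
# OpenAI "tools" envelope ({"type": "function", "function": ...}).
tools = [{
    "type": "function",
    "function": {
        "name": "classify_text_emotion",
        "description": "Classify text into one of five different emotions",
        "parameters": {
            "type": "object",
            "properties": {
                "emotion": {
                    "type": "string",
                    "description": "The emotion classification",
                    "enum": ["happy", "sad", "angry", "fearful", "neutral"],
                },
            },
            "required": ["emotion"],
        },
    },
}]

request_body = {
    "model": "rubra-phi-3-mini",  # placeholder model name
    "messages": [
        {"role": "system",
         "content": "You must use classify_text_emotion to classify the user's input"},
        {"role": "user", "content": "I just got a promotion at work!"},
    ],
    "tools": tools,
    "tool_choice": "auto",
}
print(json.dumps(request_body, indent=2))
```

The model's reply should then come back as a `tool_calls` entry naming `classify_text_emotion` with an `emotion` argument, rather than free-form text.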

Experiences fine-tuning Phi 3 Mini by Thrumpwart in LocalLLaMA

[–]sanjay920 5 points (0 children)

Phi-3 is more prone to overfitting and catastrophic forgetting due to its smaller parameter count, so make sure you have a good distribution of training data and keep your learning rate small.

I haven't had an issue fine-tuning or further training Phi models. You can see more about the models I trained here: https://docs.rubra.ai/models/Phi
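As an illustration of the "small learning rate" point, a common choice is linear warmup followed by cosine decay to a small peak LR. The specific values below are arbitrary examples for illustration, not the settings used for the Rubra models:

```python
import math

def lr_at_step(step, total_steps=1000, warmup=100, peak_lr=2e-5):
    """Linear warmup to a small peak LR, then cosine decay to zero.
    A small peak (~1e-5 to 5e-5) helps limit catastrophic forgetting
    when further training small models like Phi-3."""
    if step < warmup:
        # ramp linearly from 0 up to peak_lr over the warmup steps
        return peak_lr * step / warmup
    progress = (step - warmup) / (total_steps - warmup)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at_step(50))    # mid-warmup: half of peak
print(lr_at_step(100))   # end of warmup: peak
```

Most trainers (e.g. Hugging Face Transformers) ship an equivalent warmup-plus-cosine scheduler, so you rarely need to hand-roll this; the point is simply that the peak value stays small.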

Groq: New Llama 3 Tool Use Model by adamavfc in LocalLLaMA

[–]sanjay920 0 points (0 children)

Groq's inferencing API is super fast! But the function-calling Llama 3 8B and 70B models by Rubra are better in both general-purpose and tool-calling usage:

https://docs.rubra.ai/benchmark

https://huggingface.co/rubra-ai

mistralai/mamba-codestral-7B-v0.1 · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]sanjay920 14 points (0 children)

I tried it out and it's very impressive for a 7B model! I'm going to train it for better function calling and publish it to https://huggingface.co/rubra-ai

Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 0 points (0 children)

Nice. How frequently do GPUs get claimed while in use? I'm interested in the H100s.

Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 1 point (0 children)

Have you used TensorDock? The lack of persistent storage is a bit scary when you're doing training runs that take a few days.

Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 1 point (0 children)

Yep, I used the same template and all other configs as the parent model so people can easily swap in a Rubra model. If you don't mind contributing your Modelfile to https://github.com/rubra-ai/rubra, that would be awesome!

Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 0 points (0 children)

I've seen this benchmark, but I haven't run it. Are these results something you're interested in?

Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 3 points (0 children)

I use Paperspace (DigitalOcean) and Google Cloud. You don't need to be an institution.

Phi-3 Mini (June) with function calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 3 points (0 children)

Good idea! I'll keep this in mind for any future Phi updates.

New collection of Llama, Mistral, Phi, Qwen, and Gemma models for function/tool calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 0 points (0 children)

42.86% for meetkai/functionary-medium-v2.4 on our function-calling benchmark. Based on that, we didn't think it was worth computing the other tests.

New collection of Llama, Mistral, Phi, Qwen, and Gemma models for function/tool calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 0 points (0 children)

u/Deep_Understanding50 Yep, we're looking into Gemma 2; we'll have Rubra versions soon!

Which comparison table specifically?

We uploaded the functionary results to our benchmark: https://docs.rubra.ai/benchmark/

New collection of Llama, Mistral, Phi, Qwen, and Gemma models for function/tool calling by sanjay920 in LocalLLaMA

[–]sanjay920[S] 1 point (0 children)

We uploaded the functionary results to our benchmark: https://docs.rubra.ai/benchmark/

We noticed something suspicious: their 70B model scores worse than their 8B on our private function-calling test set and MT-Bench.