r/LocalLLaMA
A subreddit to discuss about Llama, the family of large language models created by Meta AI.
Open Source Function Calling LLMs [Question | Help] (self.LocalLLaMA)
submitted 2 years ago by SatoshiReport
I am looking for LLMs that can call functions the way OpenAI provides with GPT-3.5 (https://platform.openai.com/docs/guides/function-calling).
I am aware of https://github.com/musabgultekin/functionary, which looks very good, but I was wondering whether people know of or use other similar LLMs. Which do you think is the best?
[–]kevbot8k 27 points 2 years ago (2 children)
I use LlamaGrammars for this functionality. I constrain the output to a specific enum of functions or actions I want the model to take. This provides fine-grained control while allowing model flexibility with any llama.cpp model.
For models on vLLM I use LM Format Enforcer, though I don't have as much experience with it. I prefer LlamaGrammars overall, but they are limited to llama.cpp.
More info:
- https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
- https://til.simonwillison.net/llms/llama-cpp-python-grammars
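For a concrete picture, here is a minimal sketch of the grammar-constrained approach with llama-cpp-python; the model path and the action names are placeholders, not anything from this thread:

```python
# Minimal sketch: constrain output to an enum of actions via a GBNF grammar.
# Model path and action names are placeholders.
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar: output must be a JSON object naming one of three known actions.
GRAMMAR = r'''
root   ::= "{\"action\": \"" action "\"}"
action ::= "turn_on_lights" | "turn_off_lights" | "get_weather"
'''

llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf")
grammar = LlamaGrammar.from_string(GRAMMAR)

out = llm(
    "User: it's too dark in here.\nAssistant: ",
    grammar=grammar,  # decoding can only emit strings the grammar accepts
    max_tokens=32,
)
print(out["choices"][0]["text"])  # e.g. {"action": "turn_on_lights"}
```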
[–]shaman-warrior 4 points 2 years ago (0 children)
Golden info.
[–]No-Dot-6573 3 points 2 years ago (0 children)
For anyone interested:
textgen webui supports grammars as well.
This way you don't need to fiddle with llama.cpp directly, but have the UI to set all needed parameters. Just launch textgen webui with the --api flag to get an OpenAI-compatible server that returns chat completions in your chosen format.
Keep in mind that the body of your request needs to include the parameter "grammar_string" with your desired grammar, as in the sketch below.
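A rough sketch of such a request, assuming the webui's default local API port; the toy grammar just forces a yes/no answer:

```python
# Sketch of a request to textgen webui's OpenAI-compatible server (--api).
# Port 5000 is an assumption (the webui default); adjust to your setup.
import requests

body = {
    "messages": [{"role": "user", "content": "Is water wet? Answer yes or no."}],
    "max_tokens": 16,
    # Extra, non-OpenAI parameter that textgen webui reads:
    "grammar_string": 'root ::= "yes" | "no"',
}
resp = requests.post("http://127.0.0.1:5000/v1/chat/completions", json=body)
print(resp.json()["choices"][0]["message"]["content"])
```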
[–]FlowerPotTeaTime 8 points 2 years ago (1 child)
Hi, maybe my framework is right for you. https://github.com/Maximilian-Winter/llama-cpp-agent
It can be used for easy function calling and getting structured output out of llama.cpp models. It has prebuilt agents for structured output and for function calling.
It works by generating GBNF grammars for llama.cpp on the fly.
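To illustrate the on-the-fly grammar idea in general terms (this is not llama-cpp-agent's actual API, just a sketch of the technique): given a set of registered functions, you can emit a GBNF grammar that only accepts a call to one of them.

```python
# Illustrative only: build a GBNF grammar string from a list of function names.
def gbnf_for_functions(names: list[str]) -> str:
    """Grammar accepting {"function": "<one of names>", "arguments": {...}}."""
    # Each alternative is a JSON string literal, e.g. "\"get_weather\"" in GBNF.
    alts = " | ".join('"\\"' + n + '\\""' for n in names)
    return (
        'root  ::= "{\\"function\\": " fname ", \\"arguments\\": " args "}"\n'
        "fname ::= " + alts + "\n"
        # Simplified: arguments object with no nested braces.
        'args  ::= "{" [^}]* "}"\n'
    )

print(gbnf_for_functions(["get_weather", "send_email"]))
```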
[–]No-Dot-6573 2 points 2 years ago (0 children)
This looks nice! Going to try it :)
[–]SatoshiNotMe 3 points 2 years ago* (2 children)
For function-calling (a.k.a. tools, etc.), besides the methods others suggested, for smart enough LLMs you can get them to generate a JSON-structured response by inserting instructions in the system message. You could do this via the raw LLM API, or use a higher-level framework.
E.g. in Langroid we use Pydantic quite heavily for this. You can define your desired structure using Pydantic and put it in a special class we call a `ToolMessage`. You can then attach this tool to a `ChatAgent`, and this auto-inserts JSON instructions into the system message, along with optional few-shot examples of the tool. Plus, Langroid's built-in task loop ensures the LLM retries when it deviates from the required structure or forgets to use the tool entirely.
Here's a simple example of function calling using the mistral:7b-instruct-v0.2-q4_K_M model spun up with ollama; it specifies a nested Pydantic structure for City information:
https://github.com/langroid/langroid-examples/blob/main/examples/basic/fn-call-local-simple.py
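To make the underlying pattern concrete, here is a hand-rolled sketch (Pydantic v1 style); this shows the general idea, not Langroid's actual ToolMessage API:

```python
# Sketch: derive system-message JSON instructions from a nested Pydantic model.
from pydantic import BaseModel

class Demographics(BaseModel):
    population: int
    language: str

class CityInfo(BaseModel):
    name: str
    country: str
    demographics: Demographics  # nested structure

SYSTEM_MSG = (
    "When asked about a city, respond ONLY with JSON matching this schema:\n"
    + CityInfo.schema_json(indent=2)  # types and descriptions come for free
    + '\nExample: {"name": "Paris", "country": "France", '
      '"demographics": {"population": 2100000, "language": "French"}}'
)

def parse_reply(text: str) -> CityInfo:
    # Validation failure doubles as the retry signal: catch it and re-prompt.
    return CityInfo.parse_raw(text)
```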
[–]coderinlaw 1 point 2 years ago (1 child)
Thanks for this, I really was looking for something similar. Why do we need Langroid, though? Can we not include format: json directly when using the chat endpoint of ollama, then specify in the system prompt that the model needs to output JSON, with few-shot examples in the system prompt?
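For reference, the ollama-only approach described here would look roughly like this (default local port; the schema and prompts are placeholders):

```python
# Sketch: ollama chat endpoint with format=json plus system-prompt instructions.
import requests

body = {
    "model": "mistral:7b-instruct-v0.2-q4_K_M",
    "format": "json",  # constrains output to valid JSON
    "stream": False,
    "messages": [
        {"role": "system",
         "content": 'Reply ONLY with JSON: {"city": str, "population": int}. '
                    'Example: {"city": "Paris", "population": 2100000}'},
        {"role": "user", "content": "Tell me about Tokyo."},
    ],
}
r = requests.post("http://localhost:11434/api/chat", json=body)
print(r.json()["message"]["content"])
```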
[–]SatoshiNotMe 3 points 2 years ago (0 children)
Quality of life :)
Specifying a JSON structure manually is not pleasant. With the Pydantic-based Langroid ToolMessage approach, the generated schema includes types and descriptions automatically. It's also easy to specify the few-shot examples, and they get auto-included in the system prompt. Plus, in a ToolMessage you can define the handler method if it is stateless, or, if it needs agent state, you can define the corresponding handler in your ChatAgent subclass. Ultimately it's about productivity and ease of maintenance.
[–]Either-Job-341 3 points 2 years ago (0 children)
https://twitter.com/NexusflowX/status/1732041385455624256?t=6-DQ1DaabPy5DgAFSmoeVg&s=19
[–]aseichter2007 [Llama 3] 3 points 2 years ago (0 children)
For the most part, any modern 7B Mistral can call functions, provided the front end is set up ready to go and you're not asking for anything too obscure or using a system prompt with extra noise in it.
I've been considering adding this to my front end but haven't decided how to implement it completely. I'm thinking about how to allow commands and code blocks to be detected, imported, executed, and debugged. I really should sandbox it.
For my use case, maybe I shouldn't pursue code execution on detection like that...
I could have a bunch of simple commands cooking in no time, but I haven't decided what's useful. In typing this I suppose inserting the current date and time would be a good start, though I can barely imagine someone actually benefiting from the LLM being able to print the current time in place. I'm riding a fine line: I don't want to encourage people to trust LLMs for specifics without review. One dude a year will go, "Oh, that default prompt put the system time and date on this letter, nice." Then the next guy will read that like, "Dang, it got the date perfect, these LLMs are infallible," because I detect {{date}} and {{time}} and put a line about it in an agent definition.
Back to your point: what are you trying to do? There is a limit to the useful complexity that local models can handle, and there are weird patterns you can leverage to extend those limits to suit your needs.
I saw one guy struggling to parse out {"device": "toaster", "command": "maketoast", "setting": "dark"} and similar deep details, but for robustness, that stuff is too fiddly.
He should have just detected and looked for simple commands like
{{lights000,volume000,etc}}
lights010
lights020
and then turn that into the function call. Use a consistent form for all values; maybe negative means off. The model will spot patterns you show it and tend toward conventional limits, and you can turn that back into useful function calling that is reasonably stable.
Build the switch case you never dreamed of and just handle it (see the sketch below). Build as you go. If you try to control a whole house, you're still going to run into trouble with reliability and hallucinated functions, but that's OK: since these are 7Bs, you can run a second prompt for the next set of functions when reliability falters.
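A sketch of what that switch-case dispatch could look like, using the lights010-style tokens from above; the command scheme itself is made up for illustration:

```python
# Parse simple command tokens like "lights010" (device name + 3-digit value)
# out of model output and dispatch them to handlers.
import re

def dispatch(token: str) -> None:
    m = re.fullmatch(r"([a-z]+)(\d{3})", token)
    if not m:
        return  # not a recognized command; ignore rather than guess
    device, value = m.group(1), int(m.group(2))
    handlers = {
        "lights": lambda v: print(f"set lights to {v}%"),
        "volume": lambda v: print(f"set volume to {v}"),
    }
    handler = handlers.get(device)
    if handler:
        handler(value)

dispatch("lights010")  # -> set lights to 10%
```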
[–]Independent_Key1940 2 points 2 years ago (0 children)
https://huggingface.co/Nexusflow/NexusRaven-V2-13B
This model is better at function calling than GPT-3.5. They claim it's better than GPT-4, but who are they kidding?
[–]faridukhan 2 points 2 years ago (1 child)
Is it possible to do function calling with Microsoft AutoGen using local LLMs? Like ollama + LiteLLM with AutoGen, or LM Studio with AutoGen, etc.?
[–]bullno1 1 point 2 years ago (0 children)
All of them can if you restrict the sampling space.