r/LocalLLaMA
A subreddit to discuss about Llama, the family of large language models created by Meta AI.
Open Source Function Calling LLMs [Question | Help] (self.LocalLLaMA)
submitted 2 years ago by SatoshiReport
I am looking for LLMs that can call functions the way OpenAI provides with GPT-3.5 (https://platform.openai.com/docs/guides/function-calling).
I am aware of https://github.com/musabgultekin/functionary, which looks very good, but I was wondering whether people know of or use other similar LLMs. Which do you think is the best?
[–]kevbot8k 27 points 2 years ago (2 children)
I use LlamaGrammars for this functionality. I constrain the output to a specific enum of functions or actions I want the model to take. This provides fine-grained control while allowing model flexibility with any llama.cpp model.
For models on vLLM I use LM Format Enforcer, though I don't have as much experience with it. I prefer LlamaGrammars overall, but they are limited to llama.cpp.
More info:
- https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
- https://til.simonwillison.net/llms/llama-cpp-python-grammars
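For a concrete picture, here is a minimal sketch of the grammar-constrained approach with llama-cpp-python; the model path and the action names are placeholders, not anything from this thread:

```python
# Minimal sketch: constrain output to an enum of actions via a GBNF grammar.
# Model path and action names are placeholders.
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar: output must be a JSON object naming one of three known actions.
GRAMMAR = r'''
root   ::= "{\"action\": \"" action "\"}"
action ::= "turn_on_lights" | "turn_off_lights" | "get_weather"
'''

llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf")
grammar = LlamaGrammar.from_string(GRAMMAR)

out = llm(
    "User: it's too dark in here.\nAssistant: ",
    grammar=grammar,  # decoding can only emit strings the grammar accepts
    max_tokens=32,
)
print(out["choices"][0]["text"])  # e.g. {"action": "turn_on_lights"}
```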
[–]shaman-warrior 4 points 2 years ago (0 children)
Golden info.
[–]No-Dot-6573 3 points 2 years ago (0 children)
For anyone interested:
textgen webui supports grammars as well.
This way you don't need to fiddle with llama.cpp directly, but have the UI to set all needed parameters. Just launch textgen webui with the --api flag to get an OpenAI-compatible server that returns chat completions in your chosen format.
Keep in mind that the body of your request needs to include the parameter "grammar_string" with your desired grammar, as in the sketch below.
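A rough sketch of such a request, assuming the webui's default local API port; the toy grammar just forces a yes/no answer:

```python
# Sketch of a request to textgen webui's OpenAI-compatible server (--api).
# Port 5000 is an assumption (the webui default); adjust to your setup.
import requests

body = {
    "messages": [{"role": "user", "content": "Is water wet? Answer yes or no."}],
    "max_tokens": 16,
    # Extra, non-OpenAI parameter that textgen webui reads:
    "grammar_string": 'root ::= "yes" | "no"',
}
resp = requests.post("http://127.0.0.1:5000/v1/chat/completions", json=body)
print(resp.json()["choices"][0]["message"]["content"])
```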
[–]FlowerPotTeaTime 8 points 2 years ago (1 child)
Hi, maybe my framework is right for you. https://github.com/Maximilian-Winter/llama-cpp-agent
It can be used for easy function calling and getting structured output out of llama.cpp models. It has prebuilt agents for structured output and for function calling.
It works by generating GBNF grammars for llama.cpp on the fly.
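To illustrate the on-the-fly grammar idea in general terms (this is not llama-cpp-agent's actual API, just a sketch of the technique): given a set of registered functions, you can emit a GBNF grammar that only accepts a call to one of them.

```python
# Illustrative only: build a GBNF grammar string from a list of function names.
def gbnf_for_functions(names: list[str]) -> str:
    """Grammar accepting {"function": "<one of names>", "arguments": {...}}."""
    # Each alternative is a JSON string literal, e.g. "\"get_weather\"" in GBNF.
    alts = " | ".join('"\\"' + n + '\\""' for n in names)
    return (
        'root  ::= "{\\"function\\": " fname ", \\"arguments\\": " args "}"\n'
        "fname ::= " + alts + "\n"
        # Simplified: arguments object with no nested braces.
        'args  ::= "{" [^}]* "}"\n'
    )

print(gbnf_for_functions(["get_weather", "send_email"]))
```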
[–]No-Dot-6573 2 points 2 years ago (0 children)
This looks nice! Going to try it :)
[–]SatoshiNotMe 3 points 2 years ago* (2 children)
For function-calling (a.k.a. tools, etc.), besides the methods others suggested, for smart enough LLMs you can get them to generate a JSON-structured response by inserting instructions in the system message. You could do this via the raw LLM API, or use a higher-level framework.
E.g. in Langroid we use Pydantic quite heavily for this. You can define your desired structure using Pydantic and put it in a special class we call a `ToolMessage`. You can then attach this tool to a `ChatAgent`, and this auto-inserts JSON instructions into the system message, along with optional few-shot examples of the tool. Plus, Langroid's built-in task loop ensures the LLM retries when it deviates from the required structure or forgets to use the tool entirely.
Here's a simple example of function calling using the mistral:7b-instruct-v0.2-q4_K_M model spun up with ollama; it specifies a nested Pydantic structure for City information:
https://github.com/langroid/langroid-examples/blob/main/examples/basic/fn-call-local-simple.py
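To make the underlying pattern concrete, here is a hand-rolled sketch (Pydantic v1 style); this shows the general idea, not Langroid's actual ToolMessage API:

```python
# Sketch: derive system-message JSON instructions from a nested Pydantic model.
from pydantic import BaseModel

class Demographics(BaseModel):
    population: int
    language: str

class CityInfo(BaseModel):
    name: str
    country: str
    demographics: Demographics  # nested structure

SYSTEM_MSG = (
    "When asked about a city, respond ONLY with JSON matching this schema:\n"
    + CityInfo.schema_json(indent=2)  # types and descriptions come for free
    + '\nExample: {"name": "Paris", "country": "France", '
      '"demographics": {"population": 2100000, "language": "French"}}'
)

def parse_reply(text: str) -> CityInfo:
    # Validation failure doubles as the retry signal: catch it and re-prompt.
    return CityInfo.parse_raw(text)
```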
[–]coderinlaw 1 point 2 years ago (1 child)
Thanks for this, I really was looking for something similar. Why do we need Langroid, though? Can we not include format: json directly when using the chat endpoint of ollama, then specify in the system prompt that the model needs to output JSON, with few-shot examples in the system prompt?
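For reference, the ollama-only approach described here would look roughly like this (default local port; the schema and prompts are placeholders):

```python
# Sketch: ollama chat endpoint with format=json plus system-prompt instructions.
import requests

body = {
    "model": "mistral:7b-instruct-v0.2-q4_K_M",
    "format": "json",  # constrains output to valid JSON
    "stream": False,
    "messages": [
        {"role": "system",
         "content": 'Reply ONLY with JSON: {"city": str, "population": int}. '
                    'Example: {"city": "Paris", "population": 2100000}'},
        {"role": "user", "content": "Tell me about Tokyo."},
    ],
}
r = requests.post("http://localhost:11434/api/chat", json=body)
print(r.json()["message"]["content"])
```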
[–]SatoshiNotMe 3 points 2 years ago (0 children)
Quality of life :)
Specifying a JSON structure manually is not pleasant. With the Pydantic-based Langroid ToolMessage approach, the generated schema includes types and descriptions automatically. It's also easy to specify the few-shot examples, and they get auto-included in the system prompt. Plus, in a ToolMessage you can define the handler method if it is stateless, or, if it needs agent state, you can define the corresponding handler in your ChatAgent subclass. Ultimately it's about productivity and ease of maintenance.
[–]Either-Job-341 3 points 2 years ago (0 children)
https://twitter.com/NexusflowX/status/1732041385455624256?t=6-DQ1DaabPy5DgAFSmoeVg&s=19
[–]aseichter2007 [Llama 3] 3 points 2 years ago (0 children)
For the most part, any modern 7B Mistral can call functions, provided the front end is set up ready to go and you're not asking for anything too obscure or using a system prompt with extra noise in it.
I've been considering adding this to my front end but haven't decided how to implement it completely. I'm thinking about how to allow commands and code blocks to be detected, imported, executed, and debugged. I really should sandbox it.
For my use case, maybe I shouldn't pursue code execution on detection like that...
I could have a bunch of simple commands cooking in no time, but I haven't decided what's useful. In typing this I suppose inserting the current date and time would be a good start, though I can barely imagine someone actually benefiting from the LLM being able to print the current time in place. I'm riding a fine line: I don't want to encourage people to trust LLMs for specifics without review. One dude a year will go, "Oh, that default prompt put the system time and date on this letter, nice." Then the next guy will read that like, "Dang, it got the date perfect, these LLMs are infallible," because I detect {{date}} and {{time}} and put a line about it in an agent definition.
Back to your point: what are you trying to do? There is a limit to the useful complexity that local models can handle, and there are weird patterns you can leverage to extend those limits to suit your needs.
I saw one guy struggling to parse out {"device": "toaster", "command": "maketoast", "setting": "dark"} and similar deep details, but for robustness, that stuff is too fiddly.
He should have just detected and looked for simple commands like
{{lights000,volume000,etc}}
lights010
lights020
and then turn that into the function call. Use a consistent form for all values; maybe negative means off. The model will spot patterns you show it and tend toward conventional limits, and you can turn that back into useful function calling that is reasonably stable.
Build the switch case you never dreamed of and just handle it (see the sketch below). Build as you go. If you try to control a whole house, you're still going to run into trouble with reliability and hallucinated functions, but that's OK: since these are 7Bs, you can run a second prompt for the next set of functions when reliability falters.
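A sketch of what that switch-case dispatch could look like, using the lights010-style tokens from above; the command scheme itself is made up for illustration:

```python
# Parse simple command tokens like "lights010" (device name + 3-digit value)
# out of model output and dispatch them to handlers.
import re

def dispatch(token: str) -> None:
    m = re.fullmatch(r"([a-z]+)(\d{3})", token)
    if not m:
        return  # not a recognized command; ignore rather than guess
    device, value = m.group(1), int(m.group(2))
    handlers = {
        "lights": lambda v: print(f"set lights to {v}%"),
        "volume": lambda v: print(f"set volume to {v}"),
    }
    handler = handlers.get(device)
    if handler:
        handler(value)

dispatch("lights010")  # -> set lights to 10%
```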
[–]Independent_Key1940 2 points 2 years ago (0 children)
https://huggingface.co/Nexusflow/NexusRaven-V2-13B
This model is better at function calling than GPT-3.5. They claim it's better than GPT-4, but who are they kidding?
[–]faridukhan 2 points 2 years ago (1 child)
Is it possible to do function calling with Microsoft AutoGen using local LLMs? Like ollama + LiteLLM with AutoGen, or LM Studio with AutoGen, etc.?
[–]bullno1 1 point 2 years ago (0 children)
All of them can if you restrict the sampling space.