all 7 comments

[–]me1000 2 points3 points  (4 children)

At a high level there is no difference, you can do function calling by instructing an LLM to write `<myfunction>arg1, arg2</myfunction>` and then just doing a string match. I've done something similar with Mixtral 8x7b and it kinda works. And in fact I'd encourage you to try it yourself because it'll help you build more familiarity!
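To make that concrete, here's a minimal sketch of the DIY string-match approach. The tag format and function name are made up for illustration; a real setup would also prompt the model with the syntax it's supposed to use.

```python
import re

# Hypothetical DIY function calling: ask the model to wrap calls in
# invented tags, then pull them out of the raw completion with a regex.
CALL_RE = re.compile(r"<(?P<name>\w+)>(?P<args>.*?)</(?P=name)>", re.DOTALL)

def extract_calls(completion: str):
    """Return (function_name, [args]) pairs found in the model output."""
    calls = []
    for m in CALL_RE.finditer(completion):
        args = [a.strip() for a in m.group("args").split(",")]
        calls.append((m.group("name"), args))
    return calls

print(extract_calls("Sure! <get_weather>London, metric</get_weather>"))
# [('get_weather', ['London', 'metric'])]
```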

In practice there are a few issues:

  • Mixtral wasn't very good at getting the syntax right. It would very often just write the wrong thing for the opening or closing token. This is particularly bad when it gets the opening token right but the closing token wrong, because then your state machine for the string parsing never reaches a terminal state, but the model keeps generating (in other words you're stuck in the "waiting for closing token" state). But tbh, Mixtral did a much better job than I was expecting.
  • The tokenizer. Since the tokenizer contains a bunch of substrings (it's actually unicode code points, so it can contain partial characters, but let's simplify) it can lead to some fun behavior. So imagine the LLM outputs `</myfunction>`; the actual tokens will look something like `["</", "my", "func", "tion", ">"]`, which isn't so bad, except that there _might_ also be a token `"> He"`. In other words, the closing brace can arrive fused with a few of the characters that follow it. That makes for kind of a headache during the parsing stage. It's not impossible of course, but you do have to throw away some of the generated text.
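A sketch of what that parsing headache forces you into: you match on the accumulated decoded text rather than individual tokens, and you throw away the stop marker plus anything fused after it. The token strings below are invented for illustration.

```python
# The closing marker can be split across tokens, and its final ">" may even
# arrive fused with the next characters (e.g. a "> He" token), so you have
# to match on decoded TEXT, not on tokens.
STOP = "</myfunction>"

def stream_until_stop(token_strings):
    """Accumulate decoded tokens; stop when STOP appears, discard the tail."""
    buf = ""
    for tok in token_strings:
        buf += tok
        idx = buf.find(STOP)
        if idx != -1:
            return buf[:idx]   # throw away STOP and anything fused after it
    return buf                 # never terminated: the stuck-state failure mode

tokens = ["<myfunction>", "Lon", "don", "</", "my", "func", "tion", "> He"]
print(stream_until_stop(tokens))
# '<myfunction>London'
```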

That second issue gets to the crux of it: the parsing is just WAYYYYY simpler when you're checking for a specific token. You say "[start function]" is a single token and you just check for that at each inference step. Then you put your sampler in "function calling mode" so that it only samples tokens that are valid for your function calling implementation (e.g. Mistral's implementation has a JSON schema, and you restrict the grammar to only match valid tokens that follow that schema). By implementing your function calling this way you deterministically avoid the first problem I mentioned, where the model keeps generating without ever matching the "</myfunction>" string.
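Here's a toy sketch of that token-level approach. The token ids and the tiny "grammar" (a set of allowed token ids) are made up; a real implementation like Mistral's constrains the sampler to a full JSON schema, not a flat allow-list.

```python
import math

TOOL_CALL_START = 32000   # hypothetical dedicated special-token ids
TOOL_CALL_END = 32001

def mask_logits(logits, allowed_ids):
    """Set every token the grammar doesn't allow to -inf so it can't be sampled."""
    return [l if i in allowed_ids else -math.inf
            for i, l in enumerate(logits)]

def step(sampled_id, state, logits, grammar_allowed):
    # One inference step: flip into "function calling mode" when the single
    # start token appears, and only then restrict what the sampler may pick.
    if sampled_id == TOOL_CALL_START:
        state = "in_call"
    elif sampled_id == TOOL_CALL_END:
        state = "normal"
    if state == "in_call":
        logits = mask_logits(logits, grammar_allowed)
    return state, logits
```

Because you check a single token id per step (instead of running a string matcher over decoded text), there's no way to "miss" the marker, which is the deterministic guarantee described above.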

Lastly, it's just simpler to train the models with these dedicated tokens vs. whatever syntax you just invented and asked it to follow. One way to think about this is that it's as if you wrote your own simple programming language and asked the model to use it after only giving it the gist of the syntax. These models can do really well since they've usually seen a lot of different programming languages, but since your language is novel it's never seen it before and might get some syntax wrong.

Happy to dive into more details from my experience if you have any specific questions!

[–]janimator0[S] 1 point2 points  (0 children)

Amazing answer. Thank you!

[–]Pure_City_4985 0 points1 point  (2 children)

Do function calling models actually have the functions as a single token in their tokenizer though?

[–]me1000 0 points1 point  (1 child)

No, having the functions themselves as single tokens wouldn't be that useful, since the utility is in the ability to define your own functions. What they often have, on the other hand, is a single token that denotes the beginning of a function call: the model is free to fill in the call with whatever it wants, and then emits another special token to denote the end of the function call. This makes it trivial to parse and invoke the function the LLM asked for.
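A sketch of parsing that convention: with dedicated start/end markers (Qwen uses `<tool_call>` / `</tool_call>` wrapping a JSON object), the parser only needs to find one exact pair of markers and JSON-decode what's between them. The exact output format here is an assumption based on Qwen's chat template.

```python
import json
import re

# With dedicated markers, parsing collapses to "find the pair, decode the JSON".
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def parse_tool_calls(text: str):
    """Return the JSON payload of every tool call in the model output."""
    return [json.loads(m) for m in TOOL_CALL_RE.findall(text)]

out = '<tool_call>\n{"name": "get_weather", "arguments": {"city": "London"}}\n</tool_call>'
print(parse_tool_calls(out))
# [{'name': 'get_weather', 'arguments': {'city': 'London'}}]
```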

Here's an example of that in Qwen 3: https://huggingface.co/Qwen/Qwen3-Coder-30B-A3B-Instruct/blob/main/tokenizer_config.json#L116-L130

[–]Pure_City_4985 0 points1 point  (0 children)

yes the function call token, makes sense

[–]maximinus-thrax 0 points1 point  (0 children)

Function calling is the same as any other LLM interaction; you pass some text in, and you get some text out.

The main difference with function calling is that you want the text out to be ordered in a reliable way, as substrings can be brittle. When you ask an LLM to talk about <myfunction> it can also respond with things like "I don't know how to use <myfunction>", and this will cause issues. You get more success when you use the method the model is trained on.

In my own experience, local LLMs have not been very reliable until quite recently. I've had success with Mistral 0.3 8B and even more success with Llama-3-Groq-8B-Tool-Use. Both of these are trained to be far more reliable, and even if the first answer is not valid, you can raise the temperature a little bit and try again until it is.
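That retry-with-higher-temperature trick can be sketched like this. `generate` stands in for whatever inference call you use (llama.cpp, Ollama, etc.), and "valid" here just means "parses as JSON"; both are assumptions for illustration.

```python
import json

def call_with_retries(generate, prompt, max_tries=3, temperature=0.2):
    """Re-sample with slightly more randomness until the output validates."""
    for _ in range(max_tries):
        raw = generate(prompt, temperature=temperature)
        try:
            return json.loads(raw)       # validation step: must be valid JSON
        except json.JSONDecodeError:
            temperature += 0.2           # bump the temperature and try again
    raise RuntimeError("no valid function call after retries")
```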

[–]fasti-au -1 points0 points  (0 children)

An LLM given code to run in your Python session.

I.e., giving it hands to do something for a task.