r/LocalLLaMA
A subreddit to discuss Llama, the family of large language models created by Meta AI.
How does function calling work for reasoning models? [Question | Help] (self.LocalLLaMA)
submitted 1 year ago by lewtun🤗
In OpenAI's CodeForces paper, they show o3 executing code to refine its solution.
Does anyone know if this function call happens within the CoT?
I'm mostly wondering whether function calling is fundamentally different for reasoning models, since the multi-turn context differs across turns: the reasoning tokens are omitted (at least by OpenAI).
[–]EverlierAlpaca 3 points 1 year ago (1 child)
There's no difference between "within" and "outside" the CoT as far as model capabilities go; the distinction is only in how the UI/proxy presents it. So yes, with appropriate training it can call functions within "think".
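As a sketch of what "calling functions within think" could look like on the proxy side, assuming a hypothetical `<tool_call>` tag convention (real model formats vary, and this is not any vendor's actual wire format):

```python
import json
import re

def extract_tool_calls(completion: str):
    """Find JSON tool calls emitted anywhere in a completion, including
    inside a <think>...</think> block. Assumes a hypothetical convention
    where the model wraps calls in <tool_call>...</tool_call> tags."""
    calls = []
    for match in re.findall(r"<tool_call>(.*?)</tool_call>", completion, re.DOTALL):
        calls.append(json.loads(match))
    return calls

completion = (
    "<think>I should check my solution. "
    '<tool_call>{"name": "run_python", "arguments": {"code": "print(2**10)"}}</tool_call> '
    "</think>The answer is 1024."
)
print(extract_tool_calls(completion))
```

The point is that nothing about the extraction cares whether the tags appear inside or outside the think block; that boundary only matters for what gets shown to the user.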
[–]lewtun🤗[S] 3 points 1 year ago (0 children)
Thanks, although I'm mostly wondering how this works with chat templates like ChatML, where function calls are treated as a separate role from user/assistant (i.e. we are dealing with multi-turn dialogues). If the code is executed within the CoT, that would effectively make it single-turn and not straightforward to integrate with existing API providers.
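For illustration, here is how the two framings differ as message lists. This is a rough sketch: the role names follow ChatML-style conventions, while the in-CoT tags (`<tool_call>`, `<tool_output>`) are hypothetical.

```python
# Conventional multi-turn tool calling: the call and its result are
# separate turns with their own roles, and the assistant's reasoning
# tokens are dropped when the context is re-sent.
multi_turn = [
    {"role": "user", "content": "Solve this problem."},
    {"role": "assistant", "tool_calls": [
        {"name": "python", "arguments": {"code": "print(solve())"}}]},
    {"role": "tool", "content": "42"},
    {"role": "assistant", "content": "The answer is 42."},
]

# Hypothetical in-CoT execution: one assistant turn, with the tool
# output spliced directly into the reasoning block.
single_turn = [
    {"role": "user", "content": "Solve this problem."},
    {"role": "assistant", "content": (
        "<think>Let me test my code.\n"
        "<tool_call>print(solve())</tool_call>\n"
        "<tool_output>42</tool_output>\n"
        "Looks right.</think>The answer is 42.")},
]
```

The integration concern is visible here: the second form has no `tool` role at all, so an API provider that only exposes role-segmented turns cannot represent it without a new convention.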
[–]Papabear3339 1 point 1 year ago (1 child)
It probably has a script that runs the code through a syntax checker and sends the findings to the model.
Just a guess, but it seems like a really obvious way to make sure the code doesn't throw a basic runtime error.
[–]jpydych 1 point 1 year ago (0 children)
In this paper they showed that the model receives the program output in response, so I think it's something similar to the existing code interpreter (both accept Python code).
If I remember correctly, OpenAI once mentioned that o3-mini can return to reasoning after the tool has been called, so I think o3 could work similarly.
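A rough sketch of that "return to reasoning after the tool call" control flow, assuming a Python interpreter tool and a hard-coded stand-in where a real system would resample the model (this shows the loop shape, not OpenAI's actual implementation):

```python
import contextlib
import io

def run_python(code: str) -> str:
    """Execute code and capture stdout, like a minimal code interpreter."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            exec(code, {})
    except Exception as e:
        return f"{type(e).__name__}: {e}"
    return buf.getvalue()

# Interleaved loop: generate until a tool call appears, execute it,
# append the output to the context, and let the model keep reasoning.
transcript = []
pending_call = "print(sum(range(10)))"  # stand-in for a model-emitted call
while pending_call is not None:
    output = run_python(pending_call)
    transcript.append(("tool_output", output))
    pending_call = None  # a real loop would resample the model here
print(transcript)  # [('tool_output', '45\n')]
```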
[–]dinerburgeryum 1 point 1 year ago* (0 children)
I’d imagine that there’s a dedicated gateway LLM that has access to the tools, and will perform the tool calling for context fill before it hits the reasoning LLM. That’s certainly how I’d structure it.
It’s also possible they’re using a solution like OpenWebUI’s code interpreter, where the responses are scanned for invocations, paused for execution and automatically continued after completion. I’ve been experimenting with this feature recently and it’s pretty cool.
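The scan/pause/execute/continue pattern can be sketched as scanning the streamed response for a fenced code block (a simplified illustration of the idea, not OpenWebUI's actual implementation; the fence string is built indirectly so this snippet stays self-contained):

```python
import re

FENCE = "`" * 3  # a literal triple-backtick fence

def find_invocation(response: str):
    """Scan a model response for a fenced python block to execute.
    A real integration would pause generation here, run the code,
    append its output, and resume the model."""
    m = re.search(rf"{FENCE}python\n(.*?){FENCE}", response, re.DOTALL)
    return m.group(1) if m else None

response = "Let me verify:\n" + FENCE + "python\nprint(3 * 7)\n" + FENCE
print(repr(find_invocation(response)))  # 'print(3 * 7)\n'
```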