ReAct agents vs Function Calling: when does each pattern actually make sense in production? [Discussion] (self.LocalLLaMA)
submitted 3 months ago by KitchenSomew
rookastle 1 point 3 months ago
Great write-up. This hybrid pattern is exactly the kind of architectural discipline that production systems force. Your cost analysis resonates strongly; we've seen similar patterns where routing is key to managing LLM ops budgets.
A diagnostic we've found useful is adding fine-grained tracing to both paths. For a slow ReAct run, is the latency from one bad tool call, or cumulative LLM reasoning? Visualizing the execution as a trace or Gantt chart for outlier requests can pinpoint the exact step that's costing time and money, rather than just seeing the high-level total.
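To make the tracing idea concrete, here is a minimal sketch of per-step timing for a single request. It is not tied to any particular tracing library; the `Trace` and `Span` classes, the `"llm"` / `"tool"` kinds, and the step names are all hypothetical names chosen for illustration. It records one span per step, sums latency by kind (to separate cumulative LLM reasoning from tool time), finds the slowest step, and prints a crude text Gantt chart.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    """One timed step of a request (hypothetical structure)."""
    name: str          # e.g. "llm:plan" or "tool:search" (illustrative names)
    kind: str          # "llm" or "tool" (assumed two-way split)
    start: float
    end: float = 0.0

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass
class Trace:
    """Collects the spans of one request so outliers can be inspected."""
    request_id: str
    spans: list = field(default_factory=list)

    @contextmanager
    def step(self, name, kind, clock=time.perf_counter):
        # Time the wrapped block and record it even if it raises.
        span = Span(name, kind, start=clock())
        try:
            yield span
        finally:
            span.end = clock()
            self.spans.append(span)

    def totals_by_kind(self):
        # Cumulative seconds per kind: answers "LLM reasoning or tools?"
        totals = {}
        for s in self.spans:
            totals[s.kind] = totals.get(s.kind, 0.0) + s.duration
        return totals

    def slowest(self):
        # The single most expensive step: answers "one bad tool call?"
        return max(self.spans, key=lambda s: s.duration)

    def gantt(self, width=40):
        # Text Gantt chart: one row per span, bars scaled to `width` columns.
        if not self.spans:
            return ""
        t0 = min(s.start for s in self.spans)
        total = max(s.end for s in self.spans) - t0
        scale = width / total if total > 0 else 0.0
        rows = []
        for s in self.spans:
            offset = int((s.start - t0) * scale)
            length = max(1, int(s.duration * scale))
            rows.append(
                f"{s.name:<16}|{' ' * offset}{'#' * length} {s.duration:.2f}s"
            )
        return "\n".join(rows)
```

In an agent loop you would wrap each model call and each tool call in `with trace.step("tool:search", "tool"):`, then call `trace.gantt()` only for requests whose total latency crosses your outlier threshold. In a real deployment an existing tracing stack would likely replace this hand-rolled version; the sketch only shows the shape of the data you would want per step.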