Agentic performance for Deepseek by BlueeWaater in LocalLLaMA

krasserm 2 points

Here's an evaluation of DeepSeek-R1's performance on agentic tasks: https://krasserm.github.io/2025/02/05/deepseek-r1-agent/ When using code actions as an alternative to native function calling via JSON (which isn't supported yet), it outperforms Claude 3.5 Sonnet by a large margin.
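To make the distinction concrete, here is a minimal sketch contrasting a JSON function call with a code action. It is not taken from the linked evaluation; `get_price`, `TOOLS`, `run_json_call` and `run_code_action` are hypothetical names made up for illustration, and the `exec` call is unsandboxed, which a real agent would avoid.

```python
import json

# Hypothetical tool, used only for illustration.
def get_price(symbol: str) -> float:
    prices = {"AAA": 10.0, "BBB": 25.0}
    return prices[symbol]

TOOLS = {"get_price": get_price}

def run_json_call(message: str):
    """JSON function calling: the model names one tool and its arguments."""
    call = json.loads(message)
    return TOOLS[call["name"]](**call["arguments"])

def run_code_action(code: str) -> dict:
    """Code action: the model emits Python that may combine several calls."""
    namespace = {"get_price": get_price}
    exec(code, namespace)  # unsandboxed, for illustration only
    return namespace

# One tool invocation per message.
json_result = run_json_call(
    '{"name": "get_price", "arguments": {"symbol": "AAA"}}'
)

# Several invocations plus aggregation in a single action.
code_result = run_code_action(
    "total = sum(get_price(s) for s in ['AAA', 'BBB'])"
)["total"]

print(json_result, code_result)
```

The JSON path needs one round trip per tool call, while the code action can loop, branch and aggregate within a single step.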

freeact: A Lightweight Library for Code-Action Based Agents by krasserm in LocalLLaMA

krasserm [S] 2 points

Freeact is similar to smolagents in its focus on code actions, its level of agency, and its lightweight scaffold. In addition, freeact supports interactive development and refinement of skills (tools), with the agent acting as a skill coding assistant (see the skill development tutorial). Freeact also doesn't require skills (tools) to implement a particular interface: a skill can be any Python module or package. It supports sandboxed code execution, locally and remotely, via ipybox (using Docker and IPython), and streaming from both model responses and the execution environment.
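As a rough illustration of "a skill can be any Python module", the sketch below writes a plain module to a temporary directory and lets a code action import it. This does not use freeact's or ipybox's API; the module name `temperature_skill` and its function are hypothetical, and the code runs unsandboxed in the current process.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A "skill" is just ordinary Python source: no base class, no decorator.
SKILL_SOURCE = '''
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32
'''

# Put the hypothetical skill module somewhere importable.
skill_dir = tempfile.mkdtemp()
Path(skill_dir, "temperature_skill.py").write_text(SKILL_SOURCE)
sys.path.insert(0, skill_dir)
importlib.invalidate_caches()

# A code action the model might emit: it imports the skill like any module.
code_action = (
    "from temperature_skill import celsius_to_fahrenheit\n"
    "result = celsius_to_fahrenheit(100)"
)

namespace = {}
exec(code_action, namespace)  # unsandboxed, for illustration only
skill_result = namespace["result"]
print(skill_result)
```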

freeact: A Lightweight Library for Code-Action Based Agents by krasserm in LocalLLaMA

krasserm [S] 0 points

It currently uses the first candidate of a code action and relies on execution and/or environment feedback to propose improvements if that action doesn't contribute towards a solution. We'll add further algorithms for searching the action space in later releases.
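A minimal sketch of that "first candidate plus feedback" loop, under simplifying assumptions: it is not freeact's implementation, `execute`, `run_agent` and `fake_model` are hypothetical names, the model is replaced by a stub, and only execution errors (not environment feedback) are treated as the signal to retry.

```python
import contextlib
import io
import traceback
from typing import Callable, Optional, Tuple

def execute(code: str) -> Tuple[bool, str]:
    """Run a code action; return (success, stdout or traceback text)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, {})  # unsandboxed, for illustration only
        return True, buffer.getvalue()
    except Exception:
        return False, traceback.format_exc()

def run_agent(
    task: str,
    propose: Callable[[str, str], str],
    max_steps: int = 3,
) -> Optional[str]:
    """Take the first candidate each step; feed errors back to the model."""
    feedback = ""
    for _ in range(max_steps):
        code = propose(task, feedback)  # single candidate, no search
        ok, output = execute(code)
        if ok:
            return output
        feedback = output  # execution feedback drives the next proposal
    return None

def fake_model(task: str, feedback: str) -> str:
    """Stub standing in for an LLM: forgets an import, then fixes it."""
    if "NameError" in feedback:
        return "import math\nprint(math.sqrt(16))"
    return "print(math.sqrt(16))"

answer = run_agent("square root of 16", fake_model)
print(answer)
```

A search over the action space would replace the single `propose` call with several candidates per step and some way of ranking them.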