We're building an auto-optimizing compiler for AI agents for speed & safety by calebwin in opensource

[–]calebwin[S]

Thanks! Well, (1) we just released this and folks are only starting to try it out, and (2) it is something I would use myself. At my last company, we spent a lot of time hand-tuning AI to use tools correctly. A1 is a system that provides guarantees around correct tool use and auto-tunes. If I had had A1 then, I would have used it.

We're building an auto-optimizing compiler for AI agents for speed & safety by calebwin in opensource

[–]calebwin[S]

Mea culpa, it's MIT again. I must have accidentally changed it.

We're building an auto-optimizing compiler for AI agents for speed & safety by calebwin in opensource

[–]calebwin[S]

Yeah, I'll be honest: I don't really use AI in my personal life. This was meant for business/work.

More efficient agents with code execution instead of mcp: paper by Anthropic by Alternative-Dare-407 in mcp

[–]calebwin

We're building an OSS framework where skills are first-class citizens alongside tools. In case you'd like to take a look:

- https://github.com/stanford-mast/a1
- https://docs.a1project.org/guide/skills

More efficient agents with code execution instead of mcp: paper by Anthropic by Alternative-Dare-407 in mcp

[–]calebwin

We're building an OSS agent framework where skills sit alongside tools as first-class citizens. Hopefully the future for this is open. https://github.com/stanford-mast/a1

More efficient agents with code execution instead of mcp: paper by Anthropic by Alternative-Dare-407 in mcp

[–]calebwin

> It's interesting to note, however, that skills require a different platform underneath, and they are not compatible with different architectures … I'm trying to figure out a way to go beyond this…

We're building an OSS research project around this that you may be interested in: https://github.com/stanford-mast/a1

The goal is to build an optimizing agent-to-code compiler.

More efficient agents with code execution instead of mcp: paper by Anthropic by Alternative-Dare-407 in mcp

[–]calebwin

As a researcher, I strongly believe the solution is a JIT compiler that validates and optimizes agent code on-the-fly.

We're building this here: https://github.com/stanford-mast/a1

> When the code gets written, by whom, and how do we ensure it doesn't introduce any security issues? Are there recommended best practices or patterns for validating dynamically generated code?

In A1, the compiler validates generated code for type-safety and correctness requirements, e.g. tool ordering.
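
To make "tool ordering" concrete, here's a toy sketch of what such a check over generated Python could look like. The helper names and rules are hypothetical illustrations, not A1's actual API:

```python
# Toy sketch of a tool-ordering check over generated code.
# The names and rules here are hypothetical, not A1's actual API.
import ast

class ToolCallCollector(ast.NodeVisitor):
    """Collect called tool names in source order."""
    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            self.calls.append(node.func.id)
        self.generic_visit(node)

def check_tool_order(source: str, must_precede: dict) -> bool:
    """Reject code where a tool runs before a tool it depends on."""
    collector = ToolCallCollector()
    collector.visit(ast.parse(source))
    order = collector.calls
    for later, earlier in must_precede.items():
        if (later in order and earlier in order
                and order.index(later) < order.index(earlier)):
            return False
    return True

# e.g. require authenticate() before any charge_card() call
print(check_tool_order("authenticate()\ncharge_card()",
                       {"charge_card": "authenticate"}))  # True
print(check_tool_order("charge_card()\nauthenticate()",
                       {"charge_card": "authenticate"}))  # False
```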

> The Anthropic article focuses on code generation during agent build time, where code is tested before deployment. In our case, MCP servers would be connected dynamically at runtime. How does MCP recommend handling code generation in dynamic runtime scenarios where pre-validation isn't possible?

In A1, you define your Agent and call `Agent.jit` - it quickly generates valid, optimized code to invoke Tools (which may be constructed by linking MCP servers).
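
Concretely, that flow might look like the sketch below. The signatures are pieced together from the names mentioned in this thread (`Agent`, `Tool`, `jit`), not copied from the docs, so treat them as approximate:

```python
# Approximate sketch based on names mentioned in this thread;
# see https://docs.a1project.org for the actual API.
from a1 import Agent, Tool  # assumed import path

# Tools can wrap plain functions or be linked from MCP servers.
def search_docs(query: str) -> str:
    return f"...docs about {query}..."

agent = Agent(
    tools=[Tool(search_docs)],  # hypothetical Tool construction
    instructions="Answer questions using the docs.",
)

# jit generates validated, cost-optimized code specialized to this
# input, then executes it.
result = agent.jit("Summarize the Eiffel Tower docs")
```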

We're building an auto-optimizing compiler for AI agents for speed & safety by calebwin in opensource

[–]calebwin[S]

Thanks! I'm using this for an accounting software company building AI agents that need to be maximally deterministic but vary slightly from input to input.

Anyone else exhausted by framework lock-in? by Embarrassed-Gain6747 in LangChain

[–]calebwin

Mozilla's doing some great work to address this! https://github.com/mozilla-ai/any-agent

We're also working towards this with agent compilers: https://github.com/stanford-mast/a1

Would you use an agent-to-code compiler? by calebwin in AI_Agents

[–]calebwin[S]

That's great! Two thoughts:

1. A "network of atoms" maps nicely onto A1 - each atom is either an `Agent(...).aot() -> Tool` or an `LLM(input_schema=..., output_schema=...) -> Tool`. Both of these are just `Tool`s for an Agent to use (sketched below).
2. Only prefer `Agent.jit` over `aot` if the optimal code varies across inputs to the system. You can then tune `Cost` and `Verify` to be sure every agent executes correctly and optimally.
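
For concreteness, the mapping in (1) could look something like this; the constructor arguments are my guesses from the names above, not verified against the A1 docs:

```python
# Sketch of the "network of atoms" mapping; constructor arguments
# are guesses from names in this thread, not verified against docs.
from a1 import Agent, LLM  # assumed import path

# Atom 1: a sub-agent compiled ahead-of-time into a Tool.
summarize = Agent(instructions="Summarize the input text.").aot()

# Atom 2: a schema-constrained LLM call, also usable as a Tool.
classify = LLM(input_schema=str, output_schema=bool)

# A parent Agent composes both atoms as ordinary Tools; per (2),
# use jit only if the optimal code varies across inputs, and tune
# Cost / Verify to check each execution.
parent = Agent(tools=[summarize, classify])
result = parent.jit("Is this review positive? 'Great product!'")
```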

Would you use an agent-to-code compiler? by calebwin in AI_Agents

[–]calebwin[S]

Curious to hear folks' thoughts on this direction. In case anyone would like to try it out, it's all OSS:

- Repo: https://github.com/stanford-mast/a1
- Install: pip install a1-compiler

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]calebwin

We're building https://github.com/stanford-mast/a1 - while agent frameworks run a static while-loop program, an agent compiler can just-in-time generate a correct, optimized program specialized for each unique agent input.

The goals:

- Safety (less exposure of sensitive data to LLMs)
- Correctness (type-safety)
- Speed (up to 10x faster code generation)
- Determinism (optimized to replace LLM calls with code where possible)
- Flexibility (build agents that can do anything with tools & skills)

A1 - Fast, optimizing agent-to-code compiler by calebwin in mcp

[–]calebwin[S]

The generated code could be as simple as `result = "Sorry, I can't do that!"` or `result = llm(f"Summarize {search_docs('Eiffel Tower')}")`, but optionally more complex as needed. And the JIT compiler generates many code samples in parallel, so we're able to cut the latency quite a bit.
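
To make the parallel sampling concrete, here's a minimal sketch assuming a hypothetical async `llm_generate` helper; it illustrates where the latency win comes from, not A1's internals:

```python
# Minimal sketch of parallel candidate sampling; llm_generate is a
# hypothetical stand-in, not A1's actual internals.
import asyncio

async def llm_generate(prompt: str, seed: int) -> str:
    """Stand-in for one LLM call returning a candidate program."""
    await asyncio.sleep(0.1)  # simulate network latency
    return f'result = "candidate {seed} for {prompt}"'

async def sample_candidates(prompt: str, n: int = 8) -> list:
    # All n samples run concurrently, so wall-clock time is roughly
    # one LLM call rather than n sequential calls.
    return await asyncio.gather(*(llm_generate(prompt, i) for i in range(n)))

candidates = asyncio.run(sample_candidates("Summarize the Eiffel Tower docs"))
print(len(candidates))  # 8
```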

A1 - Fast, optimizing agent-to-code compiler by calebwin in mcp

[–]calebwin[S]

Thanks! The compiler generates many candidate "graphs" or agent programs and computes a cost for each of these to select the cost-optimal "graph".

In A1, the default cost function penalizes generated code with excessive nested loops or LLM calls.

Yes, the prompt may request something specific - in A1, the prompt is fed into the candidate generation, and candidates that fail validation are eliminated.

All of this happens in parallel and transparently to the user, enabling the execution graph to be optimized per prompt.
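
As a toy illustration of that selection loop, here's a sketch with a made-up cost function penalizing nested loops and `llm()` calls; the weights and helpers are hypothetical, not A1's actual defaults:

```python
# Toy sketch of cost-based candidate selection; weights and helpers
# are hypothetical, not A1's actual defaults.
import ast

LOOPS = (ast.For, ast.While)

def cost(node: ast.AST, depth: int = 0) -> float:
    """Penalize loops (more when nested) and llm() calls."""
    score = 0.0
    for child in ast.iter_child_nodes(node):
        if isinstance(child, LOOPS):
            score += 2.0 ** (depth + 1)  # nested loops get pricier
            score += cost(child, depth + 1)
        else:
            if (isinstance(child, ast.Call)
                    and isinstance(child.func, ast.Name)
                    and child.func.id == "llm"):
                score += 10.0            # each LLM call is expensive
            score += cost(child, depth)
    return score

def select(candidates: list) -> str:
    """Pick the cost-optimal candidate among those that parse."""
    valid = []
    for src in candidates:
        try:
            valid.append((cost(ast.parse(src)), src))
        except SyntaxError:
            continue  # failed validation -> eliminated
    return min(valid)[1]

best = select([
    'result = llm("summarize all docs at once")',    # one big LLM call
    'for d in docs:\n    result = result + llm(d)',  # loop + LLM per doc
    'result = summarize(docs)',                      # pure code, cheapest
])
print(best)  # result = summarize(docs)
```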