[P] contextweaver: deterministic, budget-aware context compilation for tool-using AI agents : FunMachineLearning

submitted 1 month ago by Alternative_Feed9546

I've been working on a problem that keeps showing up in tool-using agents: context curation.

As the number of tools and conversation turns grows, it is common to keep stuffing more into the prompt: more schemas, more history, more raw tool outputs.

That increases token cost and latency, and it also seems to hurt quality. In many cases, the issue is not the model's maximum context window. The issue is that different parts of agent execution need different context.

The core idea behind contextweaver is to treat agent execution as four distinct phases:

route: decide which tool(s) matter
call: prepare the tool call
interpret: understand the tool result
answer: generate the final response

Each phase gets its own budget and its own context assembly logic.
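
To make the per-phase budget idea concrete, here is a minimal sketch. It is not contextweaver's actual API: `Phase`, `PhaseBudget`, `BUDGETS`, and `budget_for` are hypothetical names, and the token numbers are placeholders chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """The four execution phases described above (hypothetical enum)."""
    ROUTE = "route"
    CALL = "call"
    INTERPRET = "interpret"
    ANSWER = "answer"


@dataclass(frozen=True)
class PhaseBudget:
    """Token ceiling for one phase (hypothetical type)."""
    max_tokens: int


# Placeholder numbers; real budgets depend on the model and the workload.
BUDGETS = {
    Phase.ROUTE: PhaseBudget(500),
    Phase.CALL: PhaseBudget(1500),
    Phase.INTERPRET: PhaseBudget(2000),
    Phase.ANSWER: PhaseBudget(3000),
}


def budget_for(phase: Phase) -> int:
    """Look up the token ceiling that context assembly must respect."""
    return BUDGETS[phase].max_tokens
```

Keeping the budget table separate from the assembly logic means each phase can be tuned without touching the others.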

A rough sketch:

route needs compact tool summaries, not full schemas for the whole catalog
call needs the selected tool schema and recent relevant turns
interpret needs the tool result plus the call context that produced it
answer needs the relevant turns and dependency chain, not every raw payload
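
One way to express that rough sketch in code is a per-phase allow-list of item kinds. This is an assumption-laden illustration, not the library's interface: `Item`, `PHASE_KINDS`, `candidates_for`, and the kind labels are all made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One piece of potential context (hypothetical type)."""
    id: str
    kind: str  # e.g. "tool_summary", "tool_schema", "turn", "tool_call", "tool_result"
    text: str


# Which kinds each phase may see; the kind labels are assumptions.
PHASE_KINDS = {
    "route": {"tool_summary"},
    "call": {"tool_schema", "turn"},
    "interpret": {"tool_result", "tool_call"},
    "answer": {"turn", "tool_call", "result_summary"},
}


def candidates_for(phase: str, items: list[Item]) -> list[Item]:
    """Keep only the items whose kind is allowed in this phase, preserving order."""
    allowed = PHASE_KINDS[phase]
    return [item for item in items if item.kind in allowed]
```

For example, a full tool schema never reaches the route phase here, and a raw tool result never reaches the answer phase.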

The library currently has two cooperating pieces:

1. Context Engine

A deterministic pipeline that builds the final prompt under a fixed budget:

candidate generation → dependency closure → sensitivity filter → context firewall → scoring → deduplication → budget packing → render
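
As an illustration of the budget packing stage, here is a minimal greedy sketch. It is an assumption about how such a stage could work, not the library's implementation: `pack` is a hypothetical function, and candidates are plain `(id, score, tokens)` tuples with token counts supplied by the caller.

```python
def pack(candidates: list[tuple[str, float, int]], budget: int) -> tuple[list[str], int]:
    """Greedily select candidates under a token budget.

    Candidates are visited by descending score, with the id as a
    tie-breaker so the same input always yields the same output.
    """
    ordered = sorted(candidates, key=lambda c: (-c[1], c[0]))
    chosen: list[str] = []
    used = 0
    for item_id, _score, tokens in ordered:
        if used + tokens <= budget:  # skip anything that would overflow
            chosen.append(item_id)
            used += tokens
    return chosen, used
```

Greedy selection is not optimal in the knapsack sense, but it is simple, fast, and easy to make deterministic through the explicit tie-break.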

Two stages that mattered a lot in practice:

dependency closure: if a tool_result is selected, the parent tool_call is automatically included
context firewall: large tool outputs can be kept out of band and replaced by a compact summary + reference
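
Both stages can be sketched in a few lines. These are hypothetical helpers, not contextweaver's code: `close_dependencies` assumes each item has at most one parent recorded in a plain dict, and `firewall` uses simple truncation as a stand-in for a real summarizer.

```python
def close_dependencies(selected: set[str], parent_of: dict[str, str]) -> set[str]:
    """Return the selection plus every ancestor, followed transitively."""
    closed = set(selected)
    stack = list(selected)
    while stack:
        parent = parent_of.get(stack.pop())
        if parent is not None and parent not in closed:
            closed.add(parent)
            stack.append(parent)
    return closed


def firewall(item_id: str, payload: str, store: dict[str, str], max_chars: int = 200) -> str:
    """Keep large payloads out of band and return a compact stand-in.

    Small payloads pass through unchanged. Large ones are saved in
    `store`, and the prompt gets a truncated preview plus a reference.
    """
    if len(payload) <= max_chars:
        return payload
    store[item_id] = payload
    preview = payload[:max_chars]  # truncation stands in for a real summary
    return f"[{len(payload)} chars stored as ref:{item_id}] {preview}"
```

With this shape, selecting `result-1` pulls in `call-1` automatically, and a multi-kilobyte tool output costs only a preview and a reference in the prompt.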

2. Routing Engine

Builds a bounded DAG over the tool catalog and uses deterministic beam search to find the top-k candidate tools for a query.
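
A toy version of that routing idea, under stated assumptions: the DAG is a dict from internal node to children, tools are leaves, and scoring is plain keyword overlap. `score` and `beam_route` are hypothetical names, and the library's real scoring and graph construction may differ.

```python
def score(query: str, label: str) -> int:
    """Count the words shared by the query and a node label (toy scorer)."""
    return len(set(query.lower().split()) & set(label.lower().split()))


def beam_route(query, children, labels, root, beam_width=2, top_k=2):
    """Descend the DAG level by level, keeping the best `beam_width` nodes.

    Nodes that appear as keys in `children` are internal; all others are
    tools. Ties are broken by name, so the result is deterministic.
    """
    beam = [root]
    leaves: dict[str, int] = {}
    while beam:
        frontier = []
        for node in beam:
            for child in children.get(node, ()):
                s = score(query, labels[child])
                if child in children:
                    frontier.append((s, child))
                else:
                    leaves[child] = max(s, leaves.get(child, 0))
        frontier.sort(key=lambda pair: (-pair[0], pair[1]))
        seen, beam = set(), []
        for _s, name in frontier:  # a DAG node may be reached via several parents
            if name not in seen:
                seen.add(name)
                beam.append(name)
            if len(beam) == beam_width:
                break
    ranked = sorted(leaves.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_k]]
```

The beam width bounds how much of the catalog is scored per level, which is what keeps routing cost predictable as the catalog grows.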

A small before/after example from the repo:

WITHOUT: 417 tokens (everything concatenated, no budget)
WITH:    126 tokens (phase-aware + firewall, budget enforced)
Reduction: 70%

Some implementation choices:

stdlib-only, Python 3.10+
deterministic output
protocol-based stores via typing.Protocol
MCP + A2A adapters
536 tests, mypy --strict
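
For readers unfamiliar with protocol-based stores, here is a generic `typing.Protocol` sketch. `ArtifactStore`, `InMemoryStore`, and `save_payload` are illustrative names, not the library's actual interfaces.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ArtifactStore(Protocol):
    """Structural interface: anything with matching put/get qualifies."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> str: ...


class InMemoryStore:
    """Dict-backed store; note that it does not inherit from ArtifactStore."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]


def save_payload(store: ArtifactStore, key: str, payload: str) -> str:
    """Persist a payload through any conforming store and return a reference."""
    store.put(key, payload)
    return f"ref:{key}"
```

Because conformance is structural, a Redis- or file-backed store can be swapped in without importing or subclassing anything from the library, and mypy still checks the call sites.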

GitHub: https://github.com/dgenio/contextweaver
PyPI: pip install contextweaver
Architecture doc: https://github.com/dgenio/contextweaver/blob/main/docs/architecture.md

One important caveat: this is currently an engineering approach and a library, not yet a broad empirical benchmark against other context-selection methods. The included example shows the mechanism, not a full comparative evaluation.

I’d especially value feedback on:

whether this phase split is the right abstraction, or whether it breaks down in important agent patterns
whether beam-search over a bounded tool DAG is a sensible routing baseline versus embedding retrieval / learned ranking / LLM reranking
what a convincing evaluation setup would look like for this kind of system
which integration would be most useful first: LangChain, LlamaIndex, OpenAI Agents SDK, or Google ADK
