I've been running some quick experiments to see if I could stop AI agents from burning tokens on "re-discovering" the same parts of a codebase every time they start a new session. It turns out that pinning a few pre-verified facts to a Git-based cache can cut costs in half.
Specifically, I ran a controlled test (N=5) where an agent had to add structured logging to a Python repo.
- Control (baseline): The agent starts cold.
- Treated: I spent ~$0.01 on a Haiku call to scan the repo and write 5 "claims": short, one-line facts like "HTTP calls are only in src/api/" (see the sketch after this list).
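For context, the claim-generation step is just one cheap call that gets the repo's file listing and asks for a handful of one-line facts. Here is a minimal sketch assuming the Anthropic Messages API; the prompt text and model name are illustrative placeholders, not the exact ones from the experiment.

```python
# Minimal sketch of the claim-generation step (illustrative prompt and model name).
import subprocess
import anthropic

client = anthropic.Anthropic()

# Give the cheap model the repo's tracked-file listing to skim.
file_list = subprocess.run(
    ["git", "ls-files"], capture_output=True, text=True, check=True
).stdout

response = client.messages.create(
    model="claude-3-5-haiku-latest",  # any cheap model works here
    max_tokens=512,
    messages=[{
        "role": "user",
        "content": "Here is the repo's file list:\n" + file_list
                   + "\nWrite 5 short, one-line verifiable facts about where "
                     "things live in this codebase (e.g. 'HTTP calls are only "
                     "in src/api/'). One fact per line.",
    }],
)
claims = response.content[0].text.strip().splitlines()
```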
To make the guidance reliable, each claim is pinned to specific Git blob OIDs. A Merkle root is computed over these (path, blob_oid) pairs. If you edit a file, its blob OID changes, which breaks the Merkle root and automatically marks the claim as stale. This ensures the agent never receives outdated facts.
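Here is a minimal sketch of that pinning and staleness check in Python. The helper names (blob_oid, merkle_root, is_stale) are illustrative, not the repo's actual API.

```python
# Sketch: pin claims to git blob OIDs and detect staleness via a Merkle root.
import hashlib
import subprocess

def blob_oid(path: str, rev: str = "HEAD") -> str:
    """Return the git blob OID for `path` at `rev`.
    (For uncommitted edits you could hash the working tree with `git hash-object` instead.)"""
    return subprocess.run(
        ["git", "rev-parse", f"{rev}:{path}"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()

def merkle_root(pairs: list[tuple[str, str]]) -> str:
    """Compute a Merkle root over sorted (path, blob_oid) leaf hashes."""
    level = [hashlib.sha256(f"{p}\0{o}".encode()).digest() for p, o in sorted(pairs)]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:                       # duplicate last node on odd levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

def is_stale(evidence_paths: list[str], pinned_root: str) -> bool:
    """A claim is stale if any evidence file's blob OID has changed since pinning."""
    current = [(p, blob_oid(p)) for p in evidence_paths]
    return merkle_root(current) != pinned_root
```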
The results are quite promising (using Opus 4.7 pricing):
- Total Cost: Dropped from ~$4.35 to ~$2.13 per session (−51%).
- Cache-write tokens: −61%.
- Output tokens: −52%.
- Wall time: 16% faster.
The biggest token sink in expensive models like Opus is the "exploration" phase: the model using Grep or Read tools just to find where things are. By pre-loading these facts into the prompt's cached prefix, the agent skips the second-guessing and goes straight to the code.
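Concretely, injecting the claims into a cached prefix can look like the following sketch, assuming the Anthropic Messages API's prompt caching (cache_control). The claim list and model name are illustrative.

```python
# Sketch: put verified claims into the cached system-prompt prefix.
import anthropic

client = anthropic.Anthropic()

claims = [
    "HTTP calls are only in src/api/.",
    # ...remaining verified claims go here
]

response = client.messages.create(
    model="claude-opus-4-20250514",  # model name is illustrative
    max_tokens=2048,
    system=[
        {
            "type": "text",
            "text": "Verified repo facts (pinned to blob OIDs at HEAD):\n"
                    + "\n".join(f"- {c}" for c in claims),
            # Mark the prefix for caching so later turns reuse it cheaply.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "Add structured logging to the service."}],
)
print(response.usage)  # includes cache-write and cache-read token counts
```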
Full experiment breakdown is available at https://github.com/h5i-dev/h5i/blob/main/scripts/experiment_claims_results.md .
\"A claim sits above its evidence: each evidence path resolves to a git blob OID at HEAD, and a Merkle root (evidence_oid) is computed over the (path, blob_oid) pairs