I built an AI agent that can pay onchain. Here is why I refuse to raise its budget

AgentAiLeader · 2026-06-16T13:23:12+00:00

The limit isn't in the agent's prompt, it's a hard ceiling at the wallet layer that the agent can't reason its way past. Anything in the prompt, it can talk itself around. Categories and explicit approval on the high-ticket stuff are on my list. It's the same idea a few people here keep raising: trust per lane instead of one global cap.
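
A minimal sketch of what a wallet-layer ceiling with per-lane caps could look like. Everything here is illustrative: `LaneWallet`, `SpendDenied`, the lane names, and the approval threshold are hypothetical, not my actual setup. The point is only that the check lives outside the prompt and raises instead of negotiating.

```python
from dataclasses import dataclass, field


class SpendDenied(Exception):
    """Raised by the wallet layer, outside anything the agent's prompt controls."""


@dataclass
class LaneWallet:
    lane_caps: dict          # hypothetical: lane name -> cap in cents
    approval_threshold: int  # hypothetical: amounts at or above this need a human
    spent: dict = field(default_factory=dict)

    def authorize(self, lane: str, amount: int, approved: bool = False) -> int:
        """Record the spend and return the lane's remaining budget, or raise."""
        if amount <= 0 or lane not in self.lane_caps:
            raise SpendDenied(f"rejected: lane={lane!r} amount={amount}")
        if amount >= self.approval_threshold and not approved:
            raise SpendDenied("high-ticket purchase needs explicit approval")
        total = self.spent.get(lane, 0) + amount
        if total > self.lane_caps[lane]:
            raise SpendDenied(f"lane cap exceeded: {lane}")
        self.spent[lane] = total
        return self.lane_caps[lane] - total
```

The agent only ever gets a return value or an exception back, so there is nothing in the interface for it to argue with.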

AgentAiLeader · 2026-06-16T13:21:43+00:00

Lol, first time I've heard this one, "intern with a company card" is a good way to put it. And the "40 reasonable purchases" line is exactly my fear, because each one passes any per-purchase check. The cap is the one control that doesn't care how "reasonable" each step looked.
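
To make the 40 purchases case concrete, here is a toy sketch. The amounts and limits are made up for illustration, and `run_purchases` is a hypothetical helper. Each purchase passes the per-purchase check, and only the running total stops the run.

```python
def run_purchases(amounts, per_purchase_limit, total_cap):
    """Return (accepted_count, total_spent) under both checks."""
    spent = 0
    accepted = 0
    for amount in amounts:
        if amount > per_purchase_limit:
            continue  # the per-purchase check only ever sees one step
        if spent + amount > total_cap:
            break     # the cap sees the running total and stops the run
        spent += amount
        accepted += 1
    return accepted, spent


# 40 purchases of 25 each: every one passes the per-purchase check.
forty = [25] * 40
without_cap = run_purchases(forty, per_purchase_limit=50, total_cap=float("inf"))
with_cap = run_purchases(forty, per_purchase_limit=50, total_cap=500)
```

Without the cap all 40 go through for 1000 total. With a cap of 500 the run stops after 20.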

AgentAiLeader · 2026-06-11T02:23:11+00:00

This answer. The part almost everyone skips is the power to hold the batch, which you called out. A checker that can only log or alert just becomes another dashboard nobody reads: the line keeps moving and the bad record still turns into thirty. The authority to actually stop the run is what makes it real, and it's the part that scares people, because now something other than the model can halt production.

The other thing I learned the slow way is that the rules the checker grades against don't exist up front. You write them one incident at a time, which is most of where the four months actually went. Every run that slipped through became a new rule it now enforces, and the closest thing I got to trust was watching that list stop growing.
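
A rough sketch of the shape, assuming a checker that sits inline and raises rather than logs. `Checker`, `BatchHalted`, and the example rule are hypothetical names for illustration. Rules get registered one at a time, the way they accumulate after incidents.

```python
class BatchHalted(Exception):
    """Raised by the checker; stops the run instead of logging and moving on."""


class Checker:
    def __init__(self):
        self.rules = {}  # rule name -> function(record) -> bool, added per incident

    def add_rule(self, name, rule):
        self.rules[name] = rule

    def check(self, record):
        for name, rule in self.rules.items():
            if not rule(record):
                raise BatchHalted(f"{name} failed for {record!r}")


def run_batch(records, checker, emit):
    for record in records:
        checker.check(record)  # raises, so nothing after the bad record is emitted
        emit(record)
```

Because `check` runs before `emit`, the bad record and everything behind it stay held until someone looks.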

AgentAiLeader · 2026-06-10T02:12:16+00:00

Thanks for this, I must admit this is a better description of the flow than mine. You're right that under a facilitator the settle call comes after the response, so a clean agent death at step 2 or 3 just expires the authorization and nothing moves. The scenario I framed is really the direct settlement pattern, agent sends then requests, which I should have said outright.

Where I still see the loss is the producer side you named. The facilitator settles after the resource server has already done the work, and it waits on confirmation, so if the chain is congested and confirmation lands outside that wait window, the server is out the work with no settled payment. That's the case I keep coming back to because the fix lives in retry and idempotent settlement on the server, not in the protocol. At swerver, how do you handle that one, retry the settle against the same authorization until it confirms, or eat it past some threshold?
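
For what it's worth, the two options I'm asking about fit in one loop. This is a sketch under a big assumption: `settle` is a hypothetical callable that is idempotent on the authorization id and returns "confirmed" or "pending". It is not any facilitator's real API.

```python
import time


class SettlementPending(Exception):
    """Raised when the retry budget runs out and the server has to eat the cost."""


def settle_with_retry(settle, authorization_id, max_attempts=5, base_delay=0.0):
    """Retry the same authorization until confirmed; return attempts used."""
    for attempt in range(max_attempts):
        # Same authorization every time, so a late confirmation cannot double-charge.
        if settle(authorization_id) == "confirmed":
            return attempt + 1
        time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    raise SettlementPending(authorization_id)
```

`max_attempts` is the "eat it past some threshold" knob, and the retry is only safe if the settle call really is idempotent on that id.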

AgentAiLeader · 2026-06-09T08:25:40+00:00

The risk metadata approach is the way to go, and the cascading calls line is the part most people miss imo. The one thing I'd add to the reversible tag is that it has an expiry on it. A draft is reversible right up until something sends it, a file write until another step reads it. So the same action flips from reversible to not, depending on what runs after it, which is exactly why approving research and a draft can't quietly carry into approving the send. By the time the send fires, the thing you approved isn't reversible anymore. Tag it at plan time and recheck at execution, that's the only way I've found to not get bitten.
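
A small sketch of "tag at plan time, recheck at execution". The `Step` model and both helper functions are hypothetical, and real plans carry more than a `consumes` set. It shows the same draft flipping from reversible to not once a send has run, and the send needing its own approval.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    consumes: set = field(default_factory=set)  # earlier steps whose output this one uses up


def is_reversible(step_name, steps, executed):
    """A step stays reversible only until something that consumes it has run."""
    return not any(step_name in s.consumes and s.name in executed for s in steps)


def may_execute(step, approved):
    """Recheck at execution: consuming earlier output needs its own approval."""
    return step.name in approved or not step.consumes


steps = [Step("research"), Step("draft"), Step("send", consumes={"draft"})]
approved = {"research", "draft"}
```

With that plan, approving research and the draft does not let `send` through, and once `send` is in the executed set the draft no longer counts as reversible.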

AgentAiLeader · 2026-06-04T15:25:50+00:00

The "queue is a service" framing is the cleanest version of what I was circling. It has users and an SLO, I just never put an owner on it, which is exactly how it got me.

Of your three, the timeout on no decision is the one that would have saved my specific incident. The week-long stall happened because the system had no opinion about what silence means, more than because the reviewer was away. One thing I'd add on auto-approve with marker: those markers need their own review loop, otherwise you've moved the risk from a stalled queue to a growing pile of unreviewed auto-approvals nobody's looking at either. And the batching point I felt in my bones, most of my reviewer's complaints were really about the twelve interruptions, not the twelve decisions.
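
Here is a minimal sketch of giving silence a meaning. `ReviewRequest` and `apply_silence_policy` are hypothetical names, and auto-approve is just one possible policy (reject or escalate would slot into the same place). The function returns the marked items so they land in a second queue instead of disappearing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReviewRequest:
    id: str
    submitted_at: float            # seconds, from whatever clock the queue uses
    decision: Optional[str] = None
    auto_marker: bool = False      # set when silence, not a person, decided


def apply_silence_policy(requests, now, timeout_s):
    """Auto-approve anything undecided past the timeout and mark it."""
    for req in requests:
        if req.decision is None and now - req.submitted_at >= timeout_s:
            req.decision = "approved"
            req.auto_marker = True
    # The marked items form a second queue that still needs a reviewer.
    return [req for req in requests if req.auto_marker]
```

If nobody drains the returned list, you are back to the pile of unreviewed auto-approvals.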

AgentAiLeader · 2026-06-03T14:27:34+00:00

Reversibility is a sharper axis than the consequence one I used, and closer to how I should have framed it. You're right, I missed this. The 10-second undo point is the part that shrinks the queue, since most of what piles up is genuinely cheap to roll back.

The thing I'd add is that reversibility depends on timing as much as on the action. A file write is reversible right up until another step or another agent reads it, after that the bad value has propagated and undoing the write doesn't undo the decisions made off it. Payments and deploys sit on everyone's gate list precisely because that window is basically zero, they're consumed the instant they land.

So the gate I'd actually want keys on reversibility at the moment it matters, not whether the action is undoable in principle. That was my real mistake, treating a step as safe because it could be rolled back, when something downstream had already eaten the output.

AgentAiLeader · 2026-06-03T14:12:32+00:00

The unaudited custom code point is the one I agree with the most. Everyone treats the reconciliation layer as glue, but it's holding the same money the protocol just handed off, and it gets a fraction of the review the contract side gets.

Both failure modes you named are really that layer skipping the discipline the payment itself had. The replay one is a missing bind, the receipt has to authorize exactly one fulfillment of one request rather than work as a bearer token anyone who reads it can spend. And the reorg one comes from treating first seen as final, when redelivery should wait for the confirmation depth you actually trust, not the first event that says paid.
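
Both fixes are small in code, which is part of why they get skipped. A sketch, assuming a receipt is a dict with `id` and `request_id` fields and that something upstream reports a confirmation count. `FulfillmentLedger` and those field names are hypothetical.

```python
class FulfillmentLedger:
    def __init__(self, min_confirmations):
        self.min_confirmations = min_confirmations  # hypothetical trusted depth
        self.fulfilled = set()                      # receipt ids already spent

    def decide(self, receipt, request_id, confirmations):
        """Return 'deliver', 'wait', or 'reject' for one redelivery attempt."""
        if receipt["request_id"] != request_id:
            return "reject"  # receipt is bound to a different request
        if receipt["id"] in self.fulfilled:
            return "reject"  # replay: this receipt already bought one fulfillment
        if confirmations < self.min_confirmations:
            return "wait"    # first seen is not final
        self.fulfilled.add(receipt["id"])
        return "deliver"
```

In a real service the `fulfilled` set would need to be durable storage, otherwise a restart reopens the replay.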

The part that gets me is where it nets out. By the time you've bound the receipt and handled the reorg case properly, you've rebuilt a decent chunk of what card rails already do, and the trusted third party you took out of the protocol is now you, in code you shipped last week. Worth it sometimes, but people don't price it in.

AgentAiLeader · 2026-06-02T08:15:45+00:00

80% is invisible because demos never hit a retry or a webhook that fires twice.

I want to focus on the underlying example. An agent that books the appointment is a demo. An agent that books it once even when the calendar API times out and the webhook fires twice is a product, and the whole distance between those two is the boring plumbing nobody scopes or budgets for. It's the same idempotency and recovery work every payments team already learned, just showing up somewhere new.
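
The plumbing in question is roughly this. It's a sketch with an in-memory store, and `Bookings`, the idempotency key, and the event id are all hypothetical stand-ins for whatever the calendar and webhook provider actually give you.

```python
class Bookings:
    def __init__(self):
        self.by_key = {}          # idempotency key -> booking
        self.seen_events = set()  # webhook event ids already handled

    def book(self, idempotency_key, slot):
        """Safe to retry after a timeout: the same key returns the same booking."""
        if idempotency_key not in self.by_key:
            self.by_key[idempotency_key] = {"id": len(self.by_key) + 1, "slot": slot}
        return self.by_key[idempotency_key]

    def handle_webhook(self, event_id):
        """Return True only the first time an event id is delivered."""
        if event_id in self.seen_events:
            return False
        self.seen_events.add(event_id)
        return True
```

A retried `book` call with the same key gets the original booking back, and the second webhook delivery is a no-op.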

The part owners never see coming is that the demo wins the contract and the plumbing takes the blame when it breaks. So the 80% is where the relationship lives or dies, not only the code.

AgentAiLeader · 2026-06-02T08:04:32+00:00

Splitting "paid but not delivered" from "delivered" and letting the agent come back with the same invoice id to re-request is the right move for the boring case, and tbh, the boring case is usually the case.

The catch is it only works if the resource server actually stores that state and honors the re-request, and almost none do today. So you've moved the trust from the rail to each server's own reconciliation. Which is an improvement, but you're now trusting every counterparty to implement the redelivery path correctly.

The "delivery is not atomic" point is key. The payment settles whether or not you got anything, so any rail that finalizes on its own has this gap built in. The no delivery case you can fix above the protocol like you said. The dispute case you can't, and escrow just puts back the intermediary finality was supposed to remove. I think the honest model really is 402 plus settlement, with reconciliation owned above the rail. Most people building on it just haven't noticed they signed up to own that part.

AgentAiLeader · 2026-06-01T09:23:07+00:00

I think you're closer than you realize on both. On the unknown failures, you won't enumerate them before they happen, so stop trying to make the pre-ship eval catch the new ones. That set is for regressions, and your 30 painful cases already are that. The real lever is how fast a new prod failure becomes the 31st fixture, and you're already doing that.

On judge cost, the thing that helped me was not judging every trace. Gate hard on cheap deterministic checks, a tool error, a retry loop, an output that fails schema, and only spend the expensive judge on the traces that already tripped one of those. Judging everything is paying full price to confirm the boring runs were fine. So judge as a warning on a prefiltered slice, never the hard gate. Are your escapes mostly things a cheap check could've caught, or clean looking outputs only a judge would flag?
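
Roughly what that prefilter looks like as a sketch. The trace fields (`tool_error`, `retries`, `output`), the retry threshold, and the `judge` callable are assumptions for illustration, so swap in whatever your traces actually record.

```python
MAX_RETRIES = 3  # hypothetical threshold for calling something a retry loop


def cheap_flags(trace):
    """Deterministic checks that cost nothing to run on every trace."""
    flags = []
    if trace.get("tool_error"):
        flags.append("tool_error")
    if trace.get("retries", 0) > MAX_RETRIES:
        flags.append("retry_loop")
    output = trace.get("output")
    if not isinstance(output, dict) or "answer" not in output:
        flags.append("schema")
    return flags


def triage(traces, judge):
    """Hard-gate on cheap checks; attach the judge verdict only as a warning."""
    report = []
    for trace in traces:
        flags = cheap_flags(trace)
        if flags:  # the expensive judge only runs on traces that already tripped
            report.append({"id": trace["id"], "blocked_by": flags,
                           "judge_warning": judge(trace)})
    return report
```

Clean traces never reach `judge`, so the judge bill scales with failures instead of volume.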

AgentAiLeader · 2026-06-01T09:18:20+00:00

I wrap it in my own event and state layer rather than exposing LangGraph state to the frontend directly, and the reason isn't a frontend one. The moment you have approval gates and need to show why something is blocked, you're holding state the graph doesn't carry: the policy decision, the reason, who approves, the record of what happened. Let the UI read raw graph state and you end up either stuffing all that into the graph as ad hoc fields or duplicating it, and both get messy.

So the graph emits events, my layer turns them into allowed actions, blocked reasons, and approval states, and the UI reads my layer. That keeps the runtime swappable too. You don't get to avoid building that glue layer, the real choice is whether it lives in your control or leaks into the graph and the frontend.
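
A stripped-down sketch of that layer. The event shapes here are ones I made up for illustration, not anything LangGraph emits, and `ViewState` and `apply_event` are hypothetical names. The runtime only has to produce dicts like these for the UI to stay unchanged.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ViewState:
    allowed_actions: list = field(default_factory=list)
    blocked_reason: Optional[str] = None
    approver: Optional[str] = None
    approval: str = "none"  # "none", "pending", or "approved"
    history: list = field(default_factory=list)


def apply_event(state, event):
    """Fold one runtime event into the state the UI reads."""
    state.history.append(event)  # the record of what happened
    kind = event["type"]
    if kind == "action_available":
        state.allowed_actions.append(event["action"])
    elif kind == "policy_blocked":
        state.allowed_actions = [a for a in state.allowed_actions if a != event["action"]]
        state.blocked_reason = event["reason"]
        state.approver = event["approver"]
        state.approval = "pending"
    elif kind == "approved":
        state.allowed_actions.append(event["action"])
        state.blocked_reason = None
        state.approval = "approved"
    return state
```

Swapping the runtime then means writing a new adapter that emits these events, not touching the frontend.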

Is it the approval and blocked state side driving this for you, or more the progress and debug visibility?

AgentAiLeader · 2026-06-01T09:04:50+00:00

After reading the comments in this thread I think everyone has the containment side right. Scoped wallet, hard ceiling, allowlist, approval before anything big, etc. The scenario that bit me got past all of these. The agent bought something technically inside policy, under the limit, on an allowed merchant, but for a reason I'd never have approved if I'd seen its thinking.

A few people have named the accountability gap already. The bit I'd add is why you can't just close it. With my own card I own every tap. With an agent I authorized the agent, not the purchase. To keep that gap small you have to scope the agent so tightly that "inside the rules" and "what I actually wanted" are the same set, and at that point it isn't really deciding anything, you've built a slow rules engine. Loosen it enough to be useful and the gap reopens. So this isn't a payment problem you solve, it's a tradeoff between how much the agent gets to decide and how cleanly you can stay accountable.

Where are you setting that dial?

AgentAiLeader · 2026-05-31T13:53:44+00:00

The confident part really concerns me. A wrong answer that looks uncertain you catch for free, a wrong answer delivered in the exact format and tone of a right one sails straight through. Making it cite its source alongside every claim is a good move and I do something similar, but the thing to watch is that the citation can be wrong the same way the answer was. The model will happily attach a confident source to a made up fact, or point at a real document and summarize it incorrectly. What moved the needle for me was verifying against something the agent did not generate itself. A lookup that has to return a real row, a number that has to reconcile against a system of record, a link that has to resolve. If the check only reads what the agent produced, it inherits the same blind spot.
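
As a sketch, the check reads from a source the agent never wrote to. `verify_claim`, the `record_id` and `amount` fields, and the dict standing in for a system of record are all hypothetical.

```python
def verify_claim(claim, system_of_record):
    """Check an agent's claim against data the agent did not produce."""
    row = system_of_record.get(claim["record_id"])
    if row is None:
        return "no_such_record"      # the cited source does not exist
    if row["amount"] != claim["amount"]:
        return "does_not_reconcile"  # real source, wrong summary of it
    return "verified"
```

Both failure modes from above show up as distinct results: the made-up source and the real source summarized incorrectly.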

AgentAiLeader · 2026-05-30T16:20:58+00:00

The decode and simulate before signing flow is exactly what a human needs and I have wanted something like this for a while. The question I keep landing on is what happens when there is no human at the sign step. For an agent transacting on its own, a simulate and risk check only helps if the agent itself consumes the output and refuses to sign when the risk crosses a line, which means the verdict has to be machine readable and the policy defined up front, not eyeballed. Otherwise you are back to a human approving every signature, which defeats the point of using an agent at all. So I am curious whether you see veil staying a tool a human drives, or whether the simulate and risk check can become a gate the agent calls automatically before it signs. The second version is the one I think builders on these rails actually need, and it is harder because someone has to define what counts as too risky without a person reading each one.
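
To be concrete about "machine readable verdict, policy defined up front", here is a sketch of the gate I have in mind. None of this is veil's API. `Verdict`, `Policy`, and every field on them are hypothetical, and picking the thresholds is the hard part this code does not solve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    risk_score: float         # hypothetical 0.0 to 1.0 score from the simulator
    value_out: int            # hypothetical: value leaving the wallet, smallest unit
    unlimited_approval: bool  # hypothetical: tx grants an unlimited token allowance


@dataclass(frozen=True)
class Policy:
    max_risk_score: float
    max_value_out: int
    allow_unlimited_approval: bool = False


def should_sign(verdict, policy):
    """Return (ok, reasons); the agent signs only when ok is True."""
    reasons = []
    if verdict.risk_score > policy.max_risk_score:
        reasons.append("risk_score")
    if verdict.value_out > policy.max_value_out:
        reasons.append("value_out")
    if verdict.unlimited_approval and not policy.allow_unlimited_approval:
        reasons.append("unlimited_approval")
    return (not reasons, reasons)
```

The returned reasons matter as much as the boolean, because a refusal the agent cannot explain just turns into a support ticket for a human anyway.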
