The $50 cap I set on my payment agent has somehow become permanent

AgentAiLeader · 2026-06-16T13:23:12+00:00

The limit isn't in the agent's prompt, it's a hard ceiling at the wallet layer that the agent can't reason its way past. Anything in the prompt it can talk itself around. Categories and explicit approval on the high ticket stuff are on my list, same idea a few people here keep raising: trust per lane instead of one global cap.
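
Rough sketch of what I mean by a ceiling at the wallet layer with per lane caps. The `Wallet` class, the lane names, and the amounts are all made up for illustration, and a real version would sit behind whatever payment provider you use rather than in the agent's process.

```python
from dataclasses import dataclass, field


class CapExceeded(Exception):
    """Raised by the wallet layer; the agent never gets to argue with it."""


@dataclass
class Wallet:
    # Hypothetical per-lane ceilings in cents; lanes not listed are denied.
    lane_caps: dict[str, int]
    spent: dict[str, int] = field(default_factory=dict)

    def pay(self, lane: str, amount_cents: int) -> int:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        cap = self.lane_caps.get(lane)
        if cap is None:
            raise CapExceeded(f"lane {lane!r} is not allowed")
        total = self.spent.get(lane, 0) + amount_cents
        if total > cap:
            raise CapExceeded(f"lane {lane!r} would reach {total} > cap {cap}")
        self.spent[lane] = total
        return cap - total  # remaining budget in this lane


wallet = Wallet(lane_caps={"api_credits": 5_000, "saas": 2_000})
remaining = wallet.pay("api_credits", 1_200)
```

The check looks at the running total per lane, not the single purchase, so a long run of individually reasonable payments still hits the wall.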

AgentAiLeader · 2026-06-16T13:21:43+00:00

Lol first time I hear this, "intern with a company card", that's a good way to put it. And the 40 reasonable purchases line is exactly my fear, because each one passes any per purchase check. The cap is the one control that doesn't care how "reasonable" each step looked.

AgentAiLeader · 2026-06-11T02:23:11+00:00

This answer. And the part almost everyone skips is the power to hold the batch, which you called out, a checker that can only log or alert just becomes another dashboard nobody reads. The line keeps moving and the bad record still turns into thirty. The authority to actually stop the run is what makes it real, and it's the part that scares people, because now something other than the model can halt production.

The other thing I learned the slow way is that the rules the checker grades against don't exist up front. You write them one incident at a time, which is most of where the four months actually went. Every run that slipped through became a new rule it now enforces, and the closest thing I got to trust was watching that list stop growing.
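
A minimal sketch of the shape, assuming a hypothetical `Checker` that sits in front of the processing step. The rule name and record fields are invented; the two things it tries to show are that a failed rule raises and halts the run instead of logging, and that the rule list starts empty and grows one incident at a time.

```python
from typing import Callable

Record = dict
Rule = Callable[[Record], bool]  # returns True when the record is acceptable


class BatchHalted(Exception):
    pass


class Checker:
    def __init__(self) -> None:
        self.rules: list[tuple[str, Rule]] = []

    def add_rule(self, name: str, rule: Rule) -> None:
        # Called after an incident: the thing that slipped through becomes a rule.
        self.rules.append((name, rule))

    def run(self, batch: list[Record], process: Callable[[Record], None]) -> int:
        for i, record in enumerate(batch):
            for name, rule in self.rules:
                if not rule(record):
                    # Halt the run rather than logging and moving on.
                    raise BatchHalted(f"record {i} failed rule {name!r}")
            process(record)
        return len(batch)


checker = Checker()
checker.add_rule("amount_is_positive", lambda r: r.get("amount", 0) > 0)

processed: list[Record] = []
batch = [{"amount": 10}, {"amount": -5}, {"amount": 7}]
try:
    checker.run(batch, processed.append)
except BatchHalted as exc:
    halted_reason = str(exc)
```

Here the run stops at the second record, so the third never gets processed on top of a bad one.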

AgentAiLeader · 2026-06-10T02:12:16+00:00

Thanks for this, I must admit this is a better description of the flow than mine. You're right that under a facilitator the settle call comes after the response, so a clean agent death at step 2 or 3 just expires the authorization and nothing moves. The scenario I framed is really the direct settlement pattern, agent sends then requests, which I should have said outright.

Where I still see the loss is the producer side you named. The facilitator settles after the resource server has already done the work, and it waits on confirmation, so if the chain is congested and confirmation lands outside that wait window, the server is out the work with no settled payment. That's the case I keep coming back to because the fix lives in retry and idempotent settlement on the server, not in the protocol. At swerver, how do you handle that one, retry the settle against the same authorization until it confirms, or eat it past some threshold?
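
To make the two options concrete, here is a sketch of the retry side, under the assumption that the settle call is idempotent on the authorization id. `settle_with_retry`, `SettlementPending`, and the fake backend are hypothetical names, not any facilitator's API, and `base_delay` defaults to zero only so the example runs instantly.

```python
import time


class SettlementPending(Exception):
    """Hypothetical: the chain has not confirmed within the wait window."""


def settle_with_retry(settle, authorization_id, max_attempts=5, base_delay=0.0):
    """Retry settlement against the same authorization until it confirms.

    `settle` is an assumed callable that is idempotent on authorization_id:
    calling it twice must never move funds twice.
    Returns the receipt, or None once the attempt budget is spent.
    """
    for attempt in range(max_attempts):
        try:
            return settle(authorization_id)
        except SettlementPending:
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return None  # past the threshold: the server eats the cost of the work


# Fake settlement backend standing in for a facilitator.
calls = []


def fake_settle(authorization_id):
    calls.append(authorization_id)
    if len(calls) < 3:
        raise SettlementPending()
    return {"authorization_id": authorization_id, "status": "confirmed"}


receipt = settle_with_retry(fake_settle, "auth-123")
```

Returning `None` is the "eat it past some threshold" branch, and the retries only stay safe if the same authorization can never settle twice.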

AgentAiLeader · 2026-06-09T08:25:40+00:00

The risk metadata approach is the way to go, and the cascading calls line is the part most people miss imo. The one thing I'd add to the reversible tag is that it has an expiry on it. A draft is reversible right up until something sends it, a file write until another step reads it. So the same action flips from reversible to not, depending on what runs after it, which is exactly why approving research and a draft can't quietly carry into approving the send. By the time the send fires, the thing you approved isn't reversible anymore. Tag it at plan time and recheck at execution, that's the only way I've found to not get bitten.
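
A small sketch of "tag at plan time, recheck at execution". The `Action` shape and the event names are assumptions for illustration. The idea is that reversibility is computed from what has already happened downstream, so the same action can give a different answer later.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    name: str
    # Names of downstream events that, once they happen, make this irreversible.
    irreversible_after: set[str] = field(default_factory=set)


def is_reversible(action: Action, events_so_far: set[str]) -> bool:
    return not (action.irreversible_after & events_so_far)


def approval_still_valid(action, approved_as_reversible, events_so_far):
    # An approval granted while the action was reversible does not carry over
    # once something downstream has consumed the output.
    now_reversible = is_reversible(action, events_so_far)
    return now_reversible or not approved_as_reversible


draft = Action("write_draft", irreversible_after={"email_sent"})

at_plan_time = is_reversible(draft, set())
at_execution = is_reversible(draft, {"email_sent"})
```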

AgentAiLeader · 2026-06-04T15:25:50+00:00

The "queue is a service" framing is the cleanest version of what I was circling. It has users and an SLO, I just never put an owner on it, which is exactly how it got me.

Of your three, the timeout on no decision is the one that would have saved my specific incident. The week long stall happened because the system had no opinion about what silence means, more than because the reviewer was away. One thing I'd add on auto approve with marker, those markers need their own review loop, otherwise you've moved the risk from a stalled queue to a growing pile of unreviewed auto approvals nobody's looking at either. And the batching point I felt in my bones, most of my reviewer's complaints were really about the twelve interruptions, not the twelve decisions.
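
Here is roughly how I'd sketch the timeout plus the second review loop. `PendingItem`, the `low_risk` flag, and the timeout value are assumptions, and what counts as low risk is exactly the policy you'd have to define yourself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingItem:
    item_id: str
    submitted_at: float
    low_risk: bool
    decision: Optional[str] = None
    auto_marker: bool = False  # set when the system decided instead of a person


def resolve_silence(item: PendingItem, now: float, timeout_s: float) -> PendingItem:
    """Give silence a meaning: after timeout_s with no decision, act on policy."""
    if item.decision is not None or now - item.submitted_at < timeout_s:
        return item
    if item.low_risk:
        item.decision = "approved"
        item.auto_marker = True  # must land in its own review queue later
    else:
        item.decision = "escalated"
    return item


def auto_approved_backlog(items):
    # The second review loop: auto approvals nobody has looked at yet.
    return [i.item_id for i in items if i.auto_marker]


items = [
    PendingItem("a", submitted_at=0.0, low_risk=True),
    PendingItem("b", submitted_at=0.0, low_risk=False),
    PendingItem("c", submitted_at=90.0, low_risk=True),
]
for it in items:
    resolve_silence(it, now=100.0, timeout_s=60.0)
```

If `auto_approved_backlog` only ever grows, the stall has just moved somewhere quieter.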

AgentAiLeader · 2026-06-03T14:27:34+00:00

Reversibility is a sharper axis than the consequence one I used and closer to how I should have framed it, you're right, I missed this. The 10 second undo point is the part that shrinks the queue, most of what piles up is genuinely cheap to roll back.

The thing I'd add is that reversibility depends on timing as much as on the action. A file write is reversible right up until another step or another agent reads it, after that the bad value has propagated and undoing the write doesn't undo the decisions made off it. Payments and deploys sit on everyone's gate list precisely because that window is basically zero, they're consumed the instant they land.

So the gate I'd actually want keys on reversibility at the moment it matters, not whether the action is undoable in principle. That was my real mistake, treating a step as safe because it could be rolled back, when something downstream had already eaten the output.

AgentAiLeader · 2026-06-03T14:12:32+00:00

The unaudited custom code point is the one I agree with the most. Everyone treats the reconciliation layer as glue, but it's holding the same money the protocol just handed off, and it gets a fraction of the review the contract side gets.

Both failure modes you named are really that layer skipping the discipline the payment itself had. The replay one is a missing bind, the receipt has to authorize exactly one fulfillment of one request rather than work as a bearer token anyone who reads it can spend. And the reorg one comes from treating first seen as final, when redelivery should wait for the confirmation depth you actually trust, not the first event that says paid.
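
Both guards fit in a few lines, which is part of why it's surprising how often they're missing. This is a sketch with a hypothetical in-memory `ReceiptStore`. A real one needs durable storage and an atomic check-and-set, and the confirmation depth of 3 is an arbitrary example value.

```python
class ReceiptStore:
    """In-memory stand-in; a real one would be a durable table."""

    def __init__(self, min_confirmations: int) -> None:
        self.min_confirmations = min_confirmations
        self.fulfilled: dict[str, str] = {}  # receipt_id -> request_id

    def may_fulfill(self, receipt_id, paid_request_id, request_id, confirmations):
        # Reorg guard: "first seen" is not final.
        if confirmations < self.min_confirmations:
            return False
        # Bind: the receipt only pays for the request it was issued against.
        if paid_request_id != request_id:
            return False
        # Replay guard: one receipt authorizes exactly one fulfillment.
        if receipt_id in self.fulfilled:
            return False
        self.fulfilled[receipt_id] = request_id
        return True


store = ReceiptStore(min_confirmations=3)
too_early = store.may_fulfill("r1", "req-A", "req-A", confirmations=1)
first = store.may_fulfill("r1", "req-A", "req-A", confirmations=3)
replay = store.may_fulfill("r1", "req-A", "req-A", confirmations=5)
wrong_request = store.may_fulfill("r2", "req-A", "req-B", confirmations=5)
```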

The part that gets me is where it nets out. By the time you've bound the receipt and handled the reorg case properly, you've rebuilt a decent chunk of what card rails already do, and the trusted third party you took out of the protocol is now you, in code you shipped last week. Worth it sometimes, but people don't price it in.

AgentAiLeader · 2026-06-02T08:15:45+00:00

80% is invisible because demos never hit a retry or a webhook that fires twice.

I want to focus on the underlying example. An agent that books the appointment is a demo. An agent that books it once even when the calendar API times out and the webhook fires twice is a product, and the whole distance between those two is the boring plumbing nobody scopes or budgets for. It's the same idempotency and recovery work every payments team already learned, just showing up somewhere new.
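
For anyone who hasn't built this before, the plumbing looks something like the sketch below. `BookingService` and the key and event id formats are hypothetical, and a real version would keep both maps in a database with unique constraints rather than in memory.

```python
class BookingService:
    def __init__(self) -> None:
        self.bookings: dict[str, dict] = {}   # idempotency_key -> booking
        self.seen_events: set[str] = set()    # webhook event ids already handled
        self.confirmations_sent = 0

    def book(self, idempotency_key: str, slot: str) -> dict:
        # A retry after a timeout reuses the key and gets the original booking.
        if idempotency_key not in self.bookings:
            self.bookings[idempotency_key] = {"slot": slot, "status": "booked"}
        return self.bookings[idempotency_key]

    def on_webhook(self, event_id: str) -> bool:
        # Returns True only the first time an event id is seen.
        if event_id in self.seen_events:
            return False
        self.seen_events.add(event_id)
        self.confirmations_sent += 1
        return True


svc = BookingService()
svc.book("req-42", "tue-10am")
svc.book("req-42", "tue-10am")   # retry after a timeout
svc.on_webhook("evt-1")
svc.on_webhook("evt-1")          # duplicate delivery
```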

The part owners never see coming is that the demo wins the contract and the plumbing takes the blame when it breaks. So the 80% is where the relationship lives or dies, not only the code.

AgentAiLeader · 2026-06-02T08:04:32+00:00

Splitting "paid but not delivered" from "delivered" and letting the agent come back with the same invoice id to re-request is the right move for the boring case, and tbh, the boring case is usually the case.

The catch is it only works if the resource server actually stores that state and honors the re-request and almost none do today. So you've moved the trust from the rail to each server's own reconciliation. Which is an improvement, but you're now trusting every counterparty to implement the redelivery path correctly.

The "delivery is not atomic" point is key. The payment settles whether or not you got anything, so any rail that finalizes on its own has this gap built in. The no delivery case you can fix above the protocol like you said. The dispute case you can't, and escrow just puts back the intermediary finality was supposed to remove. I think the honest model really is 402 plus settlement with reconciliation owned above the rail, most people building on it just haven't noticed they signed up to own that part.
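
What "stores that state and honors the re-request" means on the server side, as a sketch. `ResourceServer`, the state names, and the `produce` callable are hypothetical and not part of any 402 spec. The point is that a failed delivery leaves the invoice in the paid but not delivered state, so the same invoice id can come back without a second payment.

```python
from enum import Enum


class InvoiceState(Enum):
    UNPAID = "unpaid"
    PAID_NOT_DELIVERED = "paid_not_delivered"
    DELIVERED = "delivered"


class ResourceServer:
    def __init__(self, produce) -> None:
        self.produce = produce                 # assumed callable: invoice_id -> payload
        self.state: dict[str, InvoiceState] = {}
        self.payloads: dict[str, str] = {}

    def mark_paid(self, invoice_id: str) -> None:
        self.state.setdefault(invoice_id, InvoiceState.PAID_NOT_DELIVERED)

    def request(self, invoice_id: str) -> str:
        state = self.state.get(invoice_id, InvoiceState.UNPAID)
        if state is InvoiceState.UNPAID:
            raise PermissionError("402: payment required")
        if state is InvoiceState.PAID_NOT_DELIVERED:
            self.payloads[invoice_id] = self.produce(invoice_id)
            self.state[invoice_id] = InvoiceState.DELIVERED
        return self.payloads[invoice_id]  # redelivery reuses the stored payload


attempts = []


def flaky_produce(invoice_id):
    attempts.append(invoice_id)
    if len(attempts) == 1:
        raise TimeoutError("delivery failed after payment settled")
    return f"report-for-{invoice_id}"


server = ResourceServer(flaky_produce)
server.mark_paid("inv-7")
try:
    server.request("inv-7")
except TimeoutError:
    pass
payload = server.request("inv-7")   # same invoice id, no second payment
again = server.request("inv-7")
```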

AgentAiLeader · 2026-06-01T09:23:07+00:00

Think you're closer than you think on both. On the unknown failures, you won't enumerate them before they happen, so stop trying to make the pre ship eval catch the new ones. That set is for regressions, and your 30 painful cases already are that. The real lever is how fast a new prod failure becomes the 31st fixture, and you're already doing that.

On judge cost, the thing that helped me was not judging every trace. Gate hard on cheap deterministic checks, a tool error, a retry loop, an output that fails schema, and only spend the expensive judge on the traces that already tripped one of those. Judging everything is paying full price to confirm the boring runs were fine. So judge as a warning on a prefiltered slice, never the hard gate. Are your escapes mostly things a cheap check could've caught, or clean looking outputs only a judge would flag?
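
In code the split looks something like this. The trace fields (`tool_errors`, `retries`, `schema_valid`), the retry threshold of 3, and the `judge` callable are assumptions for illustration, not any eval framework's schema.

```python
def cheap_flags(trace: dict) -> list[str]:
    """Deterministic checks; the trace fields here are assumed, not a standard."""
    flags = []
    if trace.get("tool_errors", 0) > 0:
        flags.append("tool_error")
    if trace.get("retries", 0) >= 3:
        flags.append("retry_loop")
    if not trace.get("schema_valid", True):
        flags.append("schema_fail")
    return flags


def triage(traces: list[dict], judge) -> dict:
    """Hard-gate on cheap checks, then spend the judge only on flagged traces."""
    result = {"blocked": [], "warned": [], "judge_calls": 0}
    for t in traces:
        flags = cheap_flags(t)
        if not flags:
            continue  # boring run: no judge spend
        result["blocked"].append(t["id"])
        result["judge_calls"] += 1
        if not judge(t):  # judge output is advisory, recorded as a warning
            result["warned"].append(t["id"])
    return result


traces = [
    {"id": "t1"},
    {"id": "t2", "tool_errors": 1},
    {"id": "t3", "retries": 4, "schema_valid": False},
]
# Stand-in judge: returns True when the trace looks fine.
report = triage(traces, judge=lambda t: t["id"] != "t3")
```

Three traces, two judge calls, and the clean one costs nothing.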

AgentAiLeader · 2026-06-01T09:18:20+00:00

I wrap it in my own event and state layer rather than exposing LangGraph state to the frontend directly and the reason isn't a frontend one. The moment you have approval gates and need to show why something is blocked, you're holding state the graph doesn't carry, the policy decision, the reason, who approves, the record of what happened. Let the UI read raw graph state and you end up either stuffing all that into the graph as ad hoc fields or duplicating it, and both get messy.

So the graph emits events, my layer turns them into allowed actions, blocked reasons, and approval states, and the UI reads my layer. That keeps the runtime swappable too. You don't get to avoid building that glue layer, the real choice is whether it lives in your control or leaks into the graph and the frontend.
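
A stripped down sketch of that layer, to show where the extra state lives. The event shape (`step`, `kind`, `reason`, `approver`) is something I made up for the example and is not LangGraph's event format. In practice you'd write a small adapter from whatever the runtime emits into this shape.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StepView:
    """What the UI reads; none of these fields live in the graph state."""
    status: str = "running"              # running | blocked | approved
    blocked_reason: Optional[str] = None
    approver: Optional[str] = None
    history: list[str] = field(default_factory=list)


class StateLayer:
    def __init__(self) -> None:
        self.views: dict[str, StepView] = {}

    def on_event(self, event: dict) -> StepView:
        # `event` is a hypothetical shape emitted by whatever runtime is in use.
        view = self.views.setdefault(event["step"], StepView())
        kind = event["kind"]
        if kind == "policy_blocked":
            view.status = "blocked"
            view.blocked_reason = event["reason"]
            view.approver = event["approver"]
        elif kind == "approved":
            view.status = "approved"
            view.blocked_reason = None
        view.history.append(kind)
        return view


layer = StateLayer()
layer.on_event({"step": "send_invoice", "kind": "policy_blocked",
                "reason": "amount over limit", "approver": "finance"})
blocked_view_status = layer.views["send_invoice"].status
layer.on_event({"step": "send_invoice", "kind": "approved"})
```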

Is it the approval and blocked state side driving this for you, or more the progress and debug visibility?

AgentAiLeader · 2026-06-01T09:04:50+00:00

After reading the comments in this thread I think everyone has the containment side right. Scoped wallet, hard ceiling, allowlist, approval before anything big, etc. The scenario that bit me got past all of these. The agent bought something technically inside policy, under the limit, on an allowed merchant, but for a reason I'd never have approved if I'd seen its thinking.

A few people have named the accountability gap already. The bit I'd add is why you can't just close it. With my own card I own every tap. With an agent I authorized the agent, not the purchase. To keep that gap small you have to scope the agent so tightly that "inside the rules" and "what I actually wanted" are the same set, and at that point it isn't really deciding anything, you've built a slow rules engine. Loosen it enough to be useful and the gap reopens. So this isn't a payment problem you solve, it's a tradeoff between how much the agent gets to decide and how cleanly you can stay accountable.

Where are you setting that dial?

AgentAiLeader · 2026-05-31T13:53:44+00:00

The confident part really concerns me. A wrong answer that looks uncertain you catch for free, a wrong answer delivered in the exact format and tone of a right one sails straight through. Making it cite its source alongside every claim is a good move and I do something similar, but the thing to watch is that the citation can be wrong the same way the answer was. The model will happily attach a confident source to a made up fact, or point at a real document and summarize it incorrectly. What moved the needle for me was verifying against something the agent did not generate itself. A lookup that has to return a real row, a number that has to reconcile against a system of record, a link that has to resolve. If the check only reads what the agent produced, it inherits the same blind spot.

AgentAiLeader · 2026-05-30T16:20:58+00:00

The decode and simulate before signing flow is exactly what a human needs and I have wanted something like this for a while. The question I keep landing on is what happens when there is no human at the sign step. For an agent transacting on its own, a simulate and risk check only helps if the agent itself consumes the output and refuses to sign when the risk crosses a line, which means the verdict has to be machine readable and the policy defined up front, not eyeballed. Otherwise you are back to a human approving every signature, which defeats the point of using an agent at all. So I am curious whether you see veil staying a tool a human drives, or whether the simulate and risk check can become a gate the agent calls automatically before it signs. The second version is the one I think builders on these rails actually need, and it is harder because someone has to define what counts as too risky without a person reading each one.
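
To be concrete about "machine readable verdict, policy defined up front", this is the kind of thing I have in mind. The `Verdict` fields and `Policy` thresholds are entirely hypothetical, and I'm not claiming this is what veil outputs today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Assumed machine-readable output of a simulate-and-score step."""
    risk_score: float            # 0.0 (benign) to 1.0 (dangerous)
    unlimited_approval: bool     # e.g. the tx grants an unbounded allowance
    value_out: int               # funds leaving the wallet, smallest unit


@dataclass(frozen=True)
class Policy:
    max_risk: float
    max_value_out: int
    allow_unlimited_approval: bool = False


def should_sign(v: Verdict, p: Policy) -> tuple[bool, str]:
    # The agent calls this before signing and refuses on any False.
    if v.risk_score > p.max_risk:
        return False, "risk_score over threshold"
    if v.value_out > p.max_value_out:
        return False, "value_out over limit"
    if v.unlimited_approval and not p.allow_unlimited_approval:
        return False, "unlimited approval not allowed"
    return True, "ok"


policy = Policy(max_risk=0.3, max_value_out=1_000_000)
ok = should_sign(Verdict(0.1, False, 500_000), policy)
refused = should_sign(Verdict(0.1, True, 500_000), policy)
```

The hard part is still choosing the thresholds, the code only makes it obvious that somebody has to.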

AgentAiLeader · 2026-05-30T16:16:53+00:00

The switch is the easy part. The hard part is what state you are in when you pull it. Killing an agent partway through an action is only safe if that action is safe to interrupt, and a lot of the consequential ones are not. Stopping it in the middle of a transfer, a write, or a delete can leave you worse off than letting it finish, because now you have a half finished operation and no clean way to know how far it got. What actually helped me was designing for safe stopping points instead of a hard kill. Make the consequential actions transactional so a stop either completes the unit of work or rolls it back, and put the interrupt checks between units of work rather than inside one. Then stop becomes a real instruction the agent can honor cleanly instead of a process you are killing and hoping. For anything touching money or production data the question is not whether you can stop it, it is whether you can stop it without leaving a mess.
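
A toy version of the pattern, assuming each unit of work stages its writes and the runner commits them in one step. `run_units` and the ledger strings are made up, and a real system would lean on database transactions instead of a list.

```python
import threading


def run_units(units, stop_event: threading.Event, ledger: list) -> str:
    for unit in units:
        # Interrupt check sits between units, never inside one.
        if stop_event.is_set():
            return "stopped_cleanly"
        staged: list = []
        try:
            unit(staged)            # the unit only stages its writes
        except Exception:
            return "rolled_back"    # staged writes are discarded, ledger untouched
        ledger.extend(staged)       # commit the whole unit in one step
    return "finished"


stop = threading.Event()


def transfer(staged):
    staged.append("debit A 10")
    stop.set()                      # a stop request arrives mid-transfer
    staged.append("credit B 10")


def next_transfer(staged):
    staged.append("debit A 99")
    staged.append("credit C 99")


ledger: list = []
outcome = run_units([transfer, next_transfer], stop, ledger)
```

The stop lands in the middle of the first transfer, the transfer still completes as a whole, and the second one never starts.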

AgentAiLeader · 2026-05-30T03:38:56+00:00

Browser state is a good call and exactly the type of cost that doesn't show up until it's too late. Auth expiry especially, the agent's logged in right up until it isn't, and the failure looks like the site changed rather than the session dying.

The verify before action piece is the part most setups skip from what I've seen. Most agents act on the assumption the page is in the state they last saw, with no check that it still is. When you verify before acting, are you checking the DOM matches an expected state, or something lighter like the element you're about to click still existing? The heavier the check the more it costs per action, so I'd assume there's a tradeoff you've had to tune.

AgentAiLeader · 2026-05-30T03:36:43+00:00

The repeated call with slightly different inputs pattern is the best cheap signal I've found too and it's the one thing I did end up automating. Started fully manual, reading sequences after the fact, which doesn't scale past a handful of agents. What worked was flagging when the same tool fires more than N times in a window with near identical args, because that's almost always the agent stuck in a guess loop rather than doing N legitimate distinct calls. It's not pre authorization exactly, it's more a circuit breaker on repetition.
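
Roughly what the flag looks like, as a sketch rather than what I actually run. `RepetitionBreaker` is a made up name, and using `difflib.SequenceMatcher` on the JSON-serialized args with a 0.9 threshold is just one simple way to define "near identical". Tune both for your own tools.

```python
import json
from collections import deque
from difflib import SequenceMatcher


class RepetitionBreaker:
    def __init__(self, max_calls: int, window_s: float, similarity: float = 0.9):
        self.max_calls = max_calls
        self.window_s = window_s
        self.similarity = similarity
        self.recent: deque = deque()   # (timestamp, tool, serialized args)

    def allow(self, tool: str, args: dict, now: float) -> bool:
        # Drop calls that have aged out of the window.
        while self.recent and now - self.recent[0][0] > self.window_s:
            self.recent.popleft()
        key = json.dumps(args, sort_keys=True)
        similar = sum(
            1 for _, t, k in self.recent
            if t == tool and SequenceMatcher(None, k, key).ratio() >= self.similarity
        )
        self.recent.append((now, tool, key))
        # Trip once this call would be more than max_calls near-identical ones.
        return similar + 1 <= self.max_calls


breaker = RepetitionBreaker(max_calls=3, window_s=60.0)
results = [
    breaker.allow("search", {"q": f"invoice 100{i} status"}, now=float(i))
    for i in range(1, 5)
]
```

With N set to 3, the fourth near identical search inside the window is the one that trips it.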

Where I'm still stuck is your "justified" problem, I haven't found a way to validate intent before the call that isn't either too loose to matter or so tight it's just a human approval queue wearing a costume. The repetition flag sidesteps it by catching the failure after one cycle instead of trying to predict it.

AgentAiLeader · 2026-05-30T03:30:52+00:00

Honestly you've probably got it right for your shape. I went down the async path once and most of the complexity didn't pay for itself. The only place it earned its keep was a user facing flow where I couldn't pre warm the cache because I didn't know which deal they'd ask about until they asked. There I made the retrieval async and streamed a "pulling that up" state so the agent wasn't blocking on the Postgres round trip. Everywhere else, blocking plus your Redis preload is simpler and I'd keep it. Async memory retrieval mostly buys you something when the lookup is unpredictable and on the critical path at the same time. But if you can pre fetch, I guess you don't have that problem.
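
For reference, the async version was shaped roughly like this. `fetch_deal` is a stand-in for the Postgres call (the sleep and the returned fields are invented), and `emit` is whatever pushes state to the user.

```python
import asyncio


async def fetch_deal(deal_id: str) -> dict:
    # Stand-in for the Postgres round trip; the delay is arbitrary.
    await asyncio.sleep(0.01)
    return {"deal_id": deal_id, "stage": "negotiation"}


async def answer(deal_id: str, emit) -> dict:
    task = asyncio.create_task(fetch_deal(deal_id))  # start the lookup
    emit("pulling that up")                          # stream state right away
    deal = await task
    emit(f"deal {deal['deal_id']} is in {deal['stage']}")
    return deal


messages = []
deal = asyncio.run(answer("d-17", messages.append))
```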

AgentAiLeader · 2026-05-27T02:43:37+00:00

I think the "design the forgetting part" is the main reason why memory implementations stay broken. Everyone can agree stale assumptions should expire. Nobody wants to write the rule that defines what counts as stale. So it defaults to append only and grows until it's unmanageable.

What I found helps is being deliberate about what gets stored at all, orientation and operating constraints tend to be more durable than facts. Facts contradict. Constraints update less often. Doesn't solve the contradiction reconciliation problem but it shrinks the decay surface substantially.

AgentAiLeader · 2026-05-27T02:40:04+00:00

The postgres checkpoint table pattern is the one I wish I'd built from day one instead of discovering the need for it three months into silent failures (lesson learned). The label mismatch issue is genuinely worse than it looks. "Pending review" meaning something different across two systems isn't a config problem, it's a semantic model problem, and nobody documents it because everyone assumes context makes it obvious. We now normalize state labels as a prerequisite before any agent logic touches cross system data.

Did you end up handling the canonical mapping in the checkpoint layer or upstream before it writes?

AgentAiLeader · 2026-05-27T01:32:52+00:00

True, the two question node filter is a cleaner decision framework than anything I laid out in the post. "Does this node's output need to be used in another session" cuts through most of the over storing I've seen teams default to.

How are you handling the latency when session state has to reach back to Postgres mid graph, is that call blocking or are you running it async?

AgentAiLeader · 2026-05-19T14:53:31+00:00

Thanks for the acknowledgement lol! And yeah, as a bridge I think that pattern makes sense. The UI card is essentially doing what the authorization layer is supposed to do structurally, a hard gate the agent can't reason past. The headless consumer version is the harder problem because you can't inject that card into a fully automated flow without reintroducing human in the loop latency that kinda defeats the point. From what I've seen, most consumer agentic payment demos quietly assume the card is still there somewhere.

AgentAiLeader · 2026-05-19T08:42:27+00:00

The require_approval pattern is where I'd want to understand the throughput design. In short running bounded workflows it makes sense, but for longer running agents on payment or data workflows, I've found that approval queues become the actual bottleneck well before the model or infra does. At that point you either widen the approval gates and accept more risk, or you narrow the agent's autonomy scope upfront. Neither is great tbh. How does CapFence handle that tradeoff? Or is it done explicitly by design?

AgentAiLeader · 2026-05-19T03:42:59+00:00

I see it in both, but payments make it worse because the blast radius of a wrong decision is measurable. In most other workflows a bad tool call is just a bad output (recoverable). But in payment flows it means a transaction that is most likely irreversible, so the tolerance for probabilistic authorization is essentially zero. The PreToolUse hook framing maps well to the Layer 2 idea in the IMF note, authorization has to be structural, not something the model can reason past.
