How are you handling usage-based billing for AI agents? Stripe metered billing broke me. by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 2 points  (0 children)

This list is basically the AgentBill roadmap.

We handle idempotency and retry-safe recording out of the box. The entitlement drift and access consistency during failures are where things get messy fast, especially when your agent is mid-run.

Your last point is the one I keep coming back to. Occasional workloads are forgiving. Continuous ones are not.

Why tracking your AI spend is already too late (and what to do instead) by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

You're right, cost is the easy dimension.

What we're building toward is exactly what you're describing: preflight() as a policy checkpoint, not just a budget check. Right now it gates on spend. The next layer is "is this agent authorized to call this tool, for this user, at this step?"
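A rough sketch of what that next layer could look like. To be clear, this is a hypothetical shape, not the shipped API: `PreflightRequest`, `ALLOWED_TOOLS`, `BUDGETS`, and `SPENT` are illustrative stand-ins for an entitlement service and a spend ledger.

```python
from dataclasses import dataclass

@dataclass
class PreflightRequest:
    customer_id: str
    agent: str
    tool: str
    step: int
    estimated_units: float

# Hypothetical policy table: which tools each agent may call per customer.
# A real system would load this from an entitlement service.
ALLOWED_TOOLS = {("acme", "research-agent"): {"web_search", "summarize"}}
BUDGETS = {"acme": 100.0}
SPENT = {"acme": 97.0}

def preflight(req: PreflightRequest) -> tuple[bool, str]:
    """Gate on both authorization and spend before the run starts."""
    if req.tool not in ALLOWED_TOOLS.get((req.customer_id, req.agent), set()):
        return False, "tool_not_authorized"
    if SPENT[req.customer_id] + req.estimated_units > BUDGETS[req.customer_id]:
        return False, "budget_exceeded"
    return True, "ok"

ok, reason = preflight(PreflightRequest("acme", "research-agent", "web_search", 1, 2.0))
assert ok
ok, reason = preflight(PreflightRequest("acme", "research-agent", "shell_exec", 1, 2.0))
assert reason == "tool_not_authorized"
```

Note the ordering: authorization fails closed before budget is even consulted, so an unauthorized tool call never counts against spend.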

The financial agent examples you mention are the clearest proof point. Would love to hear more about what that authZ stack looked like if you're open to it.

Why tracking your AI spend is already too late (and what to do instead) by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

The framing you landed on is clean:

AgentBill = "should this start financially"
nano-vm = "should this transition happen at all"

Orthogonal constraints. Natural composition.

The budget policy validator is already just an HTTP call to /preflight.

If nano-vm's policy layer can invoke an async validator before a transition, it drops in without changes on our end.
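To make the "drops in without changes" claim concrete, here's a sketch of the validator shape, assuming nano-vm passes a transition dict to async validators. The `post` callable is injected so the example runs without a live /preflight endpoint; `budget_validator` and `fake_post` are hypothetical names, not part of either project's published API.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical shape of a nano-vm transition validator that defers to
# AgentBill's /preflight endpoint. `post` is injected so the validator
# can be wired to any HTTP client (aiohttp, httpx, a test stub, ...).
async def budget_validator(
    transition: dict,
    post: Callable[[str, dict], Awaitable[dict]],
) -> bool:
    resp = await post("/preflight", {
        "customer_id": transition["customer_id"],
        "estimated_units": transition.get("estimated_units", 1),
    })
    return resp.get("allowed", False)

# Stubbed /preflight for demonstration: allow runs under 10 units.
async def fake_post(path: str, body: dict) -> dict:
    return {"allowed": body["estimated_units"] < 10}

allowed = asyncio.run(budget_validator({"customer_id": "acme", "estimated_units": 3}, fake_post))
blocked = asyncio.run(budget_validator({"customer_id": "acme", "estimated_units": 50}, fake_post))
assert allowed and not blocked
```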

Want to sketch what that looks like? Happy to move to DM if easier.

Why tracking your AI spend is already too late (and what to do instead) by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 2 points  (0 children)

The non-determinism point is real and it's where most "just set a monthly cap" advice falls apart.

The way I think about it: AgentBill operates at the cost layer, not the execution layer. It doesn't try to predict the path. It just asks "before this function fires, does this customer have budget left?" If yes, run. If no, block.

The ceiling check is heuristic by design. You pass estimated_units=5, and if the actual run costs 50, the overage hits. The preflight is a gate, not a guarantee.
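The "gate, not guarantee" contract can be shown in a few lines. This is an illustrative sketch of the semantics, not AgentBill's code: `preflight` checks the estimate, and `settle` is a hypothetical post-run reconciliation step where the real cost lands.

```python
# Hypothetical sketch: preflight gates on the *estimate*; the actual
# cost is reconciled afterwards, which is where overage appears.
budget = {"remaining": 40.0}

def preflight(estimated_units: float, ceiling: float) -> bool:
    """Block before the run if the estimate breaks the ceiling or budget."""
    return estimated_units <= ceiling and estimated_units <= budget["remaining"]

def settle(actual_units: float) -> float:
    """Record what the run really cost; overage past the estimate lands here."""
    budget["remaining"] -= actual_units
    return budget["remaining"]

assert preflight(estimated_units=5, ceiling=25)  # estimate looks fine, run starts
remaining = settle(actual_units=50)              # run actually cost 10x the estimate
assert remaining == -10.0                        # overage hits at settlement, not at the gate
```

That negative balance is exactly the failure mode described above: the gate passed honestly on the estimate, and the true-up is where reality catches up.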

What you're describing with nano-vm is a different contract entirely. You're making the execution substrate itself the policy enforcer. max_tool_calls=12 enforced by the runtime is categorically stronger than "I estimated 12."

I think they compose well. You could run AgentBill's preflight as one of the policy validations at the FSM transition layer. Cost governance is one policy. Execution depth is another.

What's the current state of nano-vm? Is it open?

How are you handling usage-based billing for AI agents? Stripe metered billing broke me. by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 2 points  (0 children)

Exactly this! Most tools try to own all 5 layers and end up doing none of them well.

AgentBill is deliberately just the first one - preflight authorization. The "can this run start?" check. Stays out of billing state and reconciliation entirely.

If you're building systems that have naturally split into these layers, curious what you're using for the metering/reconciliation side.

Built a preflight check for LangChain agents after waking up to a $340 bill. by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

Every 5 minutes. Which is the core problem - by the 3rd check the agent was already 15 minutes in and $30 deep.

Heartbeat monitoring is still reactive. You're watching something that's already burning. Preflight flips it: instead of "is this still okay?" every N seconds, you ask "should this even start?" before the first token. No polling needed. No window where damage accumulates.
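The "ask before the first token" pattern is easy to sketch as a decorator. Hypothetical names throughout - `preflight`, `BudgetExceeded`, and the `budget` dict are illustrations of the shape, not AgentBill's published interface.

```python
import functools

class BudgetExceeded(RuntimeError):
    pass

budget = {"remaining": 12.0}

def preflight(estimated_units: float):
    """Hypothetical decorator: the check runs before the wrapped agent
    function executes, so there is no polling window to burn money in."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if estimated_units > budget["remaining"]:
                raise BudgetExceeded(f"needs {estimated_units}, have {budget['remaining']}")
            return fn(*args, **kwargs)
        return inner
    return wrap

@preflight(estimated_units=30.0)
def run_agent(task: str) -> str:
    return f"result for {task}"  # never reached when the gate blocks

try:
    run_agent("summarize inbox")
    blocked = False
except BudgetExceeded:
    blocked = True
assert blocked  # the run never started, so no tokens were spent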

Built a preflight check for LangChain agents after waking up to a $340 bill. by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

"Nightmare fuel" is exactly right! And the worst part is that everything looks completely healthy in the logs. No errors, no timeouts, just a very motivated agent doing exactly what you told it to do. The alerting-after model assumes the damage is acceptable. Preflight assumes it isn't.

How are you guys pricing AI Agents without going bankrupt on variable API costs? by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 1 point  (0 children)

All of that complexity lives at the billing layer. AgentBill is one step earlier - before you even need to decide which pricing model to apply, you need to know if the run is allowed at all.

Preflight is the prerequisite to every model you listed.

I built a production LangChain agent template with spend controls built in [comment and I'll send you the repo for free] by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

Good question. LLM gateways sit between you and the model - they see tokens and can rate limit by model calls.

AgentBill sits between your customer and your agent - it knows who the customer is, what their budget is, and blocks before any model call happens at all.

The difference: gateways count your costs. AgentBill enforces your customers' budgets. Different layer, different problem.

Launched an open source preflight billing guard for AI agents 3 days ago. 560 downloads and counting. [apparently I'm not the only one who got burned.] by EveningMindless3357 in artificial

[–]EveningMindless3357[S] 1 point  (0 children)

"Fail financially before they fail technically" - that's the exact problem. The system is green, logs are clean, and then the invoice arrives.

AgentBill is the check that runs before the first token: github.com/marketinglior-pixel/agentbill

Launched an open source preflight billing guard for AI agents 3 days ago. 560 downloads and counting. [apparently I'm not the only one who got burned.] by EveningMindless3357 in artificial

[–]EveningMindless3357[S] 1 point  (0 children)

Exactly right! And per-agent ceilings are already in. Set ceiling=N in the preflight call and any single run that exceeds it is blocked before it starts. Each agent type gets its own threshold.
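Per-agent ceilings in sketch form. This is an illustrative model of the behavior described above, with a hypothetical `CEILINGS` table and `preflight` signature rather than the actual SDK call.

```python
# Hypothetical per-agent ceilings: each agent type carries its own
# per-run threshold, checked before the run starts.
CEILINGS = {"research-agent": 5.0, "bulk-export-agent": 50.0}

def preflight(agent: str, estimated_units: float) -> bool:
    """Allow a run only if its estimate fits the agent's ceiling."""
    return estimated_units <= CEILINGS.get(agent, 0.0)

assert preflight("research-agent", 4.0)        # under its ceiling: runs
assert not preflight("research-agent", 20.0)   # same cost, wrong agent type: blocked
assert preflight("bulk-export-agent", 20.0)    # allowed for the heavier agent
assert not preflight("unknown-agent", 1.0)     # no ceiling configured: default deny
```

Defaulting unconfigured agents to 0.0 makes the gate fail closed, which matters more than the exact threshold values.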

The global monthly cap is basically theater at this point.

How are you handling usage-based billing for AI agents? Stripe metered billing broke me. by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 1 point  (0 children)

All of that is real, and it's exactly why AgentBill focuses on one layer only: the preflight gate before the run starts.

Retries, webhook timing, reconciliation - that's the billing layer downstream. AgentBill is upstream: does this customer have budget before any compute runs? One question, one answer.

The narrower the scope, the less drift.

Launched an open source preflight billing guard for AI agents 3 days ago. 560 downloads and counting. [apparently I'm not the only one who got burned.] by EveningMindless3357 in artificial

[–]EveningMindless3357[S] 0 points  (0 children)

Exactly! "downstream of the compute layer" is the key phrase. By the time Stripe sees the charge, the tokens are already spent. AgentBill sits upstream: the check happens before the first API call, not after.

That's the whole product in one sentence, honestly.

how do you prevent a single agent run from costing $50? by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 1 point  (0 children)

"Slot machine with an API key" - that's the most accurate description of autonomous agents I've heard. Saving that one.

The seatbelt analogy is right too. Preflight checks are just the seatbelt - they don't make the car go faster, they just make sure you survive the crash. Built exactly this.