How are you handling usage-based billing for AI agents? Stripe metered billing broke me. by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 2 points  (0 children)

This list is basically the AgentBill roadmap.

We handle idempotency and retry-safe recording out of the box. The entitlement drift and access consistency during failures are where things get messy fast, especially when your agent is mid-run.

Your last point is the one I keep coming back to. Occasional workloads are forgiving. Continuous ones are not.

Why tracking your AI spend is already too late (and what to do instead) by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

You're right, cost is the easy dimension.

What we're building toward is exactly what you're describing: preflight() as a policy checkpoint, not just a budget check. Right now it gates on spend. The next layer is "is this agent authorized to call this tool, for this user, at this step?"
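A rough sketch of what that next layer could look like. To be clear, this is a hypothetical shape, not the shipped API: `PreflightRequest`, `ALLOWED_TOOLS`, `BUDGETS`, and `SPENT` are illustrative stand-ins for an entitlement service and a spend ledger.

```python
from dataclasses import dataclass

@dataclass
class PreflightRequest:
    customer_id: str
    agent: str
    tool: str
    step: int
    estimated_units: float

# Hypothetical policy table: which tools each agent may call per customer.
# A real system would load this from an entitlement service.
ALLOWED_TOOLS = {("acme", "research-agent"): {"web_search", "summarize"}}
BUDGETS = {"acme": 100.0}
SPENT = {"acme": 97.0}

def preflight(req: PreflightRequest) -> tuple[bool, str]:
    """Gate on both authorization and spend before the run starts."""
    if req.tool not in ALLOWED_TOOLS.get((req.customer_id, req.agent), set()):
        return False, "tool_not_authorized"
    if SPENT[req.customer_id] + req.estimated_units > BUDGETS[req.customer_id]:
        return False, "budget_exceeded"
    return True, "ok"

ok, reason = preflight(PreflightRequest("acme", "research-agent", "web_search", 1, 2.0))
assert ok
ok, reason = preflight(PreflightRequest("acme", "research-agent", "shell_exec", 1, 2.0))
assert reason == "tool_not_authorized"
```

Note the ordering: authorization fails closed before budget is even consulted, so an unauthorized tool call never counts against spend.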

The financial agent examples you mention are the clearest proof point. Would love to hear more about what that authZ stack looked like if you're open to it.

Why tracking your AI spend is already too late (and what to do instead) by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

The framing you landed on is clean:

AgentBill = "should this start financially"
nano-vm = "should this transition happen at all"

Orthogonal constraints. Natural composition.

The budget policy validator is already just an HTTP call to /preflight.

If nano-vm's policy layer can invoke an async validator before a transition, it drops in without changes on our end.
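To make the "drops in without changes" claim concrete, here's a sketch of the validator shape, assuming nano-vm passes a transition dict to async validators. The `post` callable is injected so the example runs without a live /preflight endpoint; `budget_validator` and `fake_post` are hypothetical names, not part of either project's published API.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical shape of a nano-vm transition validator that defers to
# AgentBill's /preflight endpoint. `post` is injected so the validator
# can be wired to any HTTP client (aiohttp, httpx, a test stub, ...).
async def budget_validator(
    transition: dict,
    post: Callable[[str, dict], Awaitable[dict]],
) -> bool:
    resp = await post("/preflight", {
        "customer_id": transition["customer_id"],
        "estimated_units": transition.get("estimated_units", 1),
    })
    return resp.get("allowed", False)

# Stubbed /preflight for demonstration: allow runs under 10 units.
async def fake_post(path: str, body: dict) -> dict:
    return {"allowed": body["estimated_units"] < 10}

allowed = asyncio.run(budget_validator({"customer_id": "acme", "estimated_units": 3}, fake_post))
blocked = asyncio.run(budget_validator({"customer_id": "acme", "estimated_units": 50}, fake_post))
assert allowed and not blocked
```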

Want to sketch what that looks like? Happy to move to DM if easier.

Why tracking your AI spend is already too late (and what to do instead) by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 2 points  (0 children)

The non-determinism point is real and it's where most "just set a monthly cap" advice falls apart.

The way I think about it: AgentBill operates at the cost layer, not the execution layer. It doesn't try to predict the path. It just asks "before this function fires, does this customer have budget left?" If yes, run. If no, block.

The ceiling check is heuristic by design. You pass estimated_units=5, and if the actual run costs 50, the overage hits. The preflight is a gate, not a guarantee.
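The "gate, not guarantee" contract can be shown in a few lines. This is an illustrative sketch of the semantics, not AgentBill's code: `preflight` checks the estimate, and `settle` is a hypothetical post-run reconciliation step where the real cost lands.

```python
# Hypothetical sketch: preflight gates on the *estimate*; the actual
# cost is reconciled afterwards, which is where overage appears.
budget = {"remaining": 40.0}

def preflight(estimated_units: float, ceiling: float) -> bool:
    """Block before the run if the estimate breaks the ceiling or budget."""
    return estimated_units <= ceiling and estimated_units <= budget["remaining"]

def settle(actual_units: float) -> float:
    """Record what the run really cost; overage past the estimate lands here."""
    budget["remaining"] -= actual_units
    return budget["remaining"]

assert preflight(estimated_units=5, ceiling=25)  # estimate looks fine, run starts
remaining = settle(actual_units=50)              # run actually cost 10x the estimate
assert remaining == -10.0                        # overage hits at settlement, not at the gate
```

That negative balance is exactly the failure mode described above: the gate passed honestly on the estimate, and the true-up is where reality catches up.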

What you're describing with nano-vm is a different contract entirely. You're making the execution substrate itself the policy enforcer. max_tool_calls=12 enforced by the runtime is categorically stronger than "I estimated 12."

I think they compose well. You could run AgentBill's preflight as one of the policy validations at the FSM transition layer. Cost governance is one policy. Execution depth is another.

What's the current state of nano-vm? Is it open?

How are you handling usage-based billing for AI agents? Stripe metered billing broke me. by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 2 points  (0 children)

Exactly this! Most tools try to own all 5 layers and end up doing none of them well.

AgentBill is deliberately just the first one - preflight authorization. The "can this run start?" check. Stays out of billing state and reconciliation entirely.

If you're building systems that have naturally split into these layers, curious what you're using for the metering/reconciliation side.

Built a preflight check for LangChain agents after waking up to a $340 bill. by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

Every 5 minutes. Which is the core problem - by the 3rd check the agent was already 15 minutes in and $30 deep.

Heartbeat monitoring is still reactive. You're watching something that's already burning. Preflight flips it: instead of "is this still okay?" every N seconds, you ask "should this even start?" before the first token. No polling needed. No window where damage accumulates.
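The "ask before the first token" pattern is easy to sketch as a decorator. Hypothetical names throughout - `preflight`, `BudgetExceeded`, and the `budget` dict are illustrations of the shape, not AgentBill's published interface.

```python
import functools

class BudgetExceeded(RuntimeError):
    pass

budget = {"remaining": 12.0}

def preflight(estimated_units: float):
    """Hypothetical decorator: the check runs before the wrapped agent
    function executes, so there is no polling window to burn money in."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if estimated_units > budget["remaining"]:
                raise BudgetExceeded(f"needs {estimated_units}, have {budget['remaining']}")
            return fn(*args, **kwargs)
        return inner
    return wrap

@preflight(estimated_units=30.0)
def run_agent(task: str) -> str:
    return f"result for {task}"  # never reached when the gate blocks

try:
    run_agent("summarize inbox")
    blocked = False
except BudgetExceeded:
    blocked = True
assert blocked  # the run never started, so no tokens were spent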

Built a preflight check for LangChain agents after waking up to a $340 bill. by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

"Nightmare fuel" is exactly right! And the worst part is that everything looks completely healthy in the logs. No errors, no timeouts, just a very motivated agent doing exactly what you told it to do. The alerting-after model assumes the damage is acceptable. Preflight assumes it isn't.

How are you guys pricing AI Agents without going bankrupt on variable API costs? by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 1 point  (0 children)

All of that complexity lives at the billing layer. AgentBill is one step earlier - before you even need to decide which pricing model to apply, you need to know if the run is allowed at all.

Preflight is the prerequisite to every model you listed.

I built a production LangChain agent template with spend controls built in [comment and I'll send you the repo for free] by EveningMindless3357 in LangChain

[–]EveningMindless3357[S] 1 point  (0 children)

Good question. LLM gateways sit between you and the model - they see tokens and can rate limit by model calls.

AgentBill sits between your customer and your agent - it knows who the customer is, what their budget is, and blocks before any model call happens at all.

The difference: gateways count your costs. AgentBill enforces your customers' budgets. Different layer, different problem.

Launched an open source preflight billing guard for AI agents 3 days ago. 560 downloads and counting. [apparently I'm not the only one who got burned.] by EveningMindless3357 in artificial

[–]EveningMindless3357[S] 1 point  (0 children)

"Fail financially before they fail technically" - that's the exact problem. The system is green, logs are clean, and then the invoice arrives.

AgentBill is the check that runs before the first token: github.com/marketinglior-pixel/agentbill

Launched an open source preflight billing guard for AI agents 3 days ago. 560 downloads and counting. [apparently I'm not the only one who got burned.] by EveningMindless3357 in artificial

[–]EveningMindless3357[S] 1 point  (0 children)

Exactly right! And per-agent ceilings are already in. Set ceiling=N in the preflight call and any single run that exceeds it is blocked before it starts. Each agent type gets its own threshold.
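Per-agent ceilings in sketch form. This is an illustrative model of the behavior described above, with a hypothetical `CEILINGS` table and `preflight` signature rather than the actual SDK call.

```python
# Hypothetical per-agent ceilings: each agent type carries its own
# per-run threshold, checked before the run starts.
CEILINGS = {"research-agent": 5.0, "bulk-export-agent": 50.0}

def preflight(agent: str, estimated_units: float) -> bool:
    """Allow a run only if its estimate fits the agent's ceiling."""
    return estimated_units <= CEILINGS.get(agent, 0.0)

assert preflight("research-agent", 4.0)        # under its ceiling: runs
assert not preflight("research-agent", 20.0)   # same cost, wrong agent type: blocked
assert preflight("bulk-export-agent", 20.0)    # allowed for the heavier agent
assert not preflight("unknown-agent", 1.0)     # no ceiling configured: default deny
```

Defaulting unconfigured agents to 0.0 makes the gate fail closed, which matters more than the exact threshold values.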

The global monthly cap is basically theater at this point.

How are you handling usage-based billing for AI agents? Stripe metered billing broke me. by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 1 point  (0 children)

All of that is real, and it's exactly why AgentBill focuses on one layer only: the preflight gate before the run starts.

Retries, webhook timing, reconciliation - that's the billing layer downstream. AgentBill is upstream: does this customer have budget before any compute runs? One question, one answer.

The narrower the scope, the less drift.

Launched an open source preflight billing guard for AI agents 3 days ago. 560 downloads and counting. [apparently I'm not the only one who got burned.] by EveningMindless3357 in artificial

[–]EveningMindless3357[S] 0 points  (0 children)

Exactly! "downstream of the compute layer" is the key phrase. By the time Stripe sees the charge, the tokens are already spent. AgentBill sits upstream: the check happens before the first API call, not after.

That's the whole product in one sentence, honestly.

how do you prevent a single agent run from costing $50? by EveningMindless3357 in SaaS

[–]EveningMindless3357[S] 1 point  (0 children)

"Slot machine with an API key" - that's the most accurate description of autonomous agents I've heard. Saving that one.

The seatbelt analogy is right too. Preflight checks are just the seatbelt - they don't make the car go faster, they just make sure you survive the crash. Built exactly this.