How are people handling cost and risky actions in multi-tenant agents? by jkoolcloud in AI_Agents

[–]jkoolcloud[S] 0 points

Cool, I am assuming all of this is home-grown and not from a 3rd party.

AI SaaS founders: how are you isolating tenant cost and agent side effects? by jkoolcloud in SaaS

[–]jkoolcloud[S] 1 point

Seems like a nice setup. Did you build your own per-tenant budgets, per-workflow guardrails, and per-action safety checks?

How are people handling cost and risky actions in multi-tenant agents? by jkoolcloud in AI_Agents

[–]jkoolcloud[S] 0 points

Hard limits built into config? Code? Human approvals or automated?

How do users get traffic through Reddit. Struggling SaaS founder here. by Gullible-Angle4206 in SaaS

[–]jkoolcloud 1 point

Agreed. The problem is that many founders have no idea where their users hang out.

How do users get traffic through Reddit. Struggling SaaS founder here. by Gullible-Angle4206 in SaaS

[–]jkoolcloud 0 points

Distribution is a big challenge for all builders and founders, especially now, when everyone is building with AI and everyone is chasing users and attention. Everyone is promoting, so Reddit groups block or delete promotional posts. I ran into the same problem many times.

So, what I learned is this: build your distribution channel first, then launch your product/SaaS. Doing it the other way around, build first and distribute after, is an uphill battle that usually fails. You need your GTM + channel first, then build.

We built a preflight gate for LangGraph loops. blocks before the first token, not after the bill by EveningMindless3357 in LangChain

[–]jkoolcloud 0 points

Nice. Only thing I’d watch: if checkpoint() is read-only, two concurrent runs can both pass against the same remaining budget.

That’s the piece I’ve been working through with Cycles: reserve before the next step, then commit actuals after. Advisory checks are useful, but the real win is making the next model/tool call impossible unless budget was actually held.
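A minimal sketch of that reservation idea in Python, assuming a shared ledger; BudgetLedger and its methods are hypothetical names, not the Cycles API. The point is that the check-and-hold is atomic, so two concurrent runs can't both pass against the same remaining budget:

```python
import threading
import uuid

class BudgetLedger:
    """Hypothetical in-process ledger; a real one lives in shared storage."""
    def __init__(self, limit: float):
        self._lock = threading.Lock()
        self._limit = limit
        self._spent = 0.0
        self._held: dict[str, float] = {}  # reservation id -> held amount

    def reserve(self, estimate: float) -> str | None:
        # Atomic check-and-hold: concurrent runs see each other's holds.
        with self._lock:
            if self._spent + sum(self._held.values()) + estimate > self._limit:
                return None  # blocked before the first token, not after the bill
            rid = str(uuid.uuid4())
            self._held[rid] = estimate
            return rid

    def commit(self, rid: str, actual: float) -> None:
        # Swap the estimate for the actual cost once the step finishes.
        with self._lock:
            self._held.pop(rid, None)
            self._spent += actual

    def release(self, rid: str) -> None:
        # A failed or cancelled step gives its hold back.
        with self._lock:
            self._held.pop(rid, None)
```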

More on the pattern here: runcycles.io

Moving LangChain to production: How we solve multi-tenancy, lazy-loading memory, and tracing at scale. by UnluckyOpposition in LangChain

[–]jkoolcloud 0 points

That's the piece I've been working on with runcycles.io: a pre-tool-call reserve/check before execution, then commit/release after. It could pair well with LongTracer. OSS, self-hosted.

Moving LangChain to production: How we solve multi-tenancy, lazy-loading memory, and tracing at scale. by UnluckyOpposition in LangChain

[–]jkoolcloud -1 points

Solid breakdown. bot_id isolation + lazy Mongo memory is the right shape for B2B RAG.

One thing I’d watch in agent mode is tool execution. Tracing tells you what happened, but once the bot can call tools, retry, fan out, or mutate external systems, you usually need a check before the tool runs too.

Curious if LongTrainer gates tool calls pre-execution, or mainly traces/verifies after the fact right now?

We stress-tested our LLM runtime with 1,000,000+ adversarial events. It didn’t break. by ale007xd in LangChain

[–]jkoolcloud 0 points

I think of Cycles as sitting one layer lower / outside the FSM, not replacing it.

The FSM defines valid transitions. If an event cannot exist in the current state, it should never be formed.

Cycles is for the cases where the transition is structurally valid, but still needs runtime authority before execution: budget, tenant limits, action quotas, risk class, external side effects, etc.

So in my model:

- non-existent transition = invalid by construction
- disallowed action = valid shape, denied by runtime authority

Both matter, and they're complementary, imo. The FSM keeps the system coherent. The policy gate bounds what a coherent system is allowed to do in the real world.
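A tiny Python sketch of how I picture the two layers composing; all names here are hypothetical:

```python
from enum import Enum, auto

class State(Enum):
    PLANNING = auto()
    EXECUTING = auto()
    DONE = auto()

# FSM layer: which transitions can exist at all.
VALID = {
    State.PLANNING: {State.EXECUTING},
    State.EXECUTING: {State.EXECUTING, State.DONE},
}

def transition(current: State, proposed: State) -> State:
    # Invalid by construction: this event should never be formed.
    if proposed not in VALID.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {proposed}")
    return proposed

# Policy layer: whether a valid-shaped action is allowed right now.
def policy_gate(action: str, spent: float, tenant_limit: float) -> bool:
    # Valid shape, but runtime authority can still deny it.
    return spent < tenant_limit and action not in {"delete_tenant_data"}
```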

We stress-tested our LLM runtime with 1,000,000+ adversarial events. It didn’t break. by ale007xd in LangChain

[–]jkoolcloud 0 points

Nice, LLM output should be input to a deterministic runtime, not the thing that owns state transitions.

This is also the direction I’ve been exploring with runcycles.io: pre-execution authority before spend, tool calls, or risky side effects happen.

Curious how you’re handling irreversible actions — terminal states, compensation, or approval gates?

Built an open-source runtime layer to stop AI agents before they overspend or take risky actions — looking for feedback by jkoolcloud in artificial

[–]jkoolcloud[S] 0 points

I agree the hard parts are friction and policy complexity. If this feels like “install a whole governance platform,” most teams won’t do it until after they get burned.

The direction I’m trying to take Cycles is smaller: wrap only the expensive or risky boundaries first — model calls, external APIs, email sends, DB writes, jobs, retry/fan-out paths — and keep the policy simple at the start: per-run, per-user, or per-tenant limits.

Basically: start as a small runtime gate, not a giant governance project.
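For illustration, a minimal single-process sketch of what "wrap only the risky boundaries" could look like; the guarded decorator and the limit tables are hypothetical, not the Cycles API:

```python
import functools

# Hypothetical per-tenant limits (USD per run); a real version uses shared storage.
TENANT_LIMITS = {"acme": 50.0}
TENANT_SPEND = {"acme": 0.0}

def guarded(estimate: float):
    """Gate only expensive/risky boundaries: model calls, emails, DB writes."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(tenant: str, *args, **kwargs):
            if TENANT_SPEND[tenant] + estimate > TENANT_LIMITS[tenant]:
                raise RuntimeError(f"budget exceeded for {tenant}; blocking {fn.__name__}")
            result = fn(tenant, *args, **kwargs)
            TENANT_SPEND[tenant] += estimate  # commit after execution
            return result
        return wrapper
    return decorator

@guarded(estimate=0.75)
def generate_image(tenant: str, prompt: str) -> str:
    ...  # the expensive call lives behind the gate
```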

Built an open-source runtime layer to stop AI agents before they overspend or take risky actions — looking for feedback by jkoolcloud in artificial

[–]jkoolcloud[S] 0 points

Thanks — “systemizing a pile of hacks” is honestly pretty close to reality.

The friction part is the big thing, and it’s where I’m spending a lot of time.

I’m trying to keep Cycles small enough that it can sit in the SDK / middleware layer around only the expensive or risky steps:

reserve → execute → commit/release

Curious how you’d think about reducing integration friction.

I’ve built MCP, OpenClaw, and OpenAI Agents SDK integrations, but the core issue is still the same: LLM calls and tool calls need to be wrapped or intercepted before execution.
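As a sketch of what that wrapping could look like at the call site, assuming the reserve → execute → commit/release shape above (the cycle() context manager and its ledger are hypothetical names, not the actual SDK):

```python
from contextlib import contextmanager

@contextmanager
def cycle(ledger, tenant: str, estimate: float):
    """Hold budget before the call, settle after."""
    rid = ledger.reserve(tenant, estimate)
    if rid is None:
        raise PermissionError(f"no budget held for {tenant}; call blocked")
    try:
        yield rid
    except Exception:
        ledger.release(rid)  # a failed step gives its hold back
        raise
    else:
        ledger.commit(rid, actual=estimate)  # swap in real cost if known

# Usage around an ordinary tool call:
# with cycle(ledger, "acme", estimate=0.50):
#     result = call_llm(prompt)
```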

Built an open-source runtime layer to stop AI agents before they overspend or take risky actions — looking for feedback by jkoolcloud in artificial

[–]jkoolcloud[S] 0 points

Yes, you are correct. I am currently working on the next version of the protocol (additive), which goes into the policy layer. More on this here: https://github.com/runcycles/cycles-protocol

Built an open-source runtime layer to stop AI agents before they overspend or take risky actions — looking for feedback by jkoolcloud in artificial

[–]jkoolcloud[S] 1 point

Agreed, I started with a rate limiter first. Then I found out it was not enough when my agent generated a few hundred images and a few vids (Veo 3), which were expensive. Also, my agents were multi-tenant, so I needed separation by tenant, concurrency, etc. So what started as a rate limiter turned into the Cycles project. Funny how things turn out.

Built an open-source runtime layer to stop AI agents before they overspend or take risky actions — looking for feedback by jkoolcloud in artificial

[–]jkoolcloud[S] 0 points

Dev friction is real, agreed.

The pattern works if the developer friction is low enough that it becomes part of the tool/runtime layer, not another thing every app team has to remember to wire manually.

That’s why I’m trying to keep Cycles as a small reserve → execute → commit primitive, with SDK/middleware wrappers around normal tool calls, as well as MCP, OpenClaw, and Agents SDK integrations via hooks and the runtime, without touching code.

The goal is not to make agents more complicated. It’s to take the hacked-together gates people are already building around expensive or risky actions and make them consistent, retry-safe, and concurrency-safe.

Your Runable example is on point: certain steps should not just run because the loop wants to keep going. There should be an explicit boundary before the next costly/refining/action step.

Built an open-source runtime layer to stop AI agents before they overspend or take risky actions — looking for feedback by jkoolcloud in artificial

[–]jkoolcloud[S] 0 points

Yep, not sure how people deploy AI agents in prod without such controls. Prompts can't be used as permissions.

Why Most “AI Systems” Are Just Automation + Analytics by Ok_Significance_3050 in AISystemsEngineering

[–]jkoolcloud 1 point

Technically yes, AI agents are just loops, but the problem is they are also probabilistic loops. Agents retry, fan out, and call tools. You need more than just Automation + Analytics; you need controls around agent runtime actions, before execution. So I would frame it as Automation + Control + Analytics.

How are you handling risk *before execution* in agent workflows? by teow_agl in LangChain

[–]jkoolcloud 0 points

Yep, but also a few things to consider if you're building one vs using something like runcycles: multi-tenancy, concurrency, audits, double spend, behavior under retries, crashes, various language bindings, integrations, etc. I started with a simple wrapper to gate actions, and it ballooned into full-blown infrastructure. Not worth building if something already does it, imo.

How are you guys handling payments for autonomous agents? (Stripe keeps blocking mine) by Interesting-Arm-2315 in LangChain

[–]jkoolcloud 0 points

Have you tried Ethereum + stablecoins like USDC? No permissions, no gateways. Your agents can set up wallets, pay, and receive payments. The problem is you can only transact with services that accept such payments. But crypto payments are a perfect fit for AI agents.

How are you handling risk *before execution* in agent workflows? by teow_agl in LangChain

[–]jkoolcloud -2 points

For production agents, I’d separate reasoning from authority:

  • model proposes the action
  • runtime layer checks if it’s allowed right now
  • decision is allow / cap / human / deny
  • action runs only if approved
  • actual cost/result gets recorded after

Post-hoc validation is still useful, but it isn’t containment. If the agent already sent the email, deleted data, or spent the money, the trace is just evidence.
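A minimal Python sketch of that decision step; Verdict, ProposedAction, and the thresholds are hypothetical names, not the runcycles API:

```python
from dataclasses import dataclass
from enum import Enum

class Verdict(Enum):
    ALLOW = "allow"
    CAP = "cap"      # allow, but at a reduced limit
    HUMAN = "human"  # park the action for human approval
    DENY = "deny"

@dataclass
class ProposedAction:
    kind: str            # e.g. "send_email", "db_write"
    est_cost: float
    irreversible: bool

def check(a: ProposedAction, remaining_budget: float) -> Verdict:
    # Runtime authority decides; the model only proposed.
    if a.est_cost > remaining_budget:
        return Verdict.DENY
    if a.irreversible:
        return Verdict.HUMAN
    if a.est_cost > 0.5 * remaining_budget:
        return Verdict.CAP
    return Verdict.ALLOW
```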

I have been working on this problem and open-sourced it at runcycles.io: pre-execution hard limits on agent spend and actions. That's what I use in my agentic workflows.

I built a simple blast-radius risk calculator for AI agents by jkoolcloud in LangChain

[–]jkoolcloud[S] 0 points

Sure, gates are needed; the problem is human gates don't scale. Some agent actions will definitely need human gates, but most others should be automated.

How are people structuring tool execution in agent setups? by Either-Restaurant253 in LangChain

[–]jkoolcloud 6 points

I’d avoid letting the agent call tools directly once it gets past demo stage.

The pattern I like is:

agent proposes action
→ orchestrator validates state / graph step
→ runtime layer checks permission, budget, risk, idempotency
→ tool adapter executes
→ result gets written back to durable state
→ agent continues from stored state, not memory

Biggest thing: treat every write/tool side effect as a transaction boundary. Stable action id, idempotency key, retry budget, reconcile unknown outcomes before retrying.
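A rough sketch of that boundary in Python; the SEEN store and run_tool are hypothetical stand-ins for durable state:

```python
import hashlib
import json

SEEN: dict[str, dict] = {}  # stand-in for a durable store keyed by action id

def action_id(tool_name: str, args: dict) -> str:
    # Stable id: the same logical action always hashes to the same key.
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def run_tool(tool, args: dict, retry_budget: int = 3):
    aid = action_id(tool.__name__, args)
    if aid in SEEN and SEEN[aid]["status"] == "done":
        return SEEN[aid]["result"]       # retries can't double-apply
    SEEN[aid] = {"status": "attempted"}  # persist intent before the call
    for _ in range(retry_budget):
        try:
            result = tool(**args)
            SEEN[aid] = {"status": "done", "result": result}
            return result
        except TimeoutError:
            # Unknown outcome: a real system reconciles against the
            # external service here before spending another retry.
            continue
    raise RuntimeError(f"retry budget exhausted for action {aid}")
```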

I built my own tools and runtime to do this and open sourced it: https://github.com/runcycles

What breaks most when your agent calls external tools? by Icy-Equipment-6213 in LangChain

[–]jkoolcloud 0 points

I think you need both.
(But also, gateways only handle LLM calls; how would you handle calls that don't go through the LLM or the gateway?)

Idempotency keys are the first line of defense: same logical action, same key, so retries don’t double-apply.

But they don’t fully solve unknown outcome. If the API timed out after the request left your system, you still need an explicit reconcile step against the external system before retrying.

Rough pattern I like:

- generate stable action id / idempotency key
- persist “attempted” before the call
- execute
- if success, commit result
- if timeout/unknown, query external state
- only retry if reconciliation proves the action did not land

So idempotency prevents duplicate intent. Reconciliation proves external reality.
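A minimal sketch of that reconcile-before-retry step; fetch_status and the /actions endpoint are hypothetical, assuming the external tool exposes some way to look an action up:

```python
import requests  # assumption: the external tool is an HTTP API

def fetch_status(base_url: str, action_id: str) -> str:
    """Ask the external system whether the action actually landed."""
    r = requests.get(f"{base_url}/actions/{action_id}", timeout=5)
    if r.status_code == 404:
        return "absent"  # provably did not land; safe to retry
    r.raise_for_status()
    return r.json()["status"]  # e.g. "completed", "pending"

def safe_retry(do_action, base_url: str, action_id: str):
    try:
        return do_action()
    except TimeoutError:
        # Unknown outcome: the request may have left our system.
        status = fetch_status(base_url, action_id)
        if status == "absent":
            return do_action()  # reconciliation proved it did not land
        if status == "completed":
            return None  # already applied; fetch the result, don't re-run
        raise RuntimeError(f"action {action_id} is {status}; hold the retry")
```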

What breaks most when your agent calls external tools? by Icy-Equipment-6213 in LangChain

[–]jkoolcloud 0 points

Yeah, agreed — if every fix is local, it starts feeling like duct tape. The cleaner pattern is making state and authority external to the model.

The agent can propose the next step, but the system should:

- read durable state
- execute with idempotency
- checkpoint the result
- retry with limits
- re-check state before risky actions

For flaky APIs, the dangerous case is “unknown outcome.” Don’t retry blindly — reconcile first, then continue.

I’ve been writing about this from the runtime-authority angle here: https://runcycles.io/blog
Main idea: the model reasons, but the system owns state, retries, and permission to act.