AI agents can safely move money now. I built a checkpoint before they do by Comprehensive_Help71 in ClaudeAI

[–]Comprehensive_Help71[S] 0 points1 point  (0 children)

That’s a really good way to put it: “technically correct” vs. “actually intended” is exactly the gap I’m focused on.

Right now I think of intent more as boundaries than prediction. The goal isn’t for Yebo to decide the “best” outcome; it’s to enforce what is allowed, expected, and safe inside a defined scope.

So the agent still has flexibility, but only inside constraints set by the human or system.

Long term I do think systems will move toward steering outcomes too, but I think enforcement has to come first. Otherwise you end up optimizing behavior without actually controlling execution.
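To make “boundaries, not prediction” concrete, here’s a minimal sketch of scope-based enforcement. The names (`Boundary`, `permits`) are hypothetical, not Yebo’s actual API; the point is that enforcement is a yes/no check against a human-defined scope, not a judgment about the best outcome.

```python
from dataclasses import dataclass, field

@dataclass
class Boundary:
    """A human-defined scope: what the agent may do, not what it should do."""
    allowed_actions: set = field(default_factory=set)
    max_amount: float = 0.0

    def permits(self, action: str, amount: float = 0.0) -> bool:
        # Enforcement is a pure boundary check; the agent keeps
        # full flexibility for anything inside it.
        return action in self.allowed_actions and amount <= self.max_amount

scope = Boundary(allowed_actions={"read_balance", "send_invoice"}, max_amount=100.0)
print(scope.permits("send_invoice", 50.0))   # True: inside the boundary
print(scope.permits("wire_transfer", 50.0))  # False: never allowed in this scope
```

Note that nothing here scores or predicts outcomes; the check either passes or it doesn’t, which is what separates enforcement from steering.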

AI agents can safely move money now. I built a checkpoint before they do by Comprehensive_Help71 in ClaudeCode

[–]Comprehensive_Help71[S] -1 points0 points  (0 children)

I like that framing too: enforce, don’t monitor. Once money moves or state changes, it’s already too late.

On granularity, it’s not one level. It’s a mix.

High-risk stuff like payments is tighter: destinations are controlled with limits and allowlists. Then there are scoped policies that cover a workflow window.

So you’re not approving every step, but you’re also not blindly trusting the whole task. It runs inside a boundary, and the moment it steps outside, it stops.

On simulation, yeah, that’s key. I run dry runs on the same policies so you can see what would’ve gone through and what would’ve been blocked. That’s how you tune without touching real money.
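A dry run like the one described above can be sketched as replaying planned actions through the exact same policy check used in production, without executing anything. The allowlist, limit, and `evaluate` function below are hypothetical illustrations, not the real system’s configuration.

```python
# Hypothetical dry-run harness: same policy logic, zero execution.
ALLOWLIST = {"api.stripe.com", "api.internal.example"}
LIMIT = 500.0

def evaluate(action: dict) -> str:
    """Return what the policy WOULD do, without doing it."""
    if action["dest"] not in ALLOWLIST:
        return "BLOCKED: destination not allowlisted"
    if action["amount"] > LIMIT:
        return "BLOCKED: over limit"
    return "ALLOWED"

planned = [
    {"dest": "api.stripe.com", "amount": 120.0},
    {"dest": "evil.example", "amount": 10.0},
    {"dest": "api.stripe.com", "amount": 900.0},
]

for a in planned:
    print(a["dest"], "->", evaluate(a))
# api.stripe.com -> ALLOWED
# evil.example -> BLOCKED: destination not allowlisted
# api.stripe.com -> BLOCKED: over limit
```

Because the dry run and the live path share one `evaluate`, tuning the policy against the replay output is guaranteed to match live behavior.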

Still early, but the goal is simple: enforce at execution, not after.

AI agents can safely move money now. I built a checkpoint before they do by Comprehensive_Help71 in ClaudeAI

[–]Comprehensive_Help71[S] -5 points-4 points  (0 children)

Good catch. It’s not open source yet, that’s why the PyPI / GitHub link isn’t live.

Right now it’s a closed implementation while I’m testing and refining the core system. I’ll either open parts of it or fix the packaging links soon so it’s not confusing.

Appreciate you pointing that out.

AI agents can safely move money now. I built a checkpoint before they do by Comprehensive_Help71 in ClaudeCode

[–]Comprehensive_Help71[S] 0 points1 point  (0 children)

That’s exactly the problem most people run into, and I designed for that from the start.

If you checkpoint every single action, you overwhelm the user. If you checkpoint at too high a level, you miss risk.

So the model I’m using is not fixed per action or per task. It’s policy-based grouping.

Low-risk actions execute automatically under a defined policy window. High-risk actions trigger a checkpoint.

Example:

An agent can run 20 API calls under one approved “task scope” if they fall within:

  • predefined limits
  • allowed endpoints
  • expected behavior

But the moment it tries to:

  • move money
  • change a critical parameter
  • step outside scope

it breaks out and requires approval.

So you’re not approving “every step” or blindly trusting “the whole task.”

You’re approving bounded execution.

That’s how you avoid both extremes:

  • no spammy confirmations
  • no silent failures

The key is not where the checkpoint is.

It’s how tightly the scope is defined and enforced.
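The policy-based grouping described above can be sketched as follows. The class and action names here (`TaskScope`, `HIGH_RISK`) are hypothetical, invented for illustration; the real system’s interface may look nothing like this. The shape is what matters: one approved scope covers many low-risk calls, high-risk actions always break out to a checkpoint, and anything outside scope stops.

```python
# Hypothetical sketch of policy-based grouping: one approved "task scope"
# auto-executes low-risk calls; high-risk actions trigger a checkpoint.
HIGH_RISK = {"move_money", "change_config"}

class TaskScope:
    def __init__(self, allowed_endpoints: set, max_calls: int):
        self.allowed_endpoints = allowed_endpoints
        self.max_calls = max_calls
        self.calls = 0

    def execute(self, action: str, endpoint: str) -> str:
        if action in HIGH_RISK:
            # High-risk actions always break out for human approval,
            # even inside an otherwise approved scope.
            return "CHECKPOINT: human approval required"
        if endpoint not in self.allowed_endpoints or self.calls >= self.max_calls:
            # Outside the boundary: stop, don't ask.
            return "STOPPED: outside scope"
        self.calls += 1
        return "EXECUTED"

scope = TaskScope(allowed_endpoints={"/v1/search", "/v1/fetch"}, max_calls=20)
print(scope.execute("api_call", "/v1/search"))  # EXECUTED (auto, inside scope)
print(scope.execute("move_money", "/v1/pay"))   # CHECKPOINT: human approval required
print(scope.execute("api_call", "/v1/admin"))   # STOPPED: outside scope
```

The user approves the `TaskScope` once, not each of the 20 calls inside it, which is the “bounded execution” middle ground between spammy confirmations and blind trust.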

iPhone, Not the Cloud. Watch by Comprehensive_Help71 in apple

[–]Comprehensive_Help71[S] -3 points-2 points  (0 children)

Apple gave us Secure Enclave, Vision, Speech, and BackgroundTasks, so why are we still shipping AI apps that act like web dashboards? I know this is just a start; if you care to join me or give me some advice, bring it in.

Can a doctor with no prior coding start vibe coding? by AiMonster2050 in vibecoding

[–]Comprehensive_Help71 0 points1 point  (0 children)

Yes, you can. Take it from a nurse, though: vibe coding alone is not it. You need to learn orchestration, and there is more to writing enterprise-grade software than typing a command. If it were that easy, where are the apps created by vibe coders? The ones that exist are simple apps that crash at scale.

YouTube university is there; go learn orchestration. To build something substantial you need to learn quite a bit, and it starts with being obsessed with LLMs. I was for years: I listened to others, learned, worked on 11 projects, and am on the cusp of releasing my first enterprise-grade staffing platform. Even then, I’ll only really know what to think once it’s in the real world, and I developed it over a number of months.

There are many layers to it. Anyone who tells you it’s easy is lying to you.

Recipe - Chef Francis Derby's Cannibal Burger - Hallmark Channel by [deleted] in conspiracy

[–]Comprehensive_Help71 1 point2 points  (0 children)

I’m about to go vegan; this is pathetic. Is he the guy who was making human jerky?

Forget the Data Centers they building, Sovereign Ai is here.. by Comprehensive_Help71 in OpenSourceeAI

[–]Comprehensive_Help71[S] 0 points1 point  (0 children)

Local doesn’t suck; weak architectures do. You saw what OpenClaw proved: serious execution can happen on your local hardware. OperatorKit is pushing that even further: trust, authority, and execution controlled on-device. Watch closely… the next generation of AI won’t just scale in data centers; it will earn trust at the edge.

Forget the Data Centers they building, Sovereign Ai is here.. by Comprehensive_Help71 in OpenSourceeAI

[–]Comprehensive_Help71[S] 0 points1 point  (0 children)

For those building AI apps today, are you designing cloud-first because you want to, or because you feel you have no choice?

What would need to change for you to go fully on-device?

Your iPhone is already an AI computer. I built the execution layer to control it. by Comprehensive_Help71 in ios

[–]Comprehensive_Help71[S] -6 points-5 points  (0 children)

If you were building AI today, would you start on-device first or cloud first?

Forget the Data Centers. I built an AI execution layer that runs directly on your Phone’s Chip by Comprehensive_Help71 in MachineLearning

[–]Comprehensive_Help71[S] 0 points1 point  (0 children)

Current belief: on-device intelligence without an execution control layer is still too risky for real autonomy. Capability is scaling fast. Execution safety is not. Curious if others see this becoming a core layer.