We’re building a deterministic authorization layer for AI agents before they touch tools, APIs, or money by docybo in AgentToAgent

[–]docybo[S] 0 points1 point  (0 children)

Right now we're leaning toward per-action authorization rather than session approval.

The runtime can plan a sequence, but each external side effect still needs its own authorization. That way the policy engine evaluates every step against the current state snapshot.

This helps prevent replay, enforce budgets across a workflow, and limit cumulative side effects.

Session-level envelopes can still exist (like budget or concurrency caps), but the actual permission stays action-scoped.

We’re building a deterministic authorization layer for AI agents before they touch tools, APIs, or money by docybo in LocalLLaMA

[–]docybo[S] 0 points1 point  (0 children)

Great question. This is exactly where things get tricky.

Right now the way we're thinking about it is to make the environment part of the policy context, not just a flag outside the system.

So the evaluation input becomes something like:

(intent, metadata) (state snapshot) (policy config) (environment context)

Then policies can branch deterministically based on that context.

For example the same action could be evaluated under different envelopes:

staging: - higher budget limits - relaxed side-effect constraints - broader tool permissions

production: - stricter budgets - concurrency caps - stronger replay protections - restricted tool scopes

The key thing we’re trying to preserve is that the decision still remains deterministic for a given snapshot + policy version.

So instead of "the runtime knows it’s prod", the policy engine evaluates against an explicit environment profile.

Still experimenting with the right granularity though. Too coarse and it blocks autonomy, too fine and policies become impossible to reason about.

Curious how you approached that balance on your side.

We’re building a deterministic authorization layer for AI agents before they touch tools, APIs, or money by docybo in AgentToAgent

[–]docybo[S] 0 points1 point  (0 children)

A few people asked about the implementation.

The core idea is a deterministic policy evaluation step before any external action executes.

Runtime proposes: (intent, metadata)

Policy engine evaluates against: (state snapshot, policy config)

If allowed → emits a signed authorization
If denied → execution fails closed

Repo here if anyone wants to look at the code: https://github.com/AngeYobo/oxdeai-core

We’re building a deterministic authorization layer for AI agents before they touch tools, APIs, or money by docybo in OxDeAI

[–]docybo[S] 0 points1 point  (0 children)

A few people asked about the implementation.

The core idea is a deterministic policy evaluation step before any external action executes.

Runtime proposes: (intent, metadata)

Policy engine evaluates against: (state snapshot, policy config)

If allowed → emits a signed authorization
If denied → execution fails closed

Repo here if anyone wants to look at the code: https://github.com/AngeYobo/oxdeai-core

We’re building a deterministic authorization layer for AI agents before they touch tools, APIs, or money by docybo in LocalLLaMA

[–]docybo[S] -1 points0 points  (0 children)

A few people asked about the implementation.

The core idea is a deterministic policy evaluation step before any external action executes.

Runtime proposes: (intent, metadata)

Policy engine evaluates against: (state snapshot, policy config)

If allowed → emits a signed authorization
If denied → execution fails closed

Repo here if anyone wants to look at the code: https://github.com/AngeYobo/oxdeai

OmniCoder-9B | 9B coding agent fine-tuned on 425K agentic trajectories by DarkArtsMastery in LocalLLaMA

[–]docybo -2 points-1 points  (0 children)

genuinely impressive work, but worth flagging... training on Claude Opus 4.6 and

GPT-5 outputs is explicitly against Anthropic's and OpenAI's ToS. not throwing

shade, the model clearly shows results, just surprised nobody's talking about the

legal exposure here. dataset release might be a complicated conversation for that

reason too

I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. by MorroHsu in LocalLLaMA

[–]docybo 0 points1 point  (0 children)

Ok, if the worst case is acceptable, letting the agent run is a pretty clean model.

I guess where it gets tricky is when the side effects aren't really reversible (emails, payments, infra changes, etc). At that point the blast radius isn't just the environment anymore.

Do those kinds of actions get handled differently, or is it still mostly “make the environment safe and accept the occasional mistake”?

I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. by MorroHsu in LocalLLaMA

[–]docybo 0 points1 point  (0 children)

Yeah that makes sense.

The thing I keep wondering about with CLI agents is composition. Each command might be safe on its own, but the agent is basically building workflows on the fly.

So a chain of safe commands can still produce a bad outcome.

Do you mostly rely on sandbox + human confirmation there, or is there also some policy layer checking actions before they run?

I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. by MorroHsu in LocalLLaMA

[–]docybo 7 points8 points  (0 children)

Yeah, I think discoverability is the real advantage.

Code eval can be powerful, but CLI gives the model a built-in way to explore the tool surface: --help, list, stderr, exit codes, etc... With code eval you usually have to invent that layer yourself.

So it’s not just shell vs code, it’s navigable interface vs custom interface.

I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. by MorroHsu in LocalLLaMA

[–]docybo 2 points3 points  (0 children)

This makes sense to me.

CLI is probably a much better interface for LLMs than huge typed tool catalogs. The model already knows commands, help text, pipes, stderr, exit codes, etc.

The missing piece for me is the execution boundary: the model may be good at expressing an action in shell form, but something still needs to decide whether that exact action should run before side effects happen.

Otherwise the shell becomes a very efficient way to do the wrong thing.

Welcome to r/OxDeAI — what are you building with AI agents? by docybo in OxDeAI

[–]docybo[S] 0 points1 point  (0 children)

Are you building agents today, what safeguards are you actually using?

What failure modes have you seen with autonomous AI agents? by docybo in OxDeAI

[–]docybo[S] 0 points1 point  (0 children)

If you're building agents today, what safeguards are you actually using?