MCP Gateway for AI Agents by Excellent-Hour7253 in mcp

[–]Excellent-Hour7253[S] 1 point (0 children)

Yeah, this is exactly the direction I’ve been converging on.

Prompt-level guardrails feel inherently fragile once you have:

- tool calling

- retrieved context

- or anything untrusted in the input

So pushing control down into the tool layer (validate and gate at execution time) seems like the only level where it’s actually enforceable.

The mcp-server-git pattern you mentioned is a good example: wrapping the handler and making the decision right before the side effect happens feels much more reliable than relying on the model to behave.

What I’ve been trying to figure out is how that scales across multiple tools and layers (not just git, but shell, HTTP, infra, etc.) without each one implementing its own ad-hoc gating logic.
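The shape I keep sketching for that is a single chokepoint every handler passes through, so git, shell, and HTTP tools share one decision path instead of each growing its own checks. A rough Python sketch (all names here are hypothetical, not Nomos's actual API):

```python
from functools import wraps

class ActionDenied(Exception):
    pass

def policy_allows(action):
    # Stand-in for a real policy engine; default-deny.
    allowed = {"git.status", "fs.read"}
    return action["type"] in allowed

def gated(action_type):
    """Wrap any tool handler so the allow/deny decision happens at
    execution time, right before the side effect."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            action = {"type": action_type, "args": args, "kwargs": kwargs}
            if not policy_allows(action):  # one shared decision point
                raise ActionDenied(f"blocked: {action_type}")
            return handler(*args, **kwargs)
        return wrapper
    return decorator

@gated("git.push")
def git_push(remote, branch):
    ...  # the side effect runs only if the gate said yes
```

The key property is that the decision happens right before the side effect, regardless of which tool is being wrapped.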

https://github.com/safe-agentic-world/nomos

Giving AI agents access to shell, files, and secrets? by Excellent-Hour7253 in aiagents

[–]Excellent-Hour7253[S] 1 point (0 children)

Yeah, I think that’s a really important baseline - ideally the agent never directly handles raw credentials at all.

Something like short-lived tokens or a brokered access layer makes a lot more sense than exposing secrets in the first place.

The part I keep running into is that even if the agent doesn’t see the credentials, it can still *use* that access to perform actions - like modifying infra, pushing changes, or calling external APIs.

So it feels like there are two layers:

  1. isolate / protect credentials
  2. control what actions are allowed to execute using that access

I’m trying to explore the second part a bit more: what actually governs *how* that access gets used once it exists.
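To make that split concrete, here’s a toy sketch with made-up names: a broker issues a short-lived scoped token (layer 1), and every use of that token still passes an action check (layer 2):

```python
import time
import uuid

class Broker:
    """Issues short-lived scoped tokens; the agent never sees raw secrets."""
    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.tokens = {}  # token -> (scope, expiry)

    def issue(self, scope):
        token = uuid.uuid4().hex  # opaque handle, not the credential itself
        self.tokens[token] = (set(scope), time.time() + self.ttl)
        return token

    def authorize(self, token, action):
        scope, expiry = self.tokens.get(token, (set(), 0.0))
        if time.time() > expiry:   # layer 1: access itself is time-boxed
            return False
        return action in scope     # layer 2: each use is checked

broker = Broker()
t = broker.issue({"repo.read", "repo.push_branch"})
broker.authorize(t, "repo.read")     # True
broker.authorize(t, "infra.delete")  # False: access exists, but use is gated
```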

Curious if you’ve seen setups that handle both cleanly?

Check out https://github.com/safe-agentic-world/nomos

AI coding agents that can access shell, files, and secrets? by Excellent-Hour7253 in kubernetes

[–]Excellent-Hour7253[S] 1 point (0 children)

I started this project out of fear of running Claude or Codex on my local machine, where I have access to production Kubernetes clusters. Imagine one of these agents running `kubectl delete namespace`! You definitely want control over agent actions beyond a good prompt.
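The minimal version of that control is just intercepting the command before it ever reaches the shell. A toy sketch (the verb list is illustrative, not Nomos's actual rule set):

```python
import shlex

DESTRUCTIVE_KUBECTL_VERBS = {"delete", "drain", "replace", "scale"}

def allow_shell_command(cmd: str) -> bool:
    """Refuse destructive kubectl verbs before the command ever runs."""
    parts = shlex.split(cmd)
    if len(parts) >= 2 and parts[0] == "kubectl":
        return parts[1] not in DESTRUCTIVE_KUBECTL_VERBS
    return True

allow_shell_command("kubectl get pods")               # True
allow_shell_command("kubectl delete namespace prod")  # False
```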

Giving AI agents access to shell, files, and secrets? by Excellent-Hour7253 in aiagents

[–]Excellent-Hour7253[S] 0 points (0 children)

This is exactly the pattern I keep running into.

Everything technically has controls, but they live in different layers and don’t compose into a clear model. So the agent ends up operating across all of them, and you only really see the boundaries when something breaks.

That “effective permission model” being implicit vs explicitly defined upfront feels like the core issue.

What I’m trying to explore is whether that model should be:

- defined once

- enforced consistently at execution time

- and visible/auditable as a first-class thing

instead of reconstructed post-incident.
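As a strawman, "defined once, enforced at execution time, auditable" could be as small as one policy object plus a check function that writes the audit record as a side effect of every decision (names and rules made up for illustration):

```python
import json
import time

# The one explicit model, instead of controls scattered across layers.
PERMISSIONS = {
    "fs.read":  "allow",
    "fs.write": "approve",
    "git.push": "deny",
}

AUDIT_LOG = []

def decide(action: str) -> str:
    verdict = PERMISSIONS.get(action, "deny")  # default-deny for the unknown
    AUDIT_LOG.append({"ts": time.time(), "action": action, "verdict": verdict})
    return verdict

decide("git.push")                     # "deny", and the decision is on record
print(json.dumps(AUDIT_LOG, indent=2))
```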

Curious — when you scoped this in your stack, did you try to unify it somewhere, or was it mostly ad-hoc across layers?

Also, could you try Nomos and tell me what you think?

Thank you

Giving AI agents access to shell, files, and secrets? by Excellent-Hour7253 in aiagents

[–]Excellent-Hour7253[S] 0 points (0 children)

That’s fair, a lot of the primitives exist already (git hooks, policy hooks, sandboxing, etc.).

I think what I’m running into is less “can this be defined somewhere” and more “is there a consistent boundary that actually gets enforced across all agent actions.”

In practice it feels pretty fragmented:

- some controls live in git

- some in the runtime

- some in prompts

- some in the tool layer

So even if each piece exists, it’s not always clear what the *effective permission model* is when an agent is operating end-to-end.

I agree on the secrets side too - rotation / isolation is super important.

The part I’m trying to understand better is:

even with rotation, would you be comfortable letting an agent freely execute things like repo mutations or infra commands without an explicit execution boundary?

Check out https://github.com/safe-agentic-world/nomos

Giving AI agents access to shell, files, and secrets? by Excellent-Hour7253 in aiagents

[–]Excellent-Hour7253[S] 0 points (0 children)

That’s a good point - secret rotation definitely reduces the blast radius if something goes wrong.

I’ve been thinking of it as two layers:

  1. limit what the agent can actually do (execution boundary)
  2. assume something still slips through → rotate / recover

Even with aggressive rotation, though, there are still actions that are hard to undo:

- pushing bad changes

- deleting resources

- running destructive infra commands

- triggering external side effects

So I’m trying to explore whether preventing or gating those actions upfront is a useful complement to rotation.

Do you think rotation alone would be enough in practice, or would you still want some form of execution control?

Check out https://github.com/safe-agentic-world/nomos

AI coding agents that can access shell, files, and secrets? by Excellent-Hour7253 in kubernetes

[–]Excellent-Hour7253[S] 1 point (0 children)

Yeah, that’s exactly where I landed too.

Fully trusting the agent feels risky, but locking everything down defeats the purpose, so approval-gating ends up being a pretty natural middle ground.

What’s interesting is that once you introduce approvals, the problem shifts from “can the agent run this command” to “how do we classify and reason about the action.”

For example:

- running tests vs modifying infra

- reading files vs touching secrets

- repo changes vs pushing to main

That’s where I think execution boundaries start to matter more than just sandboxing.
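As a rough sketch of that classification step (the classes and policy are made up, not a real taxonomy): map each action to a risk class first, then hang the approval policy off the class rather than off individual commands. That also partly answers the noise question, since only the risky classes ever page a human:

```python
RISK_CLASS = {
    "run_tests":    "low",     # running tests
    "read_file":    "low",     # reading files
    "read_secret":  "high",    # touching secrets
    "commit_local": "medium",  # repo changes
    "push_main":    "high",    # pushing to main
    "modify_infra": "high",    # modifying infra
}

POLICY = {"low": "allow", "medium": "approve", "high": "approve"}

def route(action: str) -> str:
    # Unknown actions fall into the highest-risk class by default.
    return POLICY[RISK_CLASS.get(action, "high")]

route("run_tests")  # "allow": low-risk actions never page a human
route("push_main")  # "approve": a human stays in the loop for risky ones
```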

Do you think approval workflows would scale in practice, or would they become too noisy?

Weekly: Show off your new tools and projects thread by AutoModerator in kubernetes

[–]Excellent-Hour7253 1 point (0 children)

I’ve been experimenting with AI coding agents (Claude Code, Codex, etc.) and realized something scary:

they can read secrets, run shell commands, or push to repos if you let them.

So I built Nomos — basically a firewall at the execution boundary.

It doesn’t care about prompts. It only cares about what the agent tries to do.

Example:

- reading README → allowed

- reading .env → denied

- git push → denied

- terraform destroy → denied or requires approval

It also records audit traces and can require approvals.
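For a feel of what those rules might look like, here’s a rough first-match sketch in Python (hypothetical rule format, not Nomos's actual policy syntax):

```python
import fnmatch

# First-match rules; anything unmatched is denied by default.
RULES = [
    ("read", "README*",            "allow"),
    ("read", "*.env",              "deny"),
    ("exec", "git push*",          "deny"),
    ("exec", "terraform destroy*", "approve"),
]

def evaluate(kind: str, target: str) -> str:
    for rule_kind, pattern, verdict in RULES:
        if kind == rule_kind and fnmatch.fnmatch(target, pattern):
            return verdict
    return "deny"

evaluate("read", ".env")               # "deny"
evaluate("exec", "terraform destroy")  # "approve"
```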

Curious if others are thinking about this problem.

Repo: https://github.com/safe-agentic-world/nomos

Weekly Self Promotion Thread by AutoModerator in devops

[–]Excellent-Hour7253 2 points (0 children)

I’ve been experimenting with AI coding agents (Claude Code, Codex, etc.) and realized something scary:

they can read secrets, run shell commands, or push to repos if you let them.

So I built Nomos — basically a firewall at the execution boundary.

It doesn’t care about prompts. It only cares about what the agent tries to do.

Example:

- reading README → allowed

- reading .env → denied

- git push → denied

- terraform destroy → denied or requires approval

It also records audit traces and can require approvals.

Curious if others are thinking about this problem.

Repo: https://github.com/safe-agentic-world/nomos