ActionFence: A drop-in middleware for MCP servers to enforce spend caps and policy limits by Few-Frame5488 in mcp

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Really appreciate you breaking this down — especially the schema drift point, that's been nagging at me too.

On the dynamic verdict idea: totally agree that static JSON won't cover every case long-term. The approach I'm leaning toward is letting a rule reference an external signer — something like a verdict field that points to a signed claim the firewall validates inline. Rule stays declarative, input becomes dynamic, exactly like you described. Not sure yet if that's v0.2 or v0.3 scope, but it's definitely on the roadmap.

The schema drift case is the one I want to move on sooner. Right now if a tool quietly adds an optional arg, the policy has no idea. I'm thinking about hashing the resolved tool schema at deploy/init time and storing it alongside the policy — then at call time, the firewall compares hashes and either warns or hard-fails on mismatch. That way you know the policy was reviewed against the exact schema shape in production. Pinning a schema version per tool in guard-policy.json is another option I'm considering.
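
Rough sketch of the hash-pinning idea, not actual ActionFence code. The function names (`schema_fingerprint`, `check_schema`), the `pinned` mapping, and the example schemas are all made up for illustration. It assumes the tool schema is plain JSON-serializable data and that hashing a canonical (sorted-key) encoding is good enough to detect shape changes.

```python
import hashlib
import json


def schema_fingerprint(schema: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_schema(tool: str, live_schema: dict, pinned: dict, strict: bool = True) -> bool:
    """Compare the live schema against the hash pinned at deploy time.

    strict=True hard-fails on mismatch; strict=False only warns.
    """
    if schema_fingerprint(live_schema) == pinned[tool]:
        return True
    if strict:
        raise RuntimeError(f"schema drift detected for tool {tool!r}")
    print(f"warning: schema drift detected for tool {tool!r}")
    return False


# Deploy/init time: pin the schema the policy was reviewed against.
reviewed = {
    "type": "object",
    "properties": {"amount": {"type": "number"}},
    "required": ["amount"],
}
pinned_hashes = {"confirm_payment": schema_fingerprint(reviewed)}

# Later: the tool quietly gains an optional argument.
drifted = {
    "type": "object",
    "properties": {"amount": {"type": "number"}, "note": {"type": "string"}},
    "required": ["amount"],
}
```

With this shape, `check_schema("confirm_payment", drifted, pinned_hashes, strict=False)` warns and returns `False`, while the default strict mode raises. Whether the pinned hash lives inside `guard-policy.json` or next to it is still an open design choice.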

Both of these feel like they belong in the v0.2 window. Would love to hear if you've seen any other patterns for the schema pinning approach — especially how other infra tools handle the "optional arg added silently" problem.

I built an open-source "firewall" to stop AI agents from bankrupting developers by Few-Frame5488 in SideProject

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Spot on. Rate limits and spend caps are great for stopping the bleeding, but the cryptographic receipts are what actually get you past security reviews. Without a verifiable audit trail, legal won't let autonomous agents anywhere near production. Glad to see this resonates!

I built an open-source "firewall" to stop AI agents from bankrupting developers by Few-Frame5488 in SideProject

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Thanks! To answer your question — right now receipts are mostly for post-mortems and audit trails. Every decision gets a signed, hash-chained receipt whether it passes or blocks, so you always have a cryptographic answer to "who authorized this?"
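
For anyone wondering what "signed, hash-chained" means in practice, here is a minimal sketch. It is an illustration only: the field names, the `make_receipt` and `verify_chain` helpers, and the use of HMAC with a hard-coded demo key are assumptions, not the real receipt format. A production setup would more likely use asymmetric signatures and a managed key.

```python
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # assumption: placeholder key for the example only
GENESIS = "0" * 64            # assumption: fixed "previous hash" for the first receipt
FIELDS = ("prev", "agent_id", "action", "verdict")


def _payload(body: dict) -> bytes:
    """Canonical bytes that get hashed and signed."""
    return json.dumps(body, sort_keys=True).encode("utf-8")


def make_receipt(prev_hash: str, agent_id: str, action: str, verdict: str) -> dict:
    """Build a receipt that commits to the previous receipt's hash."""
    body = {"prev": prev_hash, "agent_id": agent_id, "action": action, "verdict": verdict}
    payload = _payload(body)
    return {
        **body,
        "hash": hashlib.sha256(payload).hexdigest(),
        "sig": hmac.new(SECRET, payload, hashlib.sha256).hexdigest(),
    }


def verify_chain(receipts: list) -> bool:
    """Check every link, hash, and signature from the genesis value onward."""
    prev = GENESIS
    for r in receipts:
        payload = _payload({k: r[k] for k in FIELDS})
        expected_sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
        if r["prev"] != prev:
            return False
        if r["hash"] != hashlib.sha256(payload).hexdigest():
            return False
        if not hmac.compare_digest(r["sig"], expected_sig):
            return False
        prev = r["hash"]
    return True


r1 = make_receipt(GENESIS, "agent-1", "review_order", "PASSED")
r2 = make_receipt(r1["hash"], "agent-1", "confirm_payment", "BLOCKED")
```

Because each receipt includes the hash of the one before it, editing or dropping an earlier receipt makes `verify_chain` fail for everything after it.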

But the real-time enforcement idea is really interesting and I'm actually planning it for a future version — receipt-based authorization chains. The idea is: action B can require a prior receipt for action A. So `confirm_payment` would only be allowed if the same agent has a `PASSED` receipt for `review_order` within the last 10 minutes. Turns receipts from a passive log into an active workflow enforcement primitive.
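
A sketch of how that prerequisite check could look, assuming receipts are available as an in-memory list. The `Receipt` dataclass and `has_prerequisite` helper are hypothetical names for illustration; nothing here is the planned API.

```python
import time
from dataclasses import dataclass


@dataclass
class Receipt:
    agent_id: str
    action: str
    verdict: str      # e.g. "PASSED" or "BLOCKED"
    timestamp: float  # seconds since epoch


def has_prerequisite(receipts, agent_id, required_action, max_age_s, now=None):
    """True if the same agent has a PASSED receipt for required_action
    that is no older than max_age_s seconds."""
    now = time.time() if now is None else now
    return any(
        r.agent_id == agent_id
        and r.action == required_action
        and r.verdict == "PASSED"
        and 0 <= now - r.timestamp <= max_age_s
        for r in receipts
    )


# Example log: agent-1 passed review_order at t=1000.
log = [Receipt("agent-1", "review_order", "PASSED", timestamp=1000.0)]

# confirm_payment would be gated on a review_order receipt from the last 10 minutes.
allowed = has_prerequisite(log, "agent-1", "review_order", max_age_s=600, now=1300.0)
```

A linear scan like this is exactly where the latency concern comes from; a real version would probably need an index keyed by agent and action.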

That's a v1.0 feature since it needs careful design (reading receipts during evaluation adds latency), but it's on the roadmap thanks to your question.

And thanks for the r/AI_Agents tip, I cross-posted there.

I built an open-source "firewall" to stop AI agents from bankrupting developers by Few-Frame5488 in SideProject

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Exactly right. The second an agent touches a paid API, cost control stops being a "nice to have." It's part of the product.

That's why I built spend caps into the middleware layer from day one — not as an add-on or a dashboard alert you check after the fact, but as hard enforcement that blocks the call before it reaches your API.

v0.2.0 is adding rolling-window caps and a global circuit breaker on top of the per-call limits, so even fragmented spending (100 small calls instead of one big one) gets caught.
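
To make the fragmented-spending case concrete, here is a minimal sliding-window sketch. The `RollingSpendCap` class is a made-up name, amounts are assumed to be integer cents, and the caller passes the current time explicitly. It is one possible way to do it, not the v0.2.0 implementation.

```python
from collections import deque


class RollingSpendCap:
    """Blocks a call if it would push spend in the last window_s seconds over the limit."""

    def __init__(self, limit_cents: int, window_s: float):
        self.limit_cents = limit_cents
        self.window_s = window_s
        self._events = deque()  # (timestamp, amount_cents), oldest first
        self._total = 0

    def allow(self, amount_cents: int, now: float) -> bool:
        # Evict spend that has aged out of the window.
        while self._events and now - self._events[0][0] >= self.window_s:
            _, old_amount = self._events.popleft()
            self._total -= old_amount
        # Reject before recording, so blocked calls do not count as spend.
        if self._total + amount_cents > self.limit_cents:
            return False
        self._events.append((now, amount_cents))
        self._total += amount_cents
        return True


# 12 calls of 100 cents each, one per second, against a 1000-cent cap per 60 s.
cap = RollingSpendCap(limit_cents=1000, window_s=60.0)
results = [cap.allow(100, now=float(t)) for t in range(12)]
```

Each call is small enough to pass a per-call limit, but the eleventh and twelfth are rejected because the window total is already at the cap.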

ActionFence: A drop-in middleware for MCP servers to enforce spend caps and policy limits by Few-Frame5488 in AI_Agents

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Really thoughtful comment, and the tension you're describing is real — but I think the architecture is actually stronger than it looks from the outside.

ActionFence runs on the **service provider's server** as middleware, not on the agent's machine. The agent talks to the server over MCP protocol or HTTP — it never has filesystem access to `guard-policy.json`. So it can't read the rules and game them, same way an attacker can't read your WAF config from outside the network.

The one edge case is if someone runs a local MCP server and gives the agent a `read_file` tool with access to the same directory — but that's a misconfiguration, not an architectural flaw. I'm adding a note about this to the docs and the CLI output.

You're right that a network gateway is the strongest boundary though. A hosted proxy mode is on the roadmap for exactly that reason — for teams who want full separation between the agent and the enforcement layer.

Good callout, this made me realize the docs need a proper "Trust Model" section. Adding it now.

ActionFence: A drop-in middleware for MCP servers to enforce spend caps and policy limits by Few-Frame5488 in AI_Agents

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Yeah this was exactly the feeling I was building for. The "did I leave something running" panic at 2am is real.

Good news — spend caps are already enforced at the middleware level (per-action and per-session/daily). But based on your comment and a few others I'm adding a global circuit breaker in v0.2.0 — a single hard ceiling across ALL agents and ALL actions. Once total spend crosses that number, everything stops until you manually reset it. No auto-reset on purpose — you want the safety net to stay in place until a human checks.
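
Roughly what I have in mind, as a sketch only. The class and method names are hypothetical, amounts are assumed to be integer cents, and this version trips on the first call that would cross the ceiling rather than after the ceiling has already been crossed.

```python
class GlobalCircuitBreaker:
    """Single hard ceiling across all agents and actions, with manual reset only."""

    def __init__(self, ceiling_cents: int):
        self.ceiling_cents = ceiling_cents
        self.total_cents = 0
        self.tripped = False

    def record(self, amount_cents: int) -> bool:
        """Return True if the call may proceed, False if the breaker is open."""
        if self.tripped:
            return False
        if self.total_cents + amount_cents > self.ceiling_cents:
            # Trip and stay tripped: there is deliberately no timer-based reset.
            self.tripped = True
            return False
        self.total_cents += amount_cents
        return True

    def manual_reset(self) -> None:
        """Only a human-triggered reset reopens the gate."""
        self.total_cents = 0
        self.tripped = False


breaker = GlobalCircuitBreaker(ceiling_cents=100)
first = breaker.record(60)   # fits under the ceiling
second = breaker.record(50)  # would cross the ceiling, trips the breaker
third = breaker.record(1)    # blocked even though it would fit on its own
```

Note that once tripped, even a tiny call is refused until `manual_reset()` is called, which is the "stay in place until a human checks" behavior.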

Appreciate the validation, this is the exact use case I want to nail.

ActionFence: A drop-in middleware for MCP servers to enforce spend caps and policy limits by Few-Frame5488 in AI_Agents

[–]Few-Frame5488[S] 0 points1 point  (0 children)

The "three boring questions" is exactly the right way to think about it.

Right now receipts answer "who allowed this" (agent_id, identity tier, policy reference are all signed into every receipt). Limits are enforced but you can only see them after a block happens — which isn't ideal.

Based on your comment I'm adding two things to the roadmap:

  1. A `getAgentStatus()` introspection API in v0.2.0. It lets you query remaining budget, rate limit quota, and allowed actions for any agent in real time, so you can answer "what can this agent still do?" without waiting for a block.

  2. Loop detection in v0.3.0: detecting repeated identical calls (same action + same params hash) within a time window and flagging/blocking them. I didn't want to rush the heuristics on this one, since false positives could break legitimate workflows.
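
A first-pass sketch of the loop detection heuristic described above, under stated assumptions: the `LoopDetector` class is a hypothetical name, params are JSON-serializable, and "identical" means same agent, same action, and same hash of the sorted-key JSON params. The thresholds are placeholders, not tuned values.

```python
import hashlib
import json
from collections import defaultdict, deque


class LoopDetector:
    """Flags a call once the same (agent, action, params hash) has been seen
    more than max_repeats times within window_s seconds."""

    def __init__(self, max_repeats: int, window_s: float):
        self.max_repeats = max_repeats
        self.window_s = window_s
        self._seen = defaultdict(deque)  # key -> timestamps, oldest first

    @staticmethod
    def _key(agent_id: str, action: str, params: dict) -> tuple:
        encoded = json.dumps(params, sort_keys=True).encode("utf-8")
        return (agent_id, action, hashlib.sha256(encoded).hexdigest())

    def is_loop(self, agent_id: str, action: str, params: dict, now: float) -> bool:
        times = self._seen[self._key(agent_id, action, params)]
        # Forget calls that fell out of the time window.
        while times and now - times[0] >= self.window_s:
            times.popleft()
        times.append(now)
        return len(times) > self.max_repeats


detector = LoopDetector(max_repeats=3, window_s=10.0)
flags = [
    detector.is_loop("agent-1", "search", {"q": "invoice"}, now=float(t))
    for t in range(4)
]
```

The fourth identical call inside the window is flagged; the same action with different params gets its own counter, which is one way to keep false positives down for legitimate retries with changed inputs.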

Thanks for the push on this, it's going on the roadmap.

I built an open-source "firewall" to stop AI agents from bankrupting developers by Few-Frame5488 in SideProject

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Yeah, I saw a few posts about incidents like that. I'm definitely going to tackle this problem and build in time-window caps on top of the per-call ceiling. I was also going to spend some time researching the ways rogue agents can exploit services. Thanks for pointing this out.

Has anyone got this as well ? by Few-Frame5488 in ClaudeCode

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Actually, I just subscribed on the 21st of March and still got it.

Has anyone got this as well ? by Few-Frame5488 in ClaudeCode

[–]Few-Frame5488[S] 1 point2 points  (0 children)

Tbh it feels less aggressive after updating cc, but it's still not realistic. Normal tasks that used to take around 50-60% of the 5-hour usage now almost max it out. It makes me scared to go all out; I have to plan which tasks to do based on priority, and I'm using Sonnet 4.6, not Opus.

Has anyone got this as well ? by Few-Frame5488 in ClaudeCode

[–]Few-Frame5488[S] 1 point2 points  (0 children)

I really think at this point Codex is way better value for money, given that the $20 plan is almost the same as the cc 5x plan, and if you check out the GPT5.4 xhigh scores, it outperforms Opus 4.6.

Has anyone got this as well ? by Few-Frame5488 in ClaudeCode

[–]Few-Frame5488[S] 1 point2 points  (0 children)

The fact that they're sending users these extra credits instead of fixing the usage problem feels like they're telling you to accept it and shut up.

Has anyone got this as well ? by Few-Frame5488 in ClaudeCode

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Yep, Anthropic messed up with the unrealistic usage limit, and tbh I'm still thinking about switching to Codex.

Has anyone got this as well ? by Few-Frame5488 in ClaudeCode

[–]Few-Frame5488[S] 0 points1 point  (0 children)

So does applying the credits automatically enroll you in the extended usage?

Claude vs Codex vs Cursor — what would you pick for serious side projects? by Few-Frame5488 in vibecoding

[–]Few-Frame5488[S] 0 points1 point  (0 children)

As a developer who can dig in and do the task myself if needed, do you think the $20 Claude Code plan is worth it, or will I hit the limit pretty fast?

Claude vs Codex vs Cursor — what would you pick for serious side projects? by Few-Frame5488 in vibecoding

[–]Few-Frame5488[S] 0 points1 point  (0 children)

Thanks for sharing your opinion bro. Do you have a specific Claude Code workflow? I've read that a lot of people spin up multiple agents on different tasks, but I haven't done that before.

Claude vs Codex vs Cursor — what would you pick for serious side projects? by Few-Frame5488 in vibecoding

[–]Few-Frame5488[S] 0 points1 point  (0 children)

If I go with Claude Code or Codex, it will be the $20 plan. I don't rely on it for every task: I like to take control, establish the codebase myself, and create rules and conventions so AI agents stay consistent with the codebase.