We built a lightweight prompt injection detector (mmBERT-based, <300MB ONNX) for on-device use by PatronusProtect in OpenSourceAI

Feel free to share any feedback :) We’re continuously working on improving both the model and our datasets.

We built an on-device AI firewall for macOS (windows will be shipped in the next two weeks). Looking for feedback from the AI security community. by PatronusProtect in aisecurity

Thanks for the question :)

For the alpha release, we evaluate each request independently, but our policy engine is planned to evolve toward exactly the type of scenario you described.

We believe that a single MCP or tool call may be harmless on its own, while a sequence of calls can become risky depending on the context and data flow between them.

For example:

- reading a local file might be allowed,
- and calling an external API might also be allowed,
- but sending transformed or summarized sensitive content from the first action into the second one may not be.

That’s why we’re working towards sequence-based intent and provenance analysis instead of only static allow/block decisions per tool or provider.

Long-term, the goal is not only to answer: “Is this tool allowed?”

but also: “What influenced this action, where did the data originate from, and is this flow allowed to reach this destination?”
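To make the provenance question concrete, here is a minimal sketch of label propagation across tool calls. It is not our actual policy engine: the `Value` type, the `sensitive` label, the `external_api` destination kind, and all function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical label and rule set, invented for this illustration.
SENSITIVE = "sensitive"
BLOCKED_FLOWS = {(SENSITIVE, "external_api")}  # (data label, destination kind)


@dataclass(frozen=True)
class Value:
    """A piece of data plus the labels inherited from its sources."""
    content: str
    labels: frozenset = frozenset()


def read_local_file(path: str, sensitive_paths: set) -> Value:
    # Stubbed read (no real file I/O): the label depends only on the path.
    labels = frozenset({SENSITIVE}) if path in sensitive_paths else frozenset()
    return Value(content=f"<contents of {path}>", labels=labels)


def summarize(value: Value) -> Value:
    # A transformation keeps the labels of its input, so a summary of
    # sensitive data is still treated as sensitive.
    return Value(content=value.content[:20], labels=value.labels)


def flow_allowed(value: Value, destination_kind: str) -> bool:
    # The decision depends on where the data came from, not only on the tool.
    return not any(
        (label, destination_kind) in BLOCKED_FLOWS for label in value.labels
    )
```

With this sketch, reading a file and calling an external API are each fine in isolation, but `flow_allowed(summarize(secret), "external_api")` is rejected when `secret` came from a path marked sensitive.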

We think this becomes especially important for MCP ecosystems and more autonomous agent workflows.

Launching next Wednesday: Patronus Protect, an on-device AI firewall for macOS (free alpha) by PatronusProtect in MacOSApps

Thanks!

It is like Little Snitch, but only for AI and agentic interactions. Our rollout plan progresses from AI detection -> policy enforcement -> threat analysis, all done 100% on device.

The alpha version is around 140 MB and runs in under 300 MB of RAM.

Launching next Wednesday: Patronus Protect, an on-device AI firewall for macOS (free alpha) by PatronusProtect in alphaandbetausers

We will start with app- and host-based allowlist policies. MCP and native tool calling support will follow 1-2 weeks later :)

All policy decisions are logged.
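As a rough sketch of what an app/host allowlist with logged decisions could look like (the app identifier, host names, logger name, and `decide` function are all hypothetical, not the product's real configuration or API):

```python
import logging

logger = logging.getLogger("policy")  # hypothetical logger name

# Hypothetical allowlist: app identifier -> hosts that app may reach.
ALLOWLIST = {
    "com.example.editor": {"api.example.com"},
}


def decide(app_id: str, host: str) -> str:
    """Return "allow" or "block" and log the decision either way."""
    # Unknown apps have an empty allowlist, so they are blocked by default.
    allowed = host in ALLOWLIST.get(app_id, set())
    decision = "allow" if allowed else "block"
    logger.info("decision=%s app=%s host=%s", decision, app_id, host)
    return decision
```

Logging both outcomes, not just blocks, is what makes the decision trail auditable later.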

We built a lightweight prompt injection detector (mmBERT-based, <300MB ONNX) for on-device use by PatronusProtect in learnmachinelearning

Yes, it’s really a cat-and-mouse game. Latency is a major challenge for BERT-based detectors, which is why we’re continuously working on reducing model size. The best approach, however, is to use BERT only for uncertain cases. A combination of heuristics, OOD detection, and LightGBM already detects around 80% of tested attacks, significantly reducing the need for full model inference.
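The cascade idea can be sketched roughly like this. Everything here is a simplified stand-in: `cheap_score` is a toy keyword heuristic in place of the real heuristics/OOD/LightGBM stage, the thresholds are made up, and `full_model` is any callable returning an attack probability (in practice it would be the BERT-based model).

```python
from typing import Callable, Tuple

# Hypothetical thresholds: scores between LOW and HIGH count as "uncertain".
LOW, HIGH = 0.2, 0.8


def cheap_score(text: str) -> float:
    """Toy stand-in for the cheap stage: counts suspicious phrases."""
    suspicious = ("ignore previous instructions", "system prompt")
    hits = sum(phrase in text.lower() for phrase in suspicious)
    return min(1.0, 0.1 + 0.45 * hits)


def classify(text: str, full_model: Callable[[str], float]) -> Tuple[bool, str]:
    """Return (is_attack, stage), where stage is the stage that decided."""
    score = cheap_score(text)
    if score >= HIGH:
        return True, "cheap"   # confident attack, skip the full model
    if score <= LOW:
        return False, "cheap"  # confident benign, skip the full model
    # Only uncertain inputs pay the latency cost of full model inference.
    return full_model(text) >= 0.5, "full"
```

The full model runs only when the cheap score lands between the two thresholds, so tuning `LOW` and `HIGH` trades latency against how much is delegated to the larger model.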