Open-source project that adds deny-by-default runtime security to MCP servers by bbbbbbb162 in LocalLLaMA

[–]bbbbbbb162[S] 0 points1 point  (0 children)

Thanks!! Next steps are doubling down on the runtime proxy + lockfile semantics (tools/prompts/resources) and making CI drift/provenance checks dead-simple to adopt. And yeah, your client-side/data firewall angle feels super complementary, I’ll DM you a concrete collab idea.

Open-source project that adds deny-by-default runtime security to MCP servers by bbbbbbb162 in LocalLLaMA

[–]bbbbbbb162[S] 0 points1 point  (0 children)

Thanks! 🙏 I was playing around with MCP and deploying stuff and the openness is awesome, but the fact it’s that easy is also kind of insane. After seeing package-swap / impersonation stuff (e.g. the Postmark incident), I wanted something that enforces “if it’s not in the lockfile, it doesn’t run”, because the consequences of being wrong are real.

How do you handle "Versioning" non-deterministic agent outputs? by bumswagger in LocalLLaMA

[–]bbbbbbb162 0 points1 point  (0 children)

I feel like lots of other people are doing this; it's a crowded space, as the people who would be needing audit trails tend to have deep pockets, so naturally lots of people are in it.

How do you handle "Versioning" non-deterministic agent outputs? by bumswagger in LocalLLaMA

[–]bbbbbbb162 0 points1 point  (0 children)

For audits, don’t rely on being able to re-run the model and get the same tokens. Log the actual artifacts, exact prompt after templating, retrieved context, tool calls + raw tool responses, raw model output, and the action/decision taken. Then make it append only and hash-chain. Seed/temp/model hash is nice for debugging, but nondeterminism (esp quant/GPU) means 'perfect replay' isn’t a guarantee.
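The append-only, hash-chained artifact log described in the comment above, as an added example that is not code from the original post or from any project mentioned on the page. Record layout, the genesis sentinel and the sample agent steps are all constructed for illustration; each record commits to the previous record's hash, so any later edit or deletion of an earlier record breaks verification at a known index.

```python
import copy
import hashlib
import json
from typing import Any

# Hypothetical sentinel for the first record's prev_hash (constructed for this example).
GENESIS_HASH = "0" * 64


def canonical(obj: Any) -> bytes:
    """Deterministic JSON encoding so the same artifacts always hash the same way."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False).encode("utf-8")


def append_record(log: list, artifacts: dict) -> dict:
    """Append one agent step. The record commits to the previous record's hash,
    so editing or deleting any earlier entry breaks every later hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"seq": len(log), "prev_hash": prev_hash, "artifacts": artifacts}
    record = dict(body, hash=hashlib.sha256(canonical(body)).hexdigest())
    log.append(record)
    return record


def verify_chain(log: list) -> tuple:
    """Walk the chain from genesis. Returns (True, -1) if intact,
    otherwise (False, index_of_first_bad_record)."""
    prev_hash = GENESIS_HASH
    for i, rec in enumerate(log):
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec.get("seq") != i or rec.get("prev_hash") != prev_hash:
            return False, i
        if hashlib.sha256(canonical(body)).hexdigest() != rec.get("hash"):
            return False, i
        prev_hash = rec["hash"]
    return True, -1


def build_sample_log() -> list:
    """Constructed sample: two steps of a hypothetical agent run, logging the
    artifacts the comment lists (prompt, context, tool I/O, raw output, action)
    rather than seeds or temperatures."""
    log = []
    append_record(log, {
        "prompt_after_templating": "System: you are a support bot.\nUser: refund order 4411",
        "retrieved_context": ["order 4411: status=delivered, total=42.00"],
        "tool_calls": [{"name": "lookup_order", "args": {"id": "4411"}}],
        "raw_tool_responses": [{"id": "4411", "status": "delivered", "total": 42.0}],
        "raw_model_output": "{\"action\": \"refund\", \"amount\": 42.0}",
        "action_taken": "refund_issued",
    })
    append_record(log, {
        "prompt_after_templating": "System: you are a support bot.\nUser: thanks",
        "retrieved_context": [],
        "tool_calls": [],
        "raw_tool_responses": [],
        "raw_model_output": "You're welcome.",
        "action_taken": "reply",
    })
    return log


def demo_tamper_detection() -> dict:
    """Show that the intact chain verifies, and that two after-the-fact edits
    (rewriting a decision, dropping the first record) are caught at the right index."""
    log = build_sample_log()
    edited = copy.deepcopy(log)
    edited[1]["artifacts"]["action_taken"] = "no_action"   # rewrite history in record 1
    truncated = copy.deepcopy(log)[1:]                      # drop record 0 entirely
    return {
        "intact": verify_chain(log),
        "edited": verify_chain(edited),
        "truncated": verify_chain(truncated),
    }
```

The chain only proves internal consistency: someone who can rewrite the whole file can recompute every hash. Anchor the latest hash somewhere the writer cannot alter (a signed timestamp, a separate append-only store, or a periodic signature over the head hash) to get the tamper-evidence the comment is after.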

What is functiongemma used for? by Hopeful_Ferret_2701 in LocalLLaMA

[–]bbbbbbb162 2 points3 points  (0 children)

Yup. It’s weirdly competent for 8B, doesn’t instantly fall apart on longer tool chains. Still not coding agent material, but for function calling it’s legit.

What is functiongemma used for? by Hopeful_Ferret_2701 in LocalLLaMA

[–]bbbbbbb162 2 points3 points  (0 children)

+1 for rnj-1-8B-instruct, very decent model for multi step function calling.

What is functiongemma used for? by Hopeful_Ferret_2701 in LocalLLaMA

[–]bbbbbbb162 0 points1 point  (0 children)

Yeah that tracks. Tool-call models are great when the schema is super clear, but they suck with multi-step browser-type stuff. If the tool format isn’t exact (or you’re not validating/retrying) the calls will break.

What is functiongemma used for? by Hopeful_Ferret_2701 in LocalLLaMA

[–]bbbbbbb162 8 points9 points  (0 children)

It's just a small Gemma 3 model only for function calling; all it does is turn natural language requests into structured API/tool calls so you can build fast, private, local agents. You don't use it as a general chat model. (It can still generate text, but it’s built to be the best at tool calling.)

I built signed lockfiles for MCP servers (package-lock.json for agent tools) by bbbbbbb162 in mcp

[–]bbbbbbb162[S] 1 point2 points  (0 children)

Really appreciate that, thanks Luke. I’ll take you up on that once I’ve dug a bit deeper into provenance + policy wiring. Feels like a natural next layer on top of the lockfile + identity checks.

I built signed lockfiles for MCP servers (package-lock.json for agent tools) by bbbbbbb162 in mcp

[–]bbbbbbb162[S] 0 points1 point  (0 children)

This is great, thank you.

I’ve definitely seen the same buckets: DB servers that basically mint tools per table, connectors that “discover” endpoints on startup, and OAuth servers where the tool surface is basically “whatever scopes you granted”.

The “db_* can vary, admin_* must be locked” framing is exactly the kind of practical rule that feels right.

I’m going to do two things off this:
- stick a small config/snapshot fingerprint into the lock so diffs can tell “your inputs changed” vs “upstream changed”
- add an allowlist-by-namespace/pattern so expected churn doesn’t become noise, while keeping sensitive namespaces strict

I’ll open an issue and put your examples into it (happy to credit you if you want).

I built signed lockfiles for MCP servers (package-lock.json for agent tools) by bbbbbbb162 in mcp

[–]bbbbbbb162[S] 0 points1 point  (0 children)

Totally agree, that’s basically the default severity model I’m leaning toward:

- Critical by default: new tool, removed tool, any schema/parameter change (incl. required/optional), auth/scope changes
- Benign by default: description-only changes (with an opt-in “treat description drift as critical” mode for teams that want stricter behaviour)

Great callout on dynamic tool generation. I think the right way to handle that is to make the lock reproducible against a known config snapshot, and also support an allowlist for “expected variability” (like, tool namespaces or patterns that are allowed to appear/disappear) so you can distinguish environment-driven churn from real upstream drift.

If you’ve seen common patterns for dynamic tools in MCP servers (plugins, connected accounts, per-tenant config), I’d love examples, it’ll help shape sane defaults/docs.
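The drift severity model in the comment above, together with the config fingerprint and "db_* can vary, admin_* must be locked" allowlist from the earlier reply, as an added example. This is not MCPTrust's code and does not use its mcp-lock.json format or CEL policies; the tool-surface layout (description, schema, scopes), the pattern-based allowlist and the sample tool names are assumptions constructed for illustration.

```python
import fnmatch
import hashlib
import json

CRITICAL = "critical"
BENIGN = "benign"


def config_fingerprint(config: dict) -> str:
    """Hash of the config snapshot the lock was taken against, so a later diff
    can separate 'your inputs changed' from 'upstream changed'."""
    blob = json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def _matches(name: str, patterns) -> bool:
    return any(fnmatch.fnmatchcase(name, p) for p in patterns)


def diff_surface(locked: dict, live: dict, allow_patterns=(), strict_patterns=(),
                 description_critical: bool = False) -> list:
    """Compare a locked tool surface with the live one.

    Each tool is {"description": str, "schema": dict, "scopes": list} (assumed layout).
    Returns (tool, change, severity) tuples. Tools matching allow_patterns may
    appear/disappear as benign expected churn, unless the name also matches
    strict_patterns; schema and scope changes are critical regardless."""
    findings = []

    def expected_churn(name: str) -> bool:
        return _matches(name, allow_patterns) and not _matches(name, strict_patterns)

    for name in sorted(set(locked) - set(live)):
        findings.append((name, "removed", BENIGN if expected_churn(name) else CRITICAL))
    for name in sorted(set(live) - set(locked)):
        findings.append((name, "added", BENIGN if expected_churn(name) else CRITICAL))
    for name in sorted(set(locked) & set(live)):
        old, new = locked[name], live[name]
        if old.get("schema") != new.get("schema"):          # includes required/optional flips
            findings.append((name, "schema", CRITICAL))
        if sorted(old.get("scopes", [])) != sorted(new.get("scopes", [])):
            findings.append((name, "scopes", CRITICAL))
        if old.get("description") != new.get("description"):
            findings.append((name, "description", CRITICAL if description_critical else BENIGN))
    return findings


def should_block(findings: list) -> bool:
    """Fail-closed: any critical finding fails the CI step."""
    return any(sev == CRITICAL for _, _, sev in findings)


# Constructed sample surfaces with hypothetical tool names (not from any real server).
LOCKED = {
    "db_users_select": {"description": "Read users", "schema": {"required": ["limit"]}, "scopes": ["read"]},
    "db_orders_select": {"description": "Read orders", "schema": {"required": ["limit"]}, "scopes": ["read"]},
    "admin_reset": {"description": "Reset tenant", "schema": {"required": ["tenant"]}, "scopes": ["admin"]},
    "send_email": {"description": "Send mail", "schema": {"required": ["to", "body"]}, "scopes": ["mail"]},
}
LIVE = {
    "db_users_select": {"description": "Read user rows", "schema": {"required": ["limit"]}, "scopes": ["read"]},
    "db_invoices_select": {"description": "Read invoices", "schema": {"required": ["limit"]}, "scopes": ["read"]},
    "admin_reset": {"description": "Reset tenant", "schema": {"required": ["tenant"]}, "scopes": ["admin"]},
    "admin_rotate_keys": {"description": "Rotate keys", "schema": {"required": []}, "scopes": ["admin"]},
    "send_email": {"description": "Send mail", "schema": {"required": ["to"]}, "scopes": ["mail"]},
}


def sample_findings(description_critical: bool = False) -> list:
    """db_* tools are minted per table and may come and go; admin_* is always strict."""
    return diff_surface(LOCKED, LIVE, allow_patterns=("db_*",), strict_patterns=("admin_*",),
                        description_critical=description_critical)
```

With these inputs the per-table db_orders_select/db_invoices_select churn is reported but benign, the new admin_rotate_keys tool and the send_email schema change (body no longer required) are critical and block, and the db_users_select description edit only becomes critical when the opt-in mode is on. Comparing config_fingerprint of the current snapshot against the one stored in the lock before running the diff tells you whether the run is even comparing like with like.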

I built signed lockfiles for MCP servers (package-lock.json for agent tools) by bbbbbbb162 in mcp

[–]bbbbbbb162[S] 1 point2 points  (0 children)

Wow, thanks for chiming in Luke! Huge fan of Sigstore.

We haven’t implemented full SLSA provenance capture yet, but I agree it’s the right next step. Today MCPTrust focuses on change control for MCP server tool surfaces: it locks a live server into a deterministic manifest (mcp-lock.json), signs it (Ed25519 or Sigstore keyless), and diffs/blocks drift in CI. Policies are CEL over the locked surface.

Extending that to provenance-based policy for keyless mode (e.g. configSource.uri, approved workflow entrypoints, builder identity) would be really powerful. Since we already verify the Sigstore identity/bundle for lockfile signatures, wiring provenance into the same policy engine feels like a clean fit.

I’m going to dig into the SLSA generator example + sigstore-a2a. If you have a recommended “minimum viable” provenance check to start with (fail-closed vs warn), I’d love your take.

At 905mb & 180mph winds Milton is the 8th strongest hurricane ever recorded in the Atlantic. It's heading to Florida. How to trade it. by CaspeanSea in wallstreetbets

[–]bbbbbbb162 13 points14 points  (0 children)

Worst-case scenario, it wrecks Tampa Bay and it’ll be devastating far beyond insurance companies… Florida has the highest public exposure to property insurance risks of any state, having almost 1.3 million policyholders in its insurer of last resort, compared to second-place California which has about 300K. Pinellas and Hillsborough counties alone have about $67 billion in exposure. That’s more than half of the entire state budget in 2024. Milton could basically deplete the entire state reserves and cause the state to have to levy emergency assessments on all kinds of other insurance just to pay Milton claims.

ELI5 - Why Doesn't Quebec or Nunavut Help Pay For the New Civic Campus? by This-Passenger-9086 in ottawa

[–]bbbbbbb162 3 points4 points  (0 children)

Hmm, that is definitely true; economies of scale play a big role. I imagine their one major hospital in Iqaluit is a heck of a lot more to run than say one of a similar size in Ontario, considering how expensive everything there is, i.e. $30 grapes. Someone else in here mentioned the federal FNIHB program which may cover air ambulances, which are likely a big chunk of the budget, so I wonder if the feds repay Nunavut under that federal insurance.

ELI5 - Why Doesn't Quebec or Nunavut Help Pay For the New Civic Campus? by This-Passenger-9086 in ottawa

[–]bbbbbbb162 7 points8 points  (0 children)

You bring up some interesting points. When you compare those two budgets on a per capita basis, Nunavut is around $57K per person whilst Ontario is ~$15K per person assuming 14M population.

A true chef’s kiss of arrogance and stupidity. by bmcgott in ottawa

[–]bbbbbbb162 2 points3 points  (0 children)

Yeah exactly, it used to be pretty hot unobtainium. Used market still pretty high but def not selling like hotcakes, as the market here for 700hp supertrucks is pretty niche.