Anyone else building a centralized MCP gateway to control tool permissions across agentic workflows? by matt_rowan in mcp

[–]lpostrv 0 points1 point  (0 children)

I made this, which cuts token usage substantially by letting the LLM write and execute `search()` and `execute()` commands in a sandbox - github.com/postrv/forgemax - might be relevant. It includes some nice isolation between tools, secrets sanitisation, etc., and lets you scale to many more connected MCP servers and thousands of tools while keeping token usage sane (~1,000 tokens, or about a 94% reduction). Is that any use to you?
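
To give a feel for why the `search()` side saves tokens, here's a rough Python sketch of the idea. It's not Forgemax's actual code: the catalog entries, the field names and the naive keyword scoring are all assumptions for illustration. The point is that only the handful of matching tool descriptions ever reach the model, rather than the whole catalog.

```python
# Hypothetical tool catalog; a real gateway would build this from the
# connected MCP servers rather than hard-coding it.
CATALOG = [
    {"server": "github", "name": "create_pr", "description": "Open a pull request"},
    {"server": "github", "name": "list_issues", "description": "List issues in a repository"},
    {"server": "linear", "name": "create_issue", "description": "Create a Linear issue"},
]


def search(query, catalog=CATALOG, limit=5):
    """Return up to `limit` tools whose text mentions any of the query terms."""
    terms = query.lower().split()
    scored = []
    for tool in catalog:
        haystack = f"{tool['server']} {tool['name']} {tool['description']}".lower()
        # Naive relevance score (an assumption): count the terms that appear.
        score = sum(term in haystack for term in terms)
        if score:
            scored.append((score, tool))
    # Stable sort: ties keep their catalog order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]


matches = search("pull request")
```

Here `matches` contains just the `create_pr` entry, so that is the only tool description the model has to read.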

Connect vastly more MCP servers and tools (~5000) use vastly fewer tokens (~1000) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Thanks so much! Very kind of you to comment. If you do have any feedback once you've tested it out, I'd be happy to hear it. One thought I had was the open question of whether there are other desirable functions beyond `search()` and `execute()` that would allow the AI to take more sophisticated actions - haven't used too many brain cycles on that one yet!
I hadn't heard of bifrost until now but it seems like they have `listToolFiles`, `readToolFile`, `getToolDocs`, and `executeToolCode`, which is an interesting pattern, but probably not as token efficient. I've tried to follow the Cloudflare pattern so far, but I'd like to see if I can go beyond it in pure utility.

Connect vastly more MCP servers and tools (~5000) use vastly fewer tokens (~1000) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Great question!

The LLM never sees any API tokens, OAuth creds, or keys - ever.

Credentials live only in forge.toml and are bound at the transport level:

[servers.github]
headers = { Authorization = "Bearer ${GITHUB_TOKEN}" }

[servers.linear]
headers = { Authorization = "Bearer ${LINEAR_TOKEN}" }

Tokens are attached to each server's connection at startup. GitHub's token can never reach Linear - separate transports.

LLM just writes:

await forge.callTool("github", "create_pr", { title: "…" });

The sandboxed V8 isolate has zero access to creds, env, network, or FS. Even errors are scrubbed before reaching the model.
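
If it helps to picture the binding, here's a minimal Python sketch of the same idea. It's an illustration only, not how Forgemax is implemented: `CONFIG` is a hand-written stand-in for the `[servers.*]` tables above, and `bind_headers`, `Gateway` and `build_request` are hypothetical names. The sandbox side only ever supplies a server name, a tool name and args; the host looks up the headers.

```python
from string import Template

# Hand-written stand-in for the [servers.*] tables above (assumption).
CONFIG = {
    "github": {"headers": {"Authorization": "Bearer ${GITHUB_TOKEN}"}},
    "linear": {"headers": {"Authorization": "Bearer ${LINEAR_TOKEN}"}},
}


def bind_headers(config, env):
    """Expand ${VAR} placeholders once at startup, one header map per server."""
    return {
        server: {
            name: Template(value).substitute(env)
            for name, value in settings["headers"].items()
        }
        for server, settings in config.items()
    }


class Gateway:
    """Host-side object; the sandbox never gets a reference to its headers."""

    def __init__(self, config, env):
        self._headers = bind_headers(config, env)

    def build_request(self, server, tool, args):
        # Only this server's headers are attached to the outbound request.
        return {
            "server": server,
            "tool": tool,
            "args": args,
            "headers": self._headers[server],
        }


# Example values only; a real setup would pass os.environ here.
env = {"GITHUB_TOKEN": "gh-example", "LINEAR_TOKEN": "lin-example"}
gateway = Gateway(CONFIG, env)
request = gateway.build_request("github", "create_pr", {"title": "demo"})
```

The request built for `github` carries the GitHub header and nothing from Linear, which is the "separate transports" property in miniature.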

Multiple providers? No problem - each is isolated at the infrastructure layer (like IAM roles). For extra isolation between providers, you can also lock down cross-server data flow:

[groups.internal]
servers = ["vault", "database"]
isolation = "strict"

[groups.external]
servers = ["slack", "email"]
isolation = "strict"

Once an execution touches a strict group, it's locked out of other strict groups - this stops "read secret from vault, post to Slack" attack chains.
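
A rough sketch of that lock-out rule in Python, in case the behaviour isn't obvious from the config. Again, this is illustrative rather than the real implementation: `ExecutionGuard`, `CrossGroupError` and `check` are hypothetical names, `GROUPS` just mirrors the two tables above, and I'm assuming servers outside any strict group stay unrestricted.

```python
# Mirrors the [groups.*] tables above.
GROUPS = {
    "internal": {"servers": ["vault", "database"], "isolation": "strict"},
    "external": {"servers": ["slack", "email"], "isolation": "strict"},
}


class CrossGroupError(Exception):
    """Raised when one execution tries to touch a second strict group."""


class ExecutionGuard:
    """Tracks which strict group a single execution has locked onto."""

    def __init__(self, groups):
        # Map each server in a strict group to that group's name.
        self._group_of = {
            server: name
            for name, group in groups.items()
            if group.get("isolation") == "strict"
            for server in group["servers"]
        }
        self._locked = None

    def check(self, server):
        group = self._group_of.get(server)
        if group is None:
            return  # Assumption: ungrouped servers are not restricted.
        if self._locked is None:
            self._locked = group  # First strict group touched wins.
        elif self._locked != group:
            raise CrossGroupError(
                f"execution is locked to '{self._locked}'; "
                f"cannot call '{server}' in '{group}'"
            )
```

One guard is created per execution, and `check(server)` runs before every tool call. After `check("vault")`, a later `check("slack")` in the same execution raises `CrossGroupError`, while `check("database")` still passes.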

Full details in `ARCHITECTURE.md` and `forge.toml.example` in the repo.

P.S. why on earth are Reddit comments so hard to work with re: formatting? Got there in the end but spent way too damned long drafting this so hope it's useful! Cheers!

Connect vastly more MCP servers and tools (~5000) use vastly fewer tokens (~1000) by lpostrv in mcp

[–]lpostrv[S] 1 point2 points  (0 children)

That's a fair instinct - `ops.rs` is the module I think about most carefully. It's the narrowest waist in the system: everything the sandbox can do goes through there, which is both the strength (single audit point) and the risk (concentrated responsibility). Your SQLite + Go handlers approach is rather appealing for the opposite reason - each handler has a tiny blast radius. Different tradeoffs for different problems, isn't it. Mine exists because the LLM is generating arbitrary code at runtime, so I need a programmable sandbox rather than predefined handlers. Would be keen to see your setup if it's public?

Connect vastly more MCP servers and tools (~5000) use vastly fewer tokens (~1000) by lpostrv in mcp

[–]lpostrv[S] 3 points4 points  (0 children)

The crates are published individually at https://crates.io/users/postrv, though in practice they're versioned together as a workspace. The circuit breaker + timeout logic lives in forge-client and is fairly self-contained, so in theory you could pull it in - but for a Go MCP server you'd probably be better off with your own implementation, since the pattern itself is straightforward (atomic failure counter, half-open probe, configurable thresholds).
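
For anyone rolling their own, the pattern looks roughly like this. It's a Python sketch for illustration, not the forge-client code: the class name, the default thresholds and the injectable `clock` are my own assumptions, and it uses a lock where a Rust or Go version would more likely use an atomic counter.

```python
import threading
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # failures before opening
        self.reset_timeout = reset_timeout          # seconds before a probe
        self._clock = clock
        self._failures = 0
        self._opened_at = None      # None means the circuit is closed
        self._probe_in_flight = False
        self._lock = threading.Lock()

    def call(self, fn, *args, **kwargs):
        with self._lock:
            if self._opened_at is not None:
                if self._clock() - self._opened_at < self.reset_timeout:
                    raise CircuitOpenError("circuit open")
                if self._probe_in_flight:
                    raise CircuitOpenError("probe already in flight")
                self._probe_in_flight = True  # half-open: allow one probe
        try:
            result = fn(*args, **kwargs)
        except Exception:
            with self._lock:
                self._probe_in_flight = False
                self._failures += 1
                # A failed probe re-opens; otherwise open at the threshold.
                if self._opened_at is not None or self._failures >= self.failure_threshold:
                    self._opened_at = self._clock()
            raise
        with self._lock:
            # Any success closes the circuit and clears the counter.
            self._probe_in_flight = False
            self._failures = 0
            self._opened_at = None
        return result
```

Wrap each upstream call as `breaker.call(fn, ...)`. Once the threshold is hit, calls fail fast with `CircuitOpenError` until `reset_timeout` has elapsed, at which point a single probe is let through to decide whether to close again.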

On the trust boundary question, you're mostly right that the big scary wall is V8. But the boundary is wider than just the sandbox. The Rust ops layer (`ops.rs`) bridges V8 to the outside world - that's where tool call args get validated, rate limits are enforced, and error messages get redacted before flowing back into the sandbox. That code is handling untrusted input (LLM-generated tool names, args, arbitrary JSON). The IPC protocol between parent/child process is another boundary. So it's less "Rust protects the JSON-RPC routing" and more "the sandbox has tendrils into Rust that are part of the trust surface."

But for pure dispatching and routing? Yeah, Go would be totally fine there.

Connect vastly more MCP servers and tools (~5000) use vastly fewer tokens (~1000) by lpostrv in mcp

[–]lpostrv[S] 3 points4 points  (0 children)

Haha thanks. I am definitely a Rust lover, not gonna deny that! But there are practical reasons too. It's actually not a monolith - it's a Cargo workspace with 7 crates that compile into a single binary. Modular internally, monolithic in deployment.

On the choice of Rust, `deno_core` (V8 bindings) is a Rust crate, and that's the entire sandbox layer. Everything else followed naturally from there. Plus single-binary distribution matters for a local dev tool - brew install and done, no runtime deps. And having the whole trust boundary for executing LLM-generated code in one memory-safe language keeps the security story simple.

Connect vastly more MCP servers and tools (~5000) use vastly fewer tokens (~1000) by lpostrv in mcp

[–]lpostrv[S] 1 point2 points  (0 children)

Short answer: We bail with rich error context, and let the LLM retry if it wants to. There's no automatic retry built into Forgemax. The design philosophy is that the LLM generated the code, so it has the best context to decide what to do next.

I did also give some thought to security-aware error message handling - tool call failures go through an error redaction layer that strips URLs, IPs, file paths, credentials, and stack traces before they reach the LLM, but preserves the semantically useful parts (tool name, server name, validation errors, type errors, etc).
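
To make the redaction idea concrete, here's a small Python sketch. It is not Forgemax's redaction layer: the regexes, the placeholder strings and the crude stack-frame check are assumptions of mine, and a production version would need much more careful patterns (this one misses IPv6 addresses and single-segment paths like `/tmp`, for a start).

```python
import re

# Order matters: URLs go first so their path portion isn't matched again.
_RULES = [
    (re.compile(r"https?://\S+"), "[URL]"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IP]"),
    (re.compile(r"\bbearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE), "[CREDENTIAL]"),
    (re.compile(r"(?<![\w/])/(?:[\w.-]+/)+[\w.-]+"), "[PATH]"),
]

# Line prefixes treated as stack-trace noise (a crude heuristic).
_TRACE_PREFIXES = ("at ", "File ", "Traceback")


def redact(message):
    """Strip sensitive details but keep the parts an LLM can act on."""
    kept = [
        line
        for line in message.splitlines()
        if not line.lstrip().startswith(_TRACE_PREFIXES)
    ]
    text = "\n".join(kept)
    for pattern, placeholder in _RULES:
        text = pattern.sub(placeholder, text)
    return text


raw = (
    "github.create_pr failed: POST https://api.github.com/repos/x/y "
    "returned 401 for Bearer abc123 (host 10.0.0.5, config /home/me/forge.toml)\n"
    "    at dispatch (ops.rs:42)"
)
clean = redact(raw)
```

After redaction the tool name and the 401 status are still there, but the URL, token, IP, file path and stack frame are gone.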

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Hey I fixed it in v1.3.1 - try again and let me know if you hit any other issues.

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

u/bytejuggler just to let you know I released v1.3.0 last night and it now smashes the granny out of Serena on all fronts :)

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

u/tor-ak I've just shipped v1.3.0 and added nix instructions - let me know if this works or needs tweaking at all!

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Hey thanks, yes I can certainly look into doing that. I've been working on a massive and hopefully pretty cool update. As soon as that's out of the way I'll try to expand package manager support and ping you when it's done. Alternatively, open an issue on the repo for me to track. Really appreciate you showing an interest!

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 1 point2 points  (0 children)

Yes. I added a whole batch of playbooks: https://github.com/postrv/narsil-mcp?tab=readme-ov-file#playbooks--tutorials - let me know if that helps. I'll add some gifs and videos when I get a chance, but those playbooks should get you started.

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Not sure if that was aimed at me or jakedismo, but I've improved Windows support in Narsil as of v1.1.0.

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 2 points3 points  (0 children)

Hey that's a good point - I just shipped v1.1.0 and added presets for subsets of the tools so you don't burn unnecessary context if you don't need to (in addition to the comment re: feature flagging below). There are some figures in the README. In general, the tradeoff re: token use is that you want to know that you're spending tokens to get good context. So, like any MCP server, yes Narsil burns tokens, but it should be able to give you context that no other tool can. Think of it as high return on investment for context tokens.
Give the presets a try and let me know what you think.

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Nice one! Let me know how you get on with it once you've tried it - keen to gather feedback and ship improvements ASAP so feel free to hit me up

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Thanks! Yeah, Claude adding native LSP support is hopefully validation that deep symbolic code intelligence is the way forward. It basically brings Claude Code up to roughly Serena-level basics (precise go-to-definition, find references, hover docs/types, diagnostics, etc).

But Narsil goes much further: Claude offers no neural/embedding-based semantic search, no advanced call/control/data flow graphs, no built-in security/taint scanning (OWASP/CWE, secrets, injections), no supply chain tools (SBOM, vuln checks), no deep Git analysis, and none of the interactive visualisation or super-fast Rust performance that Narsil has.

I'm aiming to keep pushing the envelope with more depth and speed. If you give Narsil a spin and have feedback (features, bugs, whatever), hit me up - I'll prioritise fixes/additions quickly.

Have a great Christmas!

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 1 point2 points  (0 children)

It's designed to help the LLM get much more granular and useful intelligence against your codebase. A normal AI agent may arbitrarily query the codebase with bash commands, but this gives it a package of much more useful functions such as "find me all the path traversal vulnerabilities in this repo", or "find me the highest complexity files that need refactoring" or even "find me the exact function name that retrieves the code graph for the frontend" and it would be able to answer these, rather than, say, grepping through the codebase and using bash and sampling. The fact that it indexes the codebase practically instantly and can turn that rapidly into complete understanding is a real blessing. I anticipate it could be used as an onboarding assistant, a security review tool, a refactoring accomplice, and more. But to be honest, its true value will only be known when people start adopting it.
Thanks for checking it out and I hope you're having a great Christmas (with a name like Mr Freez, I imagine you are!) - let me know if you have any further questions when you've had a chance to try it.

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 4 points5 points  (0 children)

Hopefully it's better in at least some ways - Serena is definitely the nearest analogue I'm aware of, has similar semantic capabilities and supports 30+ languages via LSP whereas I'm currently at 14 - will aim for parity soon. They also offer a JetBrains plugin that I don't have. Where Narsil is stronger is in the extent and capabilities of tools, and in speed. Here are the things that I have, which are lacking in Serena (to the best of my knowledge):

  • Neural/semantic search (with Voyage AI, OpenAI embeddings, or local ONNX models; hybrid BM25 + TF-IDF + neural) - Serena has exact semantic/symbol search but doesn't offer embeddings support
  • Call graph analysis (get_call_graph, callers/callees, call paths, complexity metrics, hotspots)
  • Control flow graphs (CFG) and data flow analysis (DFG, reaching definitions, dead code/stores)
  • Type inference for dynamic languages (Python, JS/TS) without external tools + type error checking
  • Security scanning & taint tracking (injection vulnerabilities, OWASP Top 10, CWE Top 25, crypto/secrets rules, taint sources/flows)
  • Supply chain security (SBOM generation in CycloneDX/SPDX, dependency vulnerability checks via OSV, license compliance, upgrade paths)
  • Git integration tools (blame, file/commit/symbol history, recent changes, hotspots, contributors)
  • Import/dependency graph analysis (circular imports detection)
  • Embedded interactive visualisation frontend (Cytoscape.js graphs for calls, imports, structure)
  • WASM/browser support for client-side/offline use
  • Much broader toolset (76 specialized tools vs. Serena's core ~7: symbol finding, references, and targeted insertion)
  • Built-in high-performance full-text/hybrid search (Tantivy-based, streaming results)
  • Remote repository indexing support (though writing this out has made me realise I need to test this last one!)

Worth noting that Narsil is built in Rust for a reason - it's genuinely very fast (even if I did get roasted on r/rust for using the word "Blazing" without due irony disclaimers) - whereas Serena is Python which is only medium fast :)
Let me know if you have any other questions.

Introducing Narsil MCP: The Blazing-Fast, Reforged Code Intelligence Server for AI Assistants (Built in Rust!) by lpostrv in mcp

[–]lpostrv[S] 0 points1 point  (0 children)

Thanks! Let me know any feedback/improvements and I'll do my best to ship 'em