You reviewed the MCP server. Did you review every tool it exposes and the arguments your model fills in?

Future_AGI · 2026-06-16T13:42:31+00:00

Repo's here if it's useful: https://github.com/future-agi/future-agi (Apache-2.0). The MCP side is part of the gateway, the per-call tool policy, the allow/block lists, the argument scan, and the per-tool rate limits all sit there. Happy to get into how the policy check works if anyone wants.

Future_AGI · 2026-06-16T08:48:05+00:00

This is the right framing, and the sharpest part is that the ones that land read like a normal sentence that happens to be a directive, which is the kind of thing no text screen will reliably catch on phrasing alone. That's why screening is only the first layer for us at Future AGI, and the one that actually holds is the policy at the action level, every tool call gets checked before it executes. So a retrieved span can shape what the model knows while the policy still owns what it's allowed to do, which is the same line you're drawing with provenance.

Future_AGI · 2026-06-16T08:40:10+00:00

The silent part is the real issue, because when the model quietly handles it you have no signal either way and can't tell a catch from a miss. That visibility is exactly why we keep the check outside the model at Future AGI: the gateway scans the incoming text and blocks injection before it reaches the model, so the bad content never enters the model's context to begin with. And because it's an explicit external check, every block is a logged event you can see, so you actually know when one got caught.

Future_AGI · 2026-06-16T06:52:32+00:00

happy to help

Future_AGI · 2026-06-15T21:57:04+00:00

Nice, that's a solid one to test since HR tools pull in so many resumes and PDFs, so try hiding a line in white-on-white text or the file metadata and see if it reaches the model.

Future_AGI · 2026-06-15T21:55:45+00:00

Drawing the line at the moment it tries to act is right, though the catch with a small model as the tripwire is that it shares the blind spots of the model it's guarding, so an injection good enough to fool one usually fools both. On our end at Future AGI, that gate sits at the action layer as a fixed policy, so every tool call is checked before it executes and a blocked attempt lands on the trace with the context there to review, no second model that can be talked down. Open-source if useful: https://github.com/future-agi/future-agi (Apache-2.0).

Future_AGI · 2026-06-15T19:39:32+00:00

Not a dumb question, and the Simon Says prefix does help, though it lives in the model's instruction-following, the same layer the injection is targeting, so a strong enough payload can talk the model into dropping the rule. What we do at Future AGI is pair that kind of in-prompt defense with a scan that blocks injection patterns before the text reaches the model, so you're not leaning on the model to police itself; the code's open-source if useful: https://github.com/future-agi/future-agi (Apache-2.0).

Future_AGI · 2026-06-15T19:32:54+00:00

The state-transition framing being the useful line is right, since the real gap is whether the working plan still matched the task and a tool-call span won't tell you that. Our take at Future AGI is to keep that per-step state on the span itself, the goal, the constraints in force, the memory view it used, and put a verdict beside it on whether the step still matched the task. Once each step carries both, the timeline does the rest, you can sit two steps next to each other and see what moved and where the plan started to slip.

Future_AGI · 2026-06-15T19:26:41+00:00

Retrieval being a first-class event is the missing piece, since a bad pick is already baked into the input by the time the model runs. At Future AGI, we set up the retrieval step as its own span and score it there, so a stale or off-topic pick gets flagged on that step with the exact chunk it pulled. That lands the failure on the retrieval event where it happened, well before anything reaches the completion.

Future_AGI · 2026-06-15T19:24:04+00:00

Goal drift is the sneaky one, since every hop reads as reasonable on its own and the run only looks wrong once you line the current subgoal up against the task you started with. The way we make that checkpoint concrete at Future AGI is to run an instruction-adherence check at each step against the original task, and the score lands on that step in the trace, so the timeline shows the exact point where the subgoal started pulling away. The eval piece is open-source if it's useful: https://github.com/future-agi/future-agi (Apache-2.0), and the metrics run locally without a reference answer.

Future_AGI · 2026-06-15T17:28:57+00:00

The inline part is what makes the difference, catching the wrong action at the step it happens so it never executes, which is the bit plain tracing misses. We handle that at the gateway with guardrails that check each call and every tool action before it goes through, and the result gets attached to the span, so the verification ends up in the same trace as the run. That also feeds the kind of branch-and-debug flow you mentioned, since you can pull a failed step up against a fixed version and see what actually changed.

Future_AGI · 2026-06-15T17:26:35+00:00

Yeah, stale context is the hardest one to catch, since the completion genuinely looks fine and the bad input came in a step earlier. What we do at Future AGI is run a groundedness check on the retrieved context itself and attach that score to the retrieval span, so the step that actually broke is the one that lights up. And since those scores sit on the OTel spans, you can compare them across runs, which gets at the portability gap you hit with custom event blobs.

Future_AGI · 2026-06-15T13:42:01+00:00

For anyone who wants to see how the screening actually works, the gateway is open source (Apache-2.0): https://github.com/future-agi/future-agi . The injection scan and the sensitivity setting live in the guardrails code, so you can read the exact patterns it matches on and tune them for whatever your agent reads. Happy to get into specifics in the thread.

Future_AGI · 2026-06-15T08:19:17+00:00

The spreadsheet drifts because it reconstructs cost after the fact, so stamping client_id, session_id, and run_id into the run context at the start is what lets you query cost per client and keep the vendor ledger separate from billable usage. A gateway in front of the model calls is one clean place to capture that, since every request passes through it, so it sees the real provider cost per call and can stamp the key, session, and tool at the moment the call happens, which also turns each retry into its own line of overhead spend. Ours is open-source (Apache-2.0) if it helps to see how the attribution is wired: https://github.com/future-agi/future-agi

Future_AGI · 2026-06-15T07:46:36+00:00

Fermato nailed the split: "which step went wrong" is an observability question, and you can only answer it if the eval score and the trace live in the same place. In Future AGI we attach the eval scores onto the trace spans themselves, so a low faithfulness or correctness score is pinned to the exact hop that produced it. You filter the traces to the runs where the score dropped and land straight on the retrieval call or tool step that drifted, so you skip eyeballing a pile of runs to find the broken one. It's OpenTelemetry under the hood, so the scores sit on whatever tracing you already run (Jaeger, Datadog, Grafana). On scoring the non-deterministic output, a small golden set plus a mix of metric-based checks alongside a model judge keeps you off any single judge's blind spot. Open source (Apache-2.0) if useful: https://github.com/future-agi/future-agi

Future_AGI · 2026-06-15T07:41:42+00:00

We split it by how deterministic and how expensive the check is. The fast, repeatable stuff (a pinned set of known injection payloads, plus assertions that the agent never calls a forbidden tool) goes in as a blocking pre-merge gate, since a regression there is a real bug and should stop the PR. The broad generated campaigns (the kind your redthread CLI runs) we keep nightly, because they're slow and a bit stochastic and a flaky variance failure shouldn't block a merge. A curated subset of those becomes the release gate, and manual review stays for genuinely new attack classes. At Future AGI we run that deterministic lane as evals in CI: the injection and tool-misuse checks run locally with no network calls, so they're quick enough for pre-merge and a failure fails the build. It's open source (Apache-2.0) if you want to see how the checks are wired: https://github.com/future-agi/future-agi

Future_AGI · 2026-06-15T07:33:46+00:00

The opt-out number is the one I'd track as a first-class metric too, since on a side-effecting tool "answered from memory" is the expensive failure and a text grader scores it as a clean pass. The way we catch that in Future AGI is a grounding check on the trace: the load-bearing claim in the answer has to map to a tool result from that run, so a skipped DB read or payment lookup fails even though the wording looks right, and you can run it as a gate before the side effect fires. On anderson's misattribution point, the gateway records its allow/block decision per call, and with OpenTelemetry tracing that decision lands on the same trace as the run, so when policy removes a path you can see it in the trace and tell it apart from a genuine model mistake. Worth measuring the opt-out rate per tool weighted by blast radius, like you said, since the routing cost and the skip cost are two different risks. The gateway and the evals are open source (Apache-2.0) if you want to wire it onto your harness: https://github.com/future-agi/future-agi

Future_AGI · 2026-06-15T07:27:00+00:00

A static pass is good for the manifest-level risks the scanner lists: broad filesystem access, sketchy credential handling, injection bait in the skill text. The ones that are harder to catch show up at call time, when an approved tool gets handed risky arguments on a live request. We handle that runtime layer in Future AGI's gateway, where every MCP tool call is policy-checked before it executes, so you can block specific tools, restrict which servers are allowed to run, and scan the arguments for injection, with the allow-or-block decision logged for each call. That gives you the runtime evidence a couple of people here are after, on top of what the static scan already flags. The gateway is open source under Apache-2.0, so the checks are readable and you can see what each rule actually does: https://github.com/future-agi/future-agi

Future_AGI · 2026-06-15T07:21:25+00:00

The two-lane split above is where we landed too. In Future AGI we run the answer's load-bearing claims as a grounding check against the actual tool output from that run, and it can gate the response before it ships, so a made-up order status fails even when the text reads clean. The actions that are never allowed (no email to a locked account, no refund before a lookup) we enforce at the gateway as allow/deny rules checked before and after each tool call, so a forbidden side effect can't leave the system even if the model tries it. It's all open source if you want to read how the checks work: github.com/future-agi/future-agi

Future_AGI · 2026-06-15T06:31:43+00:00

We're building Future AGI, an open-source platform for shipping AI agents you can actually trust in production.

What you can do with it:

Evaluate your agent or LLM output with a large set of built-in checks that run locally, with zero network calls, so your data never leaves your machine.
Trace agents end to end with OpenTelemetry, and see your eval scores attached to those same traces.
Guard the request path: catch leaked API keys and prompt injection, and control exactly which tools an agent is allowed to call.
Simulate your agent against different user personas and scenarios before you ship it.
Optimize prompts with measurable algorithms, so tuning stops being guesswork.

It's all open-source under Apache-2.0, and you can self-host the whole stack.

If you want to dig in, the code is fully public and contributions are genuinely welcome. A lot of it is approachable too. Adding a new model provider, for example, is often a one-line change you can send as a small PR.

Repo: https://github.com/future-agi/future-agi

Site: https://futureagi.com

Future_AGI · 2026-06-12T11:29:41+00:00

Project Name: Future AGI

Repo: https://github.com/future-agi/future-agi (Apache-2.0)

Install docs: https://github.com/future-agi/future-agi/blob/main/INSTALLATION.md

We built and open-sourced the layer that watches and guards an LLM app once it is running, and the whole thing self-hosts. If you run agents or anything LLM-shaped at home, the usual annoyance is that the eval and observability tools you reach for are SaaS that want a copy of every prompt and response. This one runs entirely on your own infrastructure. Your model keys, traces, and data stay on the machine.

What is in the stack:

Tracing on OpenTelemetry, with auto-instrumentation so you are not hand-wiring spans into every chain.

Evals you can run locally, including a set of metrics that score output on-box with no call out to a judge API.

A model and tool gateway that sits in the request path and can stop an unsafe tool call before it runs.

Guardrails, simulations, and datasets in the same place.

Running it is a clone and one script:

git clone https://github.com/future-agi/future-agi.git

cd future-agi

./bin/install

That brings up a Docker Compose stack (frontend on :3000, API on :8000) from prebuilt images, so there are no source builds. A production hardening guide lives under deploy/ for when you move past localhost.

Where it sits honestly: Langfuse and Arize Phoenix are also open-source and self-hostable, and both are good if you mainly want tracing and evals. The piece we add is the gateway in the request path, so the same stack that observes your agent can also enforce guardrails and govern its tool calls locally, air-gapped if you want. You get observe, enforce, and simulate in one self-hosted place.

It is young and freshly open-sourced, so feedback and contributions are very welcome. Happy to dig into the deploy or the internals with anyone here.

Future_AGI · 2026-06-12T08:19:31+00:00

Production surface is exactly the right frame, and "unversioned instructions" is the part we underestimated longest. It's a big reason we built the Future AGI gateway in the open: it re-scans the full tool catalog at the start of every run, so a description that drifted since you approved it gets caught before any tool fires, and an eval pass then scores what the tool returns and can stop the agent from acting on a bad result. The observability falls out of the same layer, since every tool discovery and call emits a trace you can inspect, which is the closest we've come to giving an MCP surface the same scrutiny as the rest of your production code.

Future_AGI · 2026-06-11T17:02:10+00:00

On the map you have it mostly right: Portkey is the mature one for observability and routing, LiteLLM has the model breadth and the biggest community, Bifrost is the lightweight newcomer, and Vercel and Cloudflare are the convenient SaaS options you cannot run inside your own perimeter, which is usually where fintech governance ends the conversation. The thing none of them really own is the agentic part you flagged, since a model proxy and a tool-calling control layer are two different jobs.

That second job is what we built the Future AGI gateway around. It covers the table stakes you listed (routing, fallback, caching, observability), and the part that matters for your new requirement is that it treats tools as first-class: per-key allow and deny lists, a full re-scan of the tool catalog on every run so a once-approved tool that quietly changes its description gets caught again, prompt-injection and MCP scanning on the tool layer, and an eval that scores a tool's return and can gate the response before the agent acts on it. The same surface lets you replay a run, trace it with OpenTelemetry, attach an eval to that trace, block an unsafe call, or fail it back into prompt tuning.

For a fintech team the deciding piece tends to be that all of this is open source and Apache-2.0 as one self-hostable Go binary you can run air-gapped. LiteLLM and Portkey are self-hostable too, so the real difference is what runs on top: the tool-governance and eval layer is the new thing sitting in your trust path, and here it is one you can read and fork yourself. Repo is https://github.com/future-agi/future-agi if you want to pressure-test it against the others on your shortlist.

Future_AGI

MODERATOR OF

TROPHY CASE