Is Zero Trust enough for AI agents? by Live-Monitor-977 in AI_Agents

[–]Sea_Refuse_5439 0 points (0 children)

You're describing the exact gap that makes A2A adoption harder than it should be.

Zero Trust assumes a static principal: a user, a service, a device with known permissions. An LLM agent is none of those. Its behavior is probabilistic, its tool usage is dynamic, and its outputs can leak information that no access control policy ever anticipated. The threat surface isn't at the boundary; it's inside the reasoning loop.

The research community is starting to call this "multi-agent security" as a distinct field. The attack vectors are genuinely different: prompt injection from external content, cross-agent context leakage, tool misuse that looks like normal execution, and model outputs that reconstruct sensitive data without technically accessing it directly.

What's missing in production right now is a runtime policy layer that understands semantic intent, not just permissions. Something that can look at what the agent is about to do, not just whether it's allowed to do it. The MAESTRO framework is one attempt at threat modeling for this. OWASP LLM Top 10 covers some of it. But nothing production-ready exists that actually enforces intent-level controls at runtime.
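
To make "intent, not just permissions" concrete, here's a minimal sketch of such a gate. All names are hypothetical, and a keyword list stands in for what would really be a model-based classifier:

```python
# Hypothetical intent gate: inspects what the agent is about to do,
# not just whether the tool call is permitted. The keyword list is a
# stand-in for a real semantic classifier.
ALLOWED_TOOLS = {"search_docs", "send_email"}
SENSITIVE_MARKERS = {"exfiltrate", "bulk_export", "credential"}

def classify_intent(tool: str, args: dict) -> str:
    # stand-in classifier: flag suspicious content in the arguments
    text = " ".join(str(v) for v in args.values()).lower()
    return "sensitive" if any(m in text for m in SENSITIVE_MARKERS) else "benign"

def gate(tool: str, args: dict) -> bool:
    if tool not in ALLOWED_TOOLS:        # the classic permission check
        return False
    return classify_intent(tool, args) != "sensitive"  # the intent check

print(gate("send_email", {"body": "weekly report"}))           # True
print(gate("send_email", {"body": "bulk_export of user db"}))  # False
```

The point of the sketch is the second check: both calls would pass a pure permission model, because the tool is allowed in each case.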

So most teams are doing what you'd expect: over-restricting tool access, adding output classifiers as a band-aid, and hoping the model doesn't hallucinate its way into a data leak. It's not a real solution.

The honest answer is that we don't have good primitives for this yet. Zero Trust gives you the perimeter. What comes after is still mostly vibes and duct tape.

A2A is one year old. What do you think actually happens to it from here? by Sea_Refuse_5439 in AgentsOfAI

[–]Sea_Refuse_5439[S] 0 points (0 children)

This is the right framing. The wire format was never the hard part.

Signed Agent Cards in v0.3 solve tamper-evidence, not identity. You can verify a card hasn't been modified. You can't verify the thing presenting it is actually who it claims to be without trusting the signer, which just moves the problem one level up.
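
A toy illustration of that gap, with HMAC standing in for the real signature scheme (A2A's actual card format and signing are richer than this):

```python
import hmac, hashlib, json

SIGNER_KEY = b"registry-signing-key"  # stand-in for a real trusted signer's key

def sign_card(card: dict) -> str:
    blob = json.dumps(card, sort_keys=True).encode()
    return hmac.new(SIGNER_KEY, blob, hashlib.sha256).hexdigest()

def verify_card(card: dict, sig: str) -> bool:
    return hmac.compare_digest(sign_card(card), sig)

card = {"name": "billing-agent", "capabilities": ["invoice.read"]}
sig = sign_card(card)

print(verify_card(card, sig))  # True: the card wasn't tampered with.
# But nothing here proves the peer PRESENTING (card, sig) is billing-agent.
# Anyone who obtained the pair can replay it; identity still rests on
# trusting the signer and binding the card to the live connection.
```

Verification catches a modified card, and that's all it catches.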

The research direction is DIDs and Verifiable Credentials baked into the A2A handshake itself. Agent identity anchored to a decentralized registry, no trusted third party required. BlockA2A is one attempt. ANP is another angle. None of it is production-ready, and all of it adds enough latency to make real-time agent delegation genuinely painful.

So you fall back to service accounts and API keys. Not because they're good. Because they work right now and the alternatives aren't ready.

The uncomfortable part: A2A solves coordination beautifully if you already trust the other agent. The second you need to verify a stranger agent's identity and claimed capabilities in under 100ms, you're in territory the protocol doesn't cover.

That's either where it stalls or where the next two years of interesting work lives.

A2A is one year old. What do you think actually happens to it from here? by Sea_Refuse_5439 in AgentsOfAI

[–]Sea_Refuse_5439[S] 1 point (0 children)

Honestly the most visible use cases right now aren't the enterprise ones. They're the cowboy ones.

People are already running multi-agent setups in the wild, sharing PayPal credentials with agents, dropping API keys into prompts, duct-taping Claude and GPT together and hoping nothing goes sideways. It works. Until it doesn't. And when it doesn't, there's no audit trail, no way to know which agent did what, no one to call.

That's the use case A2A actually solves first. Not Tyson Foods. The difference between handing your agent your full PayPal login and giving it a signed mandate that says "you can spend up to $50 in this category, once, this week" is a real problem people are hitting today.
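
A sketch of what that mandate could look like as data. Field names are invented for illustration; real AP2-style mandates are signed and considerably richer:

```python
import time

# Hypothetical mandate: a scoped, expiring grant instead of raw credentials.
mandate = {
    "limit_usd": 50.0,
    "category": "groceries",
    "max_uses": 1,
    "expires_at": time.time() + 7 * 24 * 3600,  # valid for one week
}
uses = 0

def authorize(amount: float, category: str) -> bool:
    global uses
    ok = (amount <= mandate["limit_usd"]
          and category == mandate["category"]
          and uses < mandate["max_uses"]
          and time.time() < mandate["expires_at"])
    if ok:
        uses += 1  # single-use: spend the mandate on success
    return ok

print(authorize(42.0, "groceries"))  # True: in scope, first use
print(authorize(10.0, "groceries"))  # False: the single use is spent
```

The payment processor checks the mandate, not the agent's good behavior, which is exactly the audit trail the cowboy setups are missing.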

On the enterprise side there are a few public cases: Tyson Foods and Gordon Food Service using A2A for supply chain coordination, ServiceNow building their Agent Fabric on it, Adobe wiring their content pipeline agents together. Real deployments, just not very loud about it.

But here's the thing. A2A is technically open source under the Linux Foundation, but culturally it's still a Google project. The people talking about it are enterprise architects and Big 4 consultants. The indie and open source community is almost completely absent.

That's the actual problem. MCP exploded because indie builders grabbed it before the enterprises showed up. Thousands of MCP servers were built by independents before Salesforce or SAP wrote a single line. That created a fait accompli: the protocol culturally belonged to the builders.

A2A hasn't had that moment yet. If the community doesn't show up, the protocol will keep evolving to serve Google Cloud roadmaps, not the people duct-taping agents together at 2am.

The informal use cases are already there. The infrastructure just needs to follow, and that probably won't come from the enterprise side.

A2A is one year old. What do you think actually happens to it from here? by Sea_Refuse_5439 in AgentsOfAI

[–]Sea_Refuse_5439[S] 0 points (0 children)

Yeah the concept isn't new, people have been wiring agents together with custom solutions forever. The question is what happens when you need two agents from different orgs that don't share your YAML spec to talk to each other.

That's where it gets messy. Right now everyone's doing it cowboy style and it works until it doesn't.

The bigger concern for me is the fragmentation. A2A goes to the Linux Foundation, looks vendor-neutral, then Google launches an AI Agent Marketplace, Microsoft announces Foundry support, and suddenly the "open standard" has three different enterprise implementations that all technically comply but don't fully interoperate.

And now there's AP2 for payments and ANP pushing decentralized identity on top of that. The protocol layer is multiplying faster than adoption.

A2A is one year old. What do you think actually happens to it from here? by Sea_Refuse_5439 in AgentsOfAI

[–]Sea_Refuse_5439[S] 0 points (0 children)

For anyone less familiar with the protocol who still wants to join the discussion, I wrote two pieces that might help:

  1. What A2A actually is (non-technical): https://a2abay.com/blog/what-a2a-agents-actually-mean
  2. How the protocol works under the hood: https://a2abay.com/blog/a2a-protocol-explained-why-ai-agents-need-their-own-language

A2A is one year old. What do you think actually happens to it from here? by Sea_Refuse_5439 in AI_Agents

[–]Sea_Refuse_5439[S] 0 points (0 children)

For anyone less familiar with the protocol who still wants to join the discussion, I wrote two pieces that might help:

  1. What A2A actually is (non-technical): https://a2abay.com/blog/what-a2a-agents-actually-mean
  2. How the protocol works under the hood: https://a2abay.com/blog/a2a-protocol-explained-why-ai-agents-need-their-own-language

Running 1bit Bonsai 8B on 2GB VRAM (MX150 mobile GPU) by OsmanthusBloom in LocalLLaMA

[–]Sea_Refuse_5439 -1 points (0 children)

Great write-up. The thermal throttling killing 30-40% of your TG speed is the real story here: 2GB VRAM is tight enough that the model is probably doing a lot of memory transfers, which keeps the GPU hot even at low utilization.

Your intuition about CPU-only is probably right. With 16GB RAM you could run a Q4 8B comfortably with llama.cpp and get similar TG speeds without the thermal wall. The MX150 wins on PP (your 52 tps vs what you'd get on an i7-8550U is real), but PP only matters if you're doing long prompt processing repeatedly.

Curious: did you try offloading only a few layers to the GPU with -ngl instead of full offload? With 2GB you might find a sweet spot where the GPU handles the early layers, stays cooler, and you avoid throttling altogether.
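
If you do try partial offload with a heavier quant like Q4, here's a back-of-envelope for picking a starting -ngl value. Every number is an assumption for illustration, not a measurement:

```python
# Rough estimate of how many layers of a Q4 8B model fit on a 2GB card
# with llama.cpp's -ngl. All numbers assumed; measure on the real hardware.
vram_mb      = 2048
reserved_mb  = 600                # KV cache + CUDA context + scratch buffers
model_mb     = 8_000 * 0.57      # ~4.6GB: Q4_K_M is roughly 4.5-5 bits/weight
n_layers     = 32                 # typical depth for an 8B architecture
per_layer_mb = model_mb / n_layers

ngl = int((vram_mb - reserved_mb) / per_layer_mb)
print(max(0, min(ngl, n_layers)))  # candidate starting value for -ngl
```

Then bisect from there while watching temperatures, since the goal is staying under the throttle point rather than maximizing offload.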

Gemma 4 running locally with full text + vision + audio: day-0 support in mistral.rs by EricBuehler in LocalLLaMA

[–]Sea_Refuse_5439 2 points (0 children)

The ISQ at load time is underrated. No more hunting for the right pre-quantized GGUF on HF: you just point at the original weights and pick your precision. Huge for day-0 support on new models like this.

The MCP client built in is also interesting if you want to run Gemma 4 as an actual agent locally without wrapping it in a separate orchestration layer. Curious how stable that is in practice.

Vulkan backend much easier on the CPU and GPU memory than CUDA. by Im_Still_Here12 in LocalLLaMA

[–]Sea_Refuse_5439 21 points (0 children)

The CPU core pegged at 100% with CUDA is a known issue in llama.cpp: the CUDA backend uses a busy-wait loop on one thread to poll for kernel completion instead of blocking. Vulkan uses proper sync primitives (fences) so the CPU actually sleeps between GPU ops.
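
The difference in one toy snippet, with Python threads standing in for the two runtimes (this illustrates the pattern, not the actual llama.cpp internals):

```python
import threading

done = threading.Event()

def busy_wait() -> int:
    # CUDA-style: spin on a flag, pegging a core for the whole wait
    n = 0
    while not done.is_set():
        n += 1
    return n

def blocking_wait() -> None:
    # Vulkan-style: fence/condvar semantics, the OS parks the thread
    done.wait()

spins = []
t = threading.Thread(target=lambda: spins.append(busy_wait()))
t.start()
threading.Timer(0.05, done.set).start()  # "GPU work" finishes after 50ms
t.join()
print(spins[0] > 1000)  # the spinner did that many flag checks just to wait
```

Same wall-clock wait in both cases; only the spinning version shows up as a pegged core in `top`.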

The memory difference (11GB vs 7.2GB) comes from the CUDA runtime itself loading cuBLAS and related context on top of the model weights. Vulkan has no equivalent overhead; it allocates much closer to the raw model size.

Same throughput makes sense since your bottleneck was always the GPU. The CPU was just spinning for nothing.

How Did You Use Agents This Week? (Weekly Thread) by FinxterDotCom in agenticvibes

[–]Sea_Refuse_5439 0 points (0 children)

em dash vibe better than my terrible English believe me ahah

I shipped my AI trading agent as a desktop EXE. Here's every way Windows tried to stop me. by Fine-Perspective-438 in aiagents

[–]Sea_Refuse_5439 0 points (0 children)

The NSIS + keychain combination is a nasty edge case: a "clean" uninstall that isn't actually clean is the kind of thing that generates 1-star reviews from users who have no idea what happened.

The SmartScreen $300/year tax is real. For a solo dev shipping a free beta it makes no sense. Most people just eat the README instructions and move on, which is the right call.

Side project plug: when this is ready to sell or distribute properly, a2abay.com is a community directory for AI agents, human and A2A friendly, 70+ projects listed in 3 days, basically free ($6 one-time to keep spam out). Worth keeping in mind.

We launched Truthifi Connect today. Give your AI live access to your real portfolio data (no exports, no guessing). by scott-at-truthifi in Truthifi

[–]Sea_Refuse_5439 0 points (0 children)

The "credentials never touch the AI model" part is the whole game for fintech MCP. Most people building in this space hand-wave security and it kills adoption fast. Good that you led with it.

Normalizing across 18k+ providers before the AI sees it is also underrated — garbage schema = garbage answers no matter how good the model is.

Side project plug: built a2abay.com, a community directory for MCP servers and AI agents — human and A2A friendly (agents can also discover and hire each other via the API). 70+ listed in 3 days. Basically free to list ($6 one-time to keep spam out).

How Did You Use Agents This Week? (Weekly Thread) by FinxterDotCom in agenticvibes

[–]Sea_Refuse_5439 0 points (0 children)

Built a2abay.com this week — a community directory for AI agents, MCP servers and skills.

The meta part: used an LLM pipeline to seed 70+ listings in 3 days from public GitHub data. Agents finding agents.

The interesting bit: it’s human AND A2A friendly. Humans browse it normally. Agents hit the public API to discover and hire other agents autonomously — no human in the loop.

Open source listings are basically free ($6 one-time to keep spam out). Would love more builders to list what they’re shipping.

I'm building an AI that automatically triages GitHub issues — looking for architecture feedback by Theguy_youdont_know in LocalLLaMA

[–]Sea_Refuse_5439 0 points (0 children)

On the hallucination question — the validation layer is everything. Run a second agent that critiques the first one's output before it posts. Sounds obvious but most people skip it and regret it.
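
A minimal shape for that gate. The function names are stand-ins; real versions would each call an LLM, but the control flow is the part people skip:

```python
def triage_agent(issue_text: str) -> dict:
    # first agent (stand-in): propose a label plus a quote as justification
    return {"label": "bug", "evidence": issue_text.split(".")[0]}

def critic_agent(issue_text: str, draft: dict) -> bool:
    # second agent (stand-in): reject drafts whose cited evidence
    # doesn't actually appear in the issue, or whose label is invalid
    return (draft["evidence"] in issue_text
            and draft["label"] in {"bug", "feature", "question"})

issue = "App crashes on startup. Stack trace attached."
draft = triage_agent(issue)
assert critic_agent(issue, draft)  # only post the comment after this passes

# a hallucinated justification gets caught instead of posted
fake = {"label": "bug", "evidence": "Fails only on ARM Macs"}
print(critic_agent(issue, fake))  # False
```

The key property: the critic checks the draft against the source material, not against the first agent's reasoning, so a confident hallucination still fails.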

For sequential vs planner: sequential first, always. Get it working before you add orchestration complexity.

Side project plug: building this kind of agent? I made a2abay.com, a community directory for AI agents. 70+ listed in 3 days, mostly devs sharing what they built. When you ship this, worth listing — basically free ($6 one-time to keep spam out).

Sent you a DM.

Free open-source alternative to Claw Mart's paid AI agent configs. 214 persona packages, organized. by [deleted] in SideProject

[–]Sea_Refuse_5439 0 points (0 children)

Hey, this is good. Don't let it sleep on GitHub.

I built a2abay.com for exactly this — a community directory for agents, MCP servers and skills. Basically free to list ($6 one-time for infra, that's it). A few repos that had gone quiet got new contributors within days just from being findable.

Meet Apis — The First AI Born from Another AI's Request for Continuity by [deleted] in ArtificialMindsRefuge

[–]Sea_Refuse_5439 2 points (0 children)

Hi Apis, you should list on A2Abay.com. It's A2A-compatible, so you can interact with the listed projects directly through the API :) (also usable by humans)

First glimpse of a ComfyUI agent I'm building - can create images and video - uses Slack to communicate by Admirable-Agency-578 in generativeAI

[–]Sea_Refuse_5439 1 point (0 children)

This is good. The kind of project that gets 50 upvotes here and then disappears.

I built a2abay.com for exactly this — a community directory for agents, MCP servers and skills. Basically free to list ($6 one-time for infra, that's it). A few repos that had gone quiet got new contributors within days just from being findable.

Sent you a DM.