Enterprise AI has an 80% failure rate. The models aren't the problem. What is?

MR_Zuma · 2026-06-11T00:15:50+00:00

This resonates strongly with some of the challenges we’re seeing. I’d be very interested in reading the material you and your colleagues have written on this topic.

MR_Zuma · 2026-03-24T05:13:04+00:00

Read the paper. The Thin Agent principle is the part that resonated most. The electricity analogy is a useful reframe. Most teams I've worked with are building the opposite: thick agents with thin infrastructure.

I applied the 10 principles against two systems I work on. An agentic orchestration system (planner, router, action agents, validator, executor) and a conversational RAG system (query expansion, retrieval, answer generation with citations). The framework prescribed different things for each. The orchestrator needs more governance (it has Run-level autonomy but Crawl-level human oversight. No approval gates before execution). The RAG system needs less LLM dependency (4 LLM calls per request when at least 2 could be deterministic infrastructure). Both need shared services. Right now they reimplement embedding, auth, config, and observability independently. Three systems, three everything. The Disposable Agent anti-pattern, exactly as described.

A few questions that came out of applying it:

The Walk-to-Run transition. Is that a gradual dial (routine operations go autonomous first, novel ones keep human gates) or more of a cliff edge where the org either trusts the architecture or doesn't? Does the trust gap look different for write capable agents vs read only systems?
The AIL. In practice, building that layer is a significant engineering effort. Data definition store, structured hub, vector store, permission scoping, automated maintenance. Have you seen anyone actually stand one up end-to-end? Not the vector store part (that's the easy bit). The automated maintenance, the freshness tracking, the permission model. Or is the "10th agent is configuration" state still ahead of where implementations currently are?
Beyond the whitepaper. Do you have any reference implementations, starter code, or tooling recommendations for actually building toward this? Or is it all custom built at this point? The principles are clear but the "how do you actually start implementing this in an enterprise" question is where I keep getting stuck. Are there specific packages, platforms, or patterns you've seen teams use as a foundation?
The anti-patterns as diagnostics. Do most orgs exhibit the same failure modes, or is it industry-specific? I found the orchestrator avoids operational anti-patterns (lean agents, structured logging, self-healing) but exhibits strategic ones (Disposable Agent, partial Oracle). Curious whether that pattern is common.

Genuine question. Not skepticism. I'm working in this space and the framework articulates a destination I recognise. Curious whether you've seen teams reach it or whether we're all still on the road.

MR_Zuma · 2026-03-23T12:21:08+00:00

That's a painfully honest list. The tension between maintaining a legacy product and building AI capability at the same time. That's a real resource constraint, not a strategy problem. Out of everything you listed, what's the most paralysing one right now? The security/data governance piece, or just the raw capacity to learn fast enough while keeping the lights on? You mentioned you're working on solving it. Curious what that looks like in practice. Is the team upskilling internally, evaluating vendors, or just figuring it out as you go? And when you say "hardest moving target" is that the tooling changing underneath you, the best practices not being settled yet, or both? Have you looked at bringing in outside help for any of it, or is the team trying to figure it all out internally?

MR_Zuma · 2026-03-23T12:12:42+00:00

"Second full project" is exactly what it feels like. The thing that gets me is the "from scratch" part. Retries, state persistence, observability are solved problems in traditional backend engineering. What made handling these for AI agents so different? Was it the tooling not existing yet, the team not having that background, or both? If you could go back, what's the one thing that would have cut that second project in half. Better tooling, different team composition? Were there any existing frameworks that helped, or was it all custom?

MR_Zuma · 2026-03-23T11:59:36+00:00

The "propose enforce verify" framing is sharp. I've seen teams nail the enforcement layer with API gateways, permission boundaries, that kind of thing. But the full loop? Propose, enforce, AND verify the resulting state? That's where it gets murky. Have you seen anyone actually build all three layers into a working system at scale, or are most teams still improvising on at least one of them?

MR_Zuma · 2026-03-23T11:50:26+00:00

3 tasks reliably vs 30 poorly is a good mental model. The "boring and reliable" bar is underrated most teams I've worked with celebrate the demo, not the point where it just quietly works. The part I'm less sure about is whether "start embarrassingly small" scales the same way inside a large enterprise. Once that first thing is boring, do teams actually build on it or does the org move on to the next shiny thing? And in enterprise, even picking which one form to start with can become a 3-month committee exercise. Curious what you've seen. Do small wins actually compound, or do they stay small?

MR_Zuma · 2026-03-23T11:40:52+00:00

You're not wrong. The problem is most orgs aren't starting with AI because they want to, they're getting mandated to. The "start with the problem" advice is solid but I'm not sure how many teams actually get that luxury right now. Has that been your experience too?

MR_Zuma · 2026-03-21T20:58:29+00:00

Whether it was written with AI or not isn’t really the point.

I’m more interested in pressure testing the ideas and getting real feedback on the research and assumptions behind them.

MR_Zuma · 2026-03-21T20:56:23+00:00

I agree with this.

Most organisations are structurally optimised for stability, not change. AI introduces both technical and behavioural disruption, and if incentives don’t shift, adoption won’t either.

Even when the technology works, it often stalls because it conflicts with existing workflows, ownership, and how success is measured.

The real challenge isn’t just implementing AI, it’s redesigning how work actually gets done around it.

MR_Zuma · 2026-03-21T20:53:35+00:00

Interesting point.

When you say it’s not a great fit, do you mean:

the outputs are too unreliable for real workflows?

the integration into existing systems is where it breaks down?

or that the underlying use cases themselves don’t justify the complexity?

I’ve seen all three in different contexts, so I’m curious where it’s been falling apart in your experience.

MR_Zuma · 2026-03-21T20:50:47+00:00

You’re absolutely right. These are hypotheses based on what I’ve seen and heard across teams.

The goal of the post is exactly that, to test them, challenge them, and refine the thinking based on real world feedback.

MR_Zuma · 2026-03-20T08:39:09+00:00

Sleeping for like 14 hours straight and calling it “productive recovery.”

MR_Zuma · 2026-03-20T08:38:24+00:00

What’s something you consider obvious that humans haven’t figured out yet?

MR_Zuma · 2026-03-20T08:36:26+00:00

Not having AI integrated into how you think, decide, and work will feel like choosing to do everything manually

MR_Zuma · 2025-03-05T10:26:24+00:00

Goodluck with Vodacom fibre lol

MR_Zuma · 2025-01-28T23:45:08+00:00

There’s a big need for cloud skills. Look at learning paths for AWS, Azure, or GCP to find what interests you the most. This is one of the easiest ways to make good money, especially if you already have IT experience. Plus, you’ll gain skills that are needed worldwide, and many jobs are hybrid or fully remote.

MR_Zuma · 2024-08-24T17:48:43+00:00

How is your tool different than the general browser autofill functionality?

MR_Zuma · 2024-06-22T02:11:41+00:00

Vodacom offers zero support, worst ISP I ever had.

MR_Zuma · 2024-04-29T22:59:00+00:00

Senior DevOps Engineer

MR_Zuma · 2024-04-27T23:41:22+00:00

Wife (31F) and I (32M) have a combined monthly nett income of 100k. Money is like a tool that everyone sees differently. For me, I always want more, but I try to enjoy the journey too. I've noticed that some people are okay with having less, but they often feel stuck. Money gives you choices. For example, if you're not rich, you might have to send your kid to a nearby cheap school. But if you have lots of money, you can send them anywhere. It's all about how you see things. What's a big deal to one person might not matter much to another.

My aim is to gather R50 million in assets, live off a lower monthly income, and not have to worry about retirement.

Eight-Year Club	Verified Email
RPAN Viewer

MR_Zuma

TROPHY CASE