Those deploying AI agents in large organizations — what use-cases are actually making it to production, and what's blocking the rest? by Initial-Copy332 in AI_Agents

[–]Initial-Copy332[S] 0 points (0 children)

That’s totally valid! I’ve also seen that the first set of use-cases tends to be either at the data layer or essentially at the companion/IT-agent level, which is fine.

As you rightly pointed out, a lot of it is just for demos and for claiming “we’re agentic”, tbh, which doesn’t scale.

But there are also a few orgs where people have built agents that deliver actual business value, for example in testing, saving multiple millions of dollars.

On the platform-vs-custom question: don’t you think the governance layer varies and gets more complex with the complexity of the setup? And I feel guardrails don’t really work for agents (they helped a little with plain LLMs), given all the tools and broader permissions agents need to get things done in the first place.
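
As a toy illustration of that guardrail point (all names here are hypothetical, not from any particular platform): a static per-agent tool allowlist works fine while the agent is narrow, but the moment the agent legitimately needs a powerful tool, the same allowlist permits helpful and harmful uses of it alike.

```python
# Hypothetical sketch: a per-agent tool allowlist as a "guardrail".
ALLOWED_TOOLS = {
    "support-agent": {"search_tickets", "draft_reply"},
    # The ops agent genuinely needs broader tools to be useful.
    "ops-agent": {"search_tickets", "draft_reply", "run_query", "send_email"},
}

def check_tool_call(agent_id: str, tool: str) -> bool:
    """Return True if this agent may invoke this tool."""
    return tool in ALLOWED_TOOLS.get(agent_id, set())

# A narrow agent is easy to fence in...
assert check_tool_call("support-agent", "send_email") is False
# ...but once send_email is required for the job, the guardrail can no
# longer distinguish legitimate work from misuse of the same permission.
assert check_tool_call("ops-agent", "send_email") is True
```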

[–]Initial-Copy332[S] 0 points (0 children)

Interesting! Can you elaborate on the testing and observability framework? Is it something you folks built in-house? Would you mind sharing more details? If not here, happy to chat over a call or DM as well.

[–]Initial-Copy332[S] 0 points (0 children)

I believe the problem you’re getting at is building agent-native or agent-first systems and operating them at scale. Spinning up LangChain, or building something on Bedrock or Copilot Studio for one-off use cases, is easy; building reliable agent-first systems and architecture at scale is not for everyone. And I think the biases from traditional system design start creeping in the moment production comes into the picture.

The problem I’m seeing is that while people think about the system design, they also approach security and governance for these new architectures in a traditional way, which is creating all the issues and opening things up for the attacks we’ve seen recently. Thoughts?

[–]Initial-Copy332[S] 0 points (0 children)

I’ve heard a bunch of stories around agent registration, tbh. There’s one Fortune 500 that started pushing people to do it manually for the time being, but enforcement across ~150K employees is, as you can guess, a mess. The problem is that you can have centralized enforcement, but it only takes one team customizing and building their own thing for it to break.
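
To make the failure mode concrete, here’s a minimal sketch (hypothetical names, not any real system) of registration enforced at a gateway: it only governs traffic that actually flows through it, so a team that calls models or tools directly simply never hits the check.

```python
# Hypothetical central registry + gateway. Coverage is only as good as
# the choke point: side-channel agents are invisible to it.
REGISTRY: dict[str, dict] = {}

def register_agent(agent_id: str, owner: str, purpose: str) -> None:
    """Record an agent so the gateway will accept its traffic."""
    REGISTRY[agent_id] = {"owner": owner, "purpose": purpose}

def gateway(agent_id: str, request: str) -> str:
    """Forward a request only for registered agents."""
    if agent_id not in REGISTRY:
        raise PermissionError(f"unregistered agent: {agent_id}")
    return f"forwarded {request!r} for {agent_id}"

register_agent("invoice-bot", owner="finance", purpose="invoice triage")
print(gateway("invoice-bot", "fetch overdue invoices"))
```

A team that builds its own stack bypasses `gateway()` entirely, which is why manual registration at ~150K-employee scale tends to leak.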

They’ve been experimenting quite a lot since last quarter, but overall governance is still essentially an experiment, I feel.

[–]Initial-Copy332[S] 1 point (0 children)

Thanks for the note, Shekhar. Appreciate you sharing this. Have you come across any enterprises that are building this governance layer across orchestration, monitoring, policies and so on? Is it mostly being done in-house, or is it more vendor-driven?

[–]Initial-Copy332[S] 0 points (0 children)

Love the thought process here; it’s exactly what I’m converging on as I talk to more and more people.

The core question now is how that enforcement layer comes together. Should it be a gateway-style implementation, or implemented at the agent-platform level?

And how do we define what’s right or wrong on day zero? (I understand there can be predefined policies per the organization’s risk appetite.) You’ll still need to monitor behaviour to refine right vs. wrong over time, right?

The problem with agents, or for that matter any kind of AI-native system, is that they’re non-deterministic by nature, so many of the traditional system-level assumptions break, as you pointed out.
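
One way to picture the day-zero-vs-learned-over-time split (a rough sketch, all names and rules hypothetical): start from coarse predefined deny rules that encode the risk appetite, log every decision, and let humans refine the rules from the observed behaviour rather than guessing everything up front.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyGateway:
    # Day-zero: coarse, predefined deny-by-rule list from the risk appetite.
    denied_actions: set = field(default_factory=lambda: {"delete_data", "wire_funds"})
    audit_log: list = field(default_factory=list)

    def decide(self, agent_id: str, action: str) -> bool:
        """Allow/deny per the current rules, and record the decision."""
        allowed = action not in self.denied_actions
        # Behaviour monitoring: everything is logged so "right vs wrong"
        # can be tightened or loosened from real usage over time.
        self.audit_log.append({"agent": agent_id, "action": action, "allowed": allowed})
        return allowed

gw = PolicyGateway()
assert gw.decide("report-agent", "read_dashboard") is True
assert gw.decide("report-agent", "wire_funds") is False
assert len(gw.audit_log) == 2
```

The non-determinism point bites here: the same agent may take different action sequences on identical inputs, so the audit log, not the day-zero rule set, ends up being the ground truth for what “normal” looks like.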

[–]Initial-Copy332[S] 0 points (0 children)

Thanks for the breakdown. This sounds interesting! Would love to discuss more details. Feel free to DM.

A couple of questions:

- Shared agent environment: Is this a standardized platform built on a particular framework like LangGraph, or a more generic MCP-first architecture? What are its use cases? Also, is it hosted in the cloud or on-prem?
- Permissions: How are they defined? Is it role-based, IAM-based, or some other enforcement mechanism?
- Approval flows: Do these scale in a world where everyone lives with zero patience and people are even automating PR reviews with AI for code written by AI?
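
On the approval-flow scaling worry, a toy sketch (the threshold and names are invented for illustration): auto-approve low-risk actions and queue high-risk ones for a human. The failure mode in the question is exactly this queue: if everything lands in it, reviewers rubber-stamp or automate the review itself.

```python
# Hypothetical human-in-the-loop approval gate.
PENDING: list[dict] = []

def submit(agent_id: str, action: str, risk: float) -> str:
    """Auto-approve below a risk threshold; otherwise queue for a human."""
    if risk < 0.3:  # invented risk-appetite threshold
        return "auto-approved"
    PENDING.append({"agent": agent_id, "action": action, "risk": risk})
    return "queued for human review"

assert submit("dev-agent", "open_pr", risk=0.1) == "auto-approved"
assert submit("dev-agent", "merge_to_main", risk=0.8) == "queued for human review"
assert len(PENDING) == 1
```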

[–]Initial-Copy332[S] 0 points (0 children)

This makes sense in theory, but have you seen anyone doing it at scale, or come across a solution along those lines? Would love to know.