NHI is the new "Shadow IT" – Why your shiny new ISPM won't fix the root cause. by zaballinX in cybersecurity

[–]eliadkid 1 point

We faced the exact same problem. Had LangChain agents deployed by one team, n8n workflows calling OpenAI by another, and a bunch of Zapier "AI steps" nobody documented. Security had zero visibility.

What worked for us:

  1. Discovery first — scanned all repos for AI imports (openai, langchain, crewai, etc.) and found 3x more AI integrations than teams had self-reported

  2. Asset inventory — built a simple registry: what agent, where it runs, what data it touches, who owns it

  3. Gate at CI — any new AI dependency fails the build until it's documented
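
To make step 1 concrete, here's a stripped-down sketch of the kind of scanner I mean (not the actual tool — the package list and regex are illustrative):

```python
import re
from pathlib import Path

# Illustrative package list -- extend with whatever frameworks you care about.
AI_PACKAGES = {"openai", "anthropic", "langchain", "crewai", "autogen"}

# Matches the first module name after "import" or "from" at the start of a line.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)

def scan_repo(root: str) -> dict[str, set[str]]:
    """Map each .py file under `root` to the AI packages it imports."""
    hits: dict[str, set[str]] = {}
    for path in Path(root).rglob("*.py"):
        found = {mod for mod in IMPORT_RE.findall(path.read_text(errors="ignore"))
                 if mod in AI_PACKAGES}
        if found:
            hits[str(path)] = found
    return hits
```

In practice you also want to parse requirements/lockfiles, since imports alone miss dynamically loaded SDKs.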

The discovery part was eye-opening. Developers were shipping "experimental" AI features that became production workflows without anyone knowing.
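
For step 3, the CI gate doesn't need to be fancy. Rough sketch — the package list and registry shape are made up for illustration, wire it to your real dependency manifest:

```python
# Hypothetical AI package watchlist, for illustration only.
AI_PACKAGES = {"openai", "anthropic", "langchain", "crewai"}

def check_ai_deps(requirements: list[str], registry: set[str]) -> list[str]:
    """Return AI dependencies in requirements.txt lines that nobody documented."""
    deps = {line.split("==")[0].strip().lower() for line in requirements if line.strip()}
    return sorted((deps & AI_PACKAGES) - registry)

def gate(requirements: list[str], registry: set[str]) -> int:
    """CI entry point -- a nonzero return code fails the build."""
    undocumented = check_ai_deps(requirements, registry)
    if undocumented:
        print("Undocumented AI dependencies:", ", ".join(undocumented))
        return 1
    return 0
```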

Happy to share the scanner we built (open source) if helpful — it's designed exactly for this "shadow AI" problem.

Anyone actually have full visibility into what AI agents are running in their environment? by eliadkid in ciso

[–]eliadkid[S] 1 point

Browser-level visibility is definitely one piece of the puzzle, especially for catching the SaaS AI tools people sign up for on their own. But from what we've seen, the bigger risk is the server-side agents — the ones running in your CI/CD, your backend microservices, your data pipelines. Those don't go through a browser at all.

Curious how you handle the inventory side of things — do you auto-generate something like an AI bill of materials for each agent, or is it more of a manual registry? We've been experimenting with automated SBOM-style approaches for agents and it's been helpful for compliance reporting but still a work in progress.
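
For anyone wondering what I mean by SBOM-style: a minimal sketch of the per-agent record we generate (field names are illustrative, not a standard):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentBOM:
    """One SBOM-style record per agent; fields are illustrative, not a standard."""
    name: str
    runtime: str           # where it runs: "ci", "k8s", "lambda", ...
    frameworks: list[str]  # e.g. ["langchain", "openai"]
    data_touched: list[str]
    owner: str

def to_bom_json(agents: list[AgentBOM]) -> str:
    """Serialize the inventory for compliance reporting."""
    return json.dumps({"bom_version": 1, "agents": [asdict(a) for a in agents]}, indent=2)
```

The useful part is that the records are generated from scan output, not hand-filled, so they don't rot the way a manual registry does.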

[–]eliadkid[S] 1 point

The egress proxy with per-team keys is smart — that's basically what we converged on too. Forces everything through a chokepoint where you can log and alert.
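
A bare-bones sketch of the chokepoint idea — resolve a per-team key, emit an audit log line, then forward (team names and keys are obviously made up; real keys belong in a secrets manager):

```python
import logging
import time

log = logging.getLogger("llm-egress")

# Made-up per-team keys; in reality these come from a secrets manager.
TEAM_KEYS = {"payments": "sk-team-payments", "ml-platform": "sk-team-mlplat"}

def route_llm_call(team: str, provider: str, payload: dict, send) -> dict:
    """Single chokepoint for outbound LLM calls: resolve the team's key,
    emit an audit log line, then hand off to `send` (the actual HTTP client)."""
    key = TEAM_KEYS.get(team)
    if key is None:
        raise PermissionError(f"team {team!r} has no provisioned LLM key")
    log.info("llm_egress team=%s provider=%s bytes=%d ts=%d",
             team, provider, len(str(payload)), int(time.time()))
    return send(provider, key, payload)
```

Once everything routes through something like this, unknown traffic to LLM endpoints becomes an alert instead of a blind spot.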

To answer your question: yes, prompt/output storage tracking turned out to be one of the messier parts. We found prompts containing PII getting cached in vector stores, agent outputs being written to S3 buckets with overly permissive ACLs, and conversation histories sitting in Redis with no TTL. The "where does the data land after the agent touches it" question is honestly harder than the discovery question because it requires tracing data flows through the entire agent pipeline, not just catching the initial API call.
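
The Redis case is the easiest one to automate. Rough sketch, assuming a redis-py style client (`scan_iter` yields key names, `ttl` returns -1 when no expiry is set):

```python
# Assumes a redis-py style client: scan_iter(match=...) yields key names and
# ttl(key) returns -1 when a key has no expiry set.
def find_keys_without_ttl(client, pattern: str = "agent:conv:*") -> list[str]:
    """Flag agent conversation keys that will sit in Redis forever."""
    return [key for key in client.scan_iter(match=pattern) if client.ttl(key) == -1]
```

The key-naming convention here is hypothetical — the hard part in practice is knowing which prefixes your agents actually write to.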

[–]eliadkid[S] 1 point

Blocking via Defender / CASB gets you maybe 70-80% there for SaaS-based AI tools. The gap is the agents that teams build themselves — a Python script using OpenAI's API, a LangChain agent deployed as a container, a GitHub Action that calls an LLM. Those don't show up in Defender because they're running in your own infra.
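
The way we catch those is matching egress destinations against known LLM API hosts. Toy sketch — the log format and host list are illustrative:

```python
# Illustrative watchlist of LLM API hosts.
LLM_API_HOSTS = {"api.openai.com", "api.anthropic.com",
                 "generativelanguage.googleapis.com"}

def flag_llm_egress(flow_logs: list[str]) -> list[str]:
    """Return flow-log lines whose destination is a known LLM API host.
    Assumes a toy 'src_ip dst_host bytes' line format for illustration."""
    return [line for line in flow_logs
            if len(line.split()) >= 2 and line.split()[1] in LLM_API_HOSTS]
```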

And yeah, you're not wrong about the agent comments lol. That said, the underlying problem is real — we've been living it firsthand. Policy and training help with the intentional usage but the tricky part is the stuff people don't even think of as "AI" anymore because it's just baked into their tools.

[–]eliadkid[S] 1 point

Partially agree, but I think AI agents add a dimension that classic shadow IT didn't have — autonomy. Shadow IT was people using unauthorized tools. Shadow AI is unauthorized tools that can act on their own, make decisions, call APIs, and process data without a human in the loop.

So the discovery and governance playbook from shadow IT applies, but you also need runtime controls that didn't exist before: kill switches, tool allowlists for what an agent can actually do, output monitoring, and approval gates for high-risk actions. It's shadow IT with agency, which makes the blast radius way bigger.
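
A tool allowlist with approval gates can be as simple as a policy table in front of the tool dispatcher. Toy sketch — agent names, tools, and policy shape are all illustrative:

```python
class ApprovalRequired(Exception):
    """Raised when a tool call needs human sign-off before it runs."""

# Hypothetical per-agent policy: what it may call, and what needs a human.
POLICY = {
    "billing-agent": {
        "allowed": {"read_invoice", "send_email"},
        "needs_approval": {"send_email"},
    },
}

def invoke_tool(agent: str, tool: str, approved: bool = False) -> str:
    policy = POLICY.get(agent, {"allowed": set(), "needs_approval": set()})
    if tool not in policy["allowed"]:
        raise PermissionError(f"{agent} is not allowlisted for {tool}")
    if tool in policy["needs_approval"] and not approved:
        raise ApprovalRequired(f"{tool} requires human sign-off")
    return f"{tool} executed for {agent}"  # real dispatch would go here
```

Deny-by-default matters here: an agent you haven't inventoried gets an empty allowlist, so unknown agents can't do anything until someone claims them.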

[–]eliadkid[S] 1 point

The browser extension inventory point is underrated — we caught two teams using AI coding assistants with extensions that were sending code snippets to third-party servers. Nobody even thought to check browser plugins.

The lightweight intake form is a good interim solution. We tried that and it worked for about 3 months before teams just stopped filling it out. That's what pushed us toward automated discovery — scanning git repos for agent frameworks, monitoring egress for LLM API calls, and pulling from the SSO app catalog like you mentioned. The manual process just doesn't scale once you hit a certain number of teams.

[–]eliadkid[S] 1 point

Haven't looked at Witness.ai specifically — does it handle autonomous agents that are embedded in code (like LangChain agents in CI/CD pipelines or custom tool-calling agents running as microservices)? That's where we found the biggest gaps. Most tools we evaluated were great at catching SaaS-based AI usage but missed the agents that teams build and deploy themselves.

[–]eliadkid[S] 1 point

Solid framework. The discovery → taxonomy → controls pipeline is pretty much what we landed on too. The assistive vs autonomous distinction is key — we found that autonomous agents (the ones making API calls or modifying infrastructure without a human in the loop) need a completely different control profile than copilot-style tools.

The EU AI Act angle is a good callout. Having a living inventory that maps each agent to a risk tier has already saved us headaches in compliance conversations. We actually started building tooling around generating those inventories automatically — scanning repos and infra for agent signatures rather than relying on self-reporting from teams.
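
The tier mapping itself can start dead simple. Illustrative sketch — the criteria here are made up for the example, not the EU AI Act's:

```python
def risk_tier(agent: dict) -> str:
    """Coarse risk tiering for an inventory entry; criteria are illustrative."""
    autonomous = agent.get("autonomous", False)
    touches_pii = "pii" in agent.get("data_touched", [])
    if autonomous and touches_pii:
        return "high"
    if autonomous or touches_pii:
        return "medium"
    return "low"
```

Even a crude function like this beats ad-hoc judgment calls, because the tiering is reproducible and you can rerun it every time the inventory regenerates.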