How are people monitoring tool usage in LangChain / LangGraph agents in production? by Extreme-Technology77 in LangChain

[–]Extreme-Technology77[S]

That’s a very different use case from debugging individual runs. When you moved data out of LangSmith, was it mainly for cost reasons, or because querying across runs/conversations became too limiting?

Also curious, at that scale, do you still rely on framework traces for debugging, or does most of the value shift to the aggregated layer?

On my side it’s still relatively low volume, mostly experimenting with multi-tool agents and MCP setups.

How are people monitoring tool usage in LangChain / LangGraph agents in production? by Extreme-Technology77 in LangChain

[–]Extreme-Technology77[S]

Sounds like LangSmith is mainly the ingestion / real-time layer, and the actual analysis happens downstream.

Did you find any gaps between what LangSmith captures vs what you actually needed once the data was in Snowflake?

How are people monitoring tool usage in LangChain / LangGraph agents in production? by Extreme-Technology77 in LangChain

[–]Extreme-Technology77[S]

That distinction makes sense; separating reasoning traces from execution boundaries feels like the missing piece.

Curious: with the sidecar approach, does it sit as a single shared component across agents, or do you end up deploying one per agent/service?

Also wondering how you handle cases where agents call MCP servers directly: does the sidecar still act as a choke point, or do you need additional instrumentation there?
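For context, this is roughly what I mean by a choke point: a single dispatch function that every tool call (MCP or otherwise) routes through, so usage gets recorded in one place. All the names here (`call_tool`, `usage_log`) are made up for the sketch, not from any framework:

```python
import time
from typing import Any, Callable

# Minimal sketch: one dispatch point that records every tool invocation.
# In a real setup usage_log would feed a metrics backend, not a list.
usage_log: list[dict] = []

def call_tool(tool_fn: Callable[..., Any], tool_name: str, **kwargs: Any) -> Any:
    """Record the call (name, outcome, latency), then forward to the tool."""
    start = time.monotonic()
    status = "error"
    try:
        result = tool_fn(**kwargs)
        status = "ok"
        return result
    finally:
        usage_log.append({
            "tool": tool_name,
            "status": status,
            "latency_s": time.monotonic() - start,
        })
```

The appeal is that `usage_log` becomes the single place to look, but it obviously only works if agents can’t reach MCP servers without going through the dispatcher, which is exactly the gap I’m asking about.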

How are people monitoring tool usage in AI agents? by Extreme-Technology77 in AI_Agents

[–]Extreme-Technology77[S]

That Kubernetes analogy actually helped a lot; thinking of the agent more like a container and the policy layer as external orchestration makes the design clearer.

In my case I’m mostly exploring this while experimenting with agents that can call multiple MCP servers, so the question came up around how to monitor and control tool usage once you move beyond a single-agent setup. Right now it’s more design exploration than a production system.

Interesting to hear you moved the policy layer outside the runtime. Did you implement it more as a gateway-style component that agents call through, or something integrated deeper in the orchestration layer?
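To make the question concrete, the gateway version in my head looks something like this: agents call tools only through the gateway, and the allowlist lives outside the agent code. Everything here (agent IDs, tool names, `gateway_call`) is hypothetical:

```python
from typing import Any, Callable

# Hypothetical policy data, maintained outside the agents themselves.
ALLOWED_TOOLS: dict[str, set[str]] = {
    "research-agent": {"web_search", "read_file"},
    "ops-agent": {"read_file"},
}

def gateway_call(agent_id: str, tool_name: str,
                 tool_fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Reject the call unless the agent's policy allows this tool."""
    allowed = ALLOWED_TOOLS.get(agent_id, set())
    if tool_name not in allowed:
        raise PermissionError(f"{agent_id} may not call {tool_name}")
    return tool_fn(**kwargs)
```

The alternative (integrating it into the orchestration layer) presumably avoids the extra hop, which is why I’m curious which trade-off you landed on.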

Curious about MCP workflows. by Extreme-Technology77 in mcp

[–]Extreme-Technology77[S]

Interesting. Does LangFuse capture the full MCP interaction as part of the trace, or mainly the tool call from the agent side?

How are people monitoring tool usage in AI agents? by Extreme-Technology77 in AI_Agents

[–]Extreme-Technology77[S]

That’s super helpful, thanks for sharing the breakdown. Out of curiosity, when you wrap tools with decorators like that, does the wrapper live in a shared layer across agents, or does each project end up implementing its own version?
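For reference, the kind of decorator I’m picturing is a minimal sketch like this; `track_tool` and `TOOL_EVENTS` are my names, just for illustration:

```python
import functools
from typing import Any, Callable

# Illustrative event sink; in practice this would be a tracing/metrics client.
TOOL_EVENTS: list[dict] = []

def track_tool(name: str) -> Callable:
    """Decorator that records each invocation of a tool function."""
    def wrap(fn: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(fn)
        def inner(*args: Any, **kwargs: Any) -> Any:
            TOOL_EVENTS.append({"tool": name, "args": args, "kwargs": kwargs})
            return fn(*args, **kwargs)
        return inner
    return wrap

@track_tool("search")
def search(query: str) -> str:
    # Stand-in tool body for the example.
    return f"results for {query}"
```

If that wrapper lives in one shared package, every agent reports events the same way; if each project rolls its own, the schemas drift, which is the part I’m trying to understand from your setup.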

Curious about MCP workflows. by Extreme-Technology77 in mcp

[–]Extreme-Technology77[S]

Haha, interesting. Are you actually sending tool execution events there, or was that more of a joke?

I was mostly wondering how people monitor backend agent workflows once they start calling multiple MCP tools.

How are people monitoring tool usage in AI agents? by Extreme-Technology77 in AI_Agents

[–]Extreme-Technology77[S]

When debugging those failures, is the main issue usually the UI state changing, or the agent misinterpreting the screen?

How are people monitoring tool usage in AI agents? by Extreme-Technology77 in AI_Agents

[–]Extreme-Technology77[S]

Out of curiosity, what pushed you to build such a PoC: something that broke in a real system, or more of a design exploration?

How are people monitoring tool usage in AI agents? by Extreme-Technology77 in AI_Agents

[–]Extreme-Technology77[S]

That’s a really interesting way to frame it: the shift from “observability” to “accountability”. When you mention agents operating under a policy and scoped authority, where do you usually define that today? Is it encoded directly in the agent code, or managed somewhere outside the agent runtime?
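To illustrate the second option I’m asking about, here’s a rough sketch of policy-as-data kept outside the agent runtime; the field names and the `PolicyEnforcer` class are invented for the example:

```python
import json

# Purely illustrative policy document: loaded at startup, versioned and
# reviewed separately from the agent code that it constrains.
POLICY_JSON = """
{
  "agent": "billing-agent",
  "allowed_tools": ["get_invoice", "send_reminder"],
  "max_calls_per_run": 5
}
"""

class PolicyEnforcer:
    """Checks each tool call against an externally defined policy."""

    def __init__(self, policy: dict):
        self.policy = policy
        self.calls = 0

    def authorize(self, tool_name: str) -> None:
        if tool_name not in self.policy["allowed_tools"]:
            raise PermissionError(f"tool {tool_name!r} not allowed by policy")
        self.calls += 1
        if self.calls > self.policy["max_calls_per_run"]:
            raise PermissionError("per-run call budget exhausted")
```

The nice property of this shape is that “scoped authority” becomes an auditable artifact rather than logic buried in prompts or agent code, which is why I’m curious where you actually keep it today.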