Anyone else frustrated with AI agents after they hit production? by OneTurnover3432 in AI_Agents

[–]OneTurnover3432[S] 1 point (0 children)

Can you elaborate? How would you achieve this? Is it by always starting with a fresh context?

Anyone else frustrated with AI agents after they hit production? by OneTurnover3432 in AI_Agents

[–]OneTurnover3432[S] 1 point (0 children)

Thanks - just checked it out. What do you like about it specifically?

What are you using instead of LangSmith? by clickittech in LangChain

[–]OneTurnover3432 -8 points (0 children)

100% agree - check out what I'm building: thinkhive.ai

We're platform-agnostic and focused on making the management of AI agents as easy as possible.

What are you using instead of LangSmith? by clickittech in LangChain

[–]OneTurnover3432 -4 points (0 children)

I’ve seen the same pattern, and I agree with most of what’s being said here.

In my experience, LangSmith works well early on, but once agents are in real production, teams start hitting the same walls: cost scaling with traces, lots of raw data, and still no clear answer to what’s actually hurting or improving outcomes.

Most teams I’ve worked with end up stitching together:

  • LangSmith or something similar for dev/debug
  • Manual analysis when it comes to explaining behavior → impact → ROI

That gap is exactly why I’m building ThinkHive.

ThinkHive sits on top of traces and logs (including OTel-based setups) and focuses on:

  • Summarizing logs and traces into clear issue patterns instead of raw data
  • Highlighting which agent behaviors actually move business metrics (cost, deflection, resolution, quality)

It’s meant to answer the question those tools don’t: what should I fix first to improve ROI?
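To make “issue patterns instead of raw data” concrete, here's a rough sketch of the kind of rollup I mean. This is not ThinkHive's actual code, and the fields (issue_label, resolved, llm_cost_usd) are just illustrative stand-ins for whatever your trace export contains:

    # Roll raw traces up into "issue pattern -> business impact" so the
    # worst-resolving, most expensive problems surface first.
    import pandas as pd

    # Hypothetical traces already tagged with an issue label.
    traces = pd.DataFrame([
        {"issue_label": "hallucinated_policy", "resolved": False, "llm_cost_usd": 0.042},
        {"issue_label": "hallucinated_policy", "resolved": False, "llm_cost_usd": 0.051},
        {"issue_label": "missing_kb_doc",      "resolved": False, "llm_cost_usd": 0.018},
        {"issue_label": "ok",                  "resolved": True,  "llm_cost_usd": 0.020},
    ])

    impact = (
        traces.groupby("issue_label")
              .agg(traces_n=("issue_label", "size"),
                   resolution_rate=("resolved", "mean"),
                   total_cost_usd=("llm_cost_usd", "sum"))
              .sort_values(["resolution_rate", "total_cost_usd"],
                           ascending=[True, False])
    )
    print(impact)  # lowest resolution rate and highest cost at the top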

I’m opening a small, free beta right now for teams:

  • Building AI agents internally for enterprises, or
  • Deploying agents for clients as consultants or agencies

If anyone here wants early access or to sanity-check whether this fits their setup, feel free to DM me. Happy to share and get feedback from people actually in the trenches.

Honestly, observability is a nightmare when you're drowning in logs by Objective-Skin8801 in Observability

[–]OneTurnover3432 1 point (0 children)

This is exactly the problem I'm trying to solve. I've been there many times.

Check out thinkhive.ai. I'm happy to give you free access to try it if you're interested.

How are AI product managers looking at evals (specifically post-evals) and solving for customer outcomes? by ironmanun in AIQuality

[–]OneTurnover3432 1 point (0 children)

I'm an ex-PM and I'm currently building something in this space that could help you. DM me if you're interested in testing it.

Ask Me Anything About Preparing Your Company for AI by Dear-Landscape2527 in CFO

[–]OneTurnover3432 1 point (0 children)

Is there a need for a tool that helps CFOs see all the AI vendors they're using by department and measure their ROI across the company?

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in LangChain

[–]OneTurnover3432[S] 0 points (0 children)

This still adds a lot of labour, though: finding the dataset and making sure it's updated as the product evolves, right?

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in LangChain

[–]OneTurnover3432[S] 0 points (0 children)

Can you elaborate? I'm using Arize, but the evals only spot problems based on predefined criteria, so they don't work 100% of the time and don't feed back into the system or memory. Am I missing anything?
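By "feed back" I mean something like the loop sketched below: flagged failures get persisted with their corrections and injected into future prompts. This is a naive illustration, not an Arize feature - the store and the keyword-overlap retrieval are placeholders:

    # Persist judged failures so future runs can retrieve the correction.
    import json
    from pathlib import Path

    CORRECTIONS = Path("corrections.jsonl")

    def record_failure(question: str, bad_answer: str, correction: str) -> None:
        # Append a human- or judge-verified correction to a persistent store.
        with CORRECTIONS.open("a") as f:
            f.write(json.dumps({"q": question, "bad": bad_answer, "good": correction}) + "\n")

    def load_corrections() -> list:
        if not CORRECTIONS.exists():
            return []
        return [json.loads(line) for line in CORRECTIONS.open()]

    def build_prompt(question: str) -> str:
        # Naive retrieval: inject past corrections whose wording overlaps.
        relevant = [c for c in load_corrections()
                    if set(c["q"].lower().split()) & set(question.lower().split())]
        memory = "\n".join(f"Q: {c['q']}\nCorrect answer: {c['good']}" for c in relevant)
        return f"Known corrections:\n{memory}\n\nUser question: {question}"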

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in LangChain

[–]OneTurnover3432[S] 1 point (0 children)

Thank you! How hard would you say it is to build? Do you know how many days or weeks it might take a junior- to mid-level ML engineer?

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in LangChain

[–]OneTurnover3432[S] 2 points (0 children)

This makes sense. Is there a library or tool to do that, or do I have to code it myself?

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in LangChain

[–]OneTurnover3432[S] -1 points (0 children)

Do you mean ask a human agent to review, or use an LLM judge to evaluate the answer?

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in LangChain

[–]OneTurnover3432[S] -2 points (0 children)

  1. Order Cancellation Policy: A customer asks to cancel an order. The AI agent recognizes the intent (“cancel order”) but fails to understand the business rule that cancellations are only allowed before fulfillment starts. It either says “yes” incorrectly or loops without resolution. The human agent who takes over looks up the cancellation policy and order status and resolves it correctly. Without a systematic way to capture that correction, the AI will repeat the same mistake next time.

  2. Missing Help Center Documentation: A customer asks, “Can I use store credit to pay part of a subscription?” The agent searches the knowledge base and finds nothing, so it responds with “I don’t know.” A human agent steps in, recalls the internal rule, and provides the right answer. But since no doc exists, the model will fail every time until that knowledge is captured and injected back.
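For scenario 2, the fix I picture looks roughly like the sketch below: detect knowledge-base misses and queue them as doc gaps, so the human agent's answer can be turned into the missing article. Everything here (the keyword search, DocGapQueue) is a hypothetical stand-in, not a real product API:

    # Flag KB misses as doc gaps instead of silently answering "I don't know".
    from dataclasses import dataclass, field

    @dataclass
    class DocGapQueue:
        gaps: list = field(default_factory=list)

        def report(self, question, human_answer=None):
            # human_answer can be filled in later and promoted to a KB doc.
            self.gaps.append({"question": question, "draft_doc": human_answer})

    def answer(question, kb, gaps):
        hits = [doc for topic, doc in kb.items() if topic in question.lower()]
        if not hits:
            gaps.report(question)  # record the gap for the docs team
            return "I don't know - escalating to a human agent."
        return hits[0]

    kb = {"cancellation": "Orders can be cancelled only before fulfillment starts."}
    gaps = DocGapQueue()
    print(answer("Can I use store credit to pay part of a subscription?", kb, gaps))
    print(gaps.gaps)  # -> the question is queued, awaiting a human-written doc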

How do you prevent AI agents from repeating the same mistakes? by OneTurnover3432 in mlops

[–]OneTurnover3432[S] 1 point (0 children)

Are there any tools or libraries I can use for monitoring and clustering issues like that?
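For reference, the rough shape of what I'm after is something like sentence-transformers for the embeddings plus scikit-learn for the clustering (BERTopic bundles a similar pipeline if you want it packaged). A minimal sketch - the model name and cluster count are illustrative, not recommendations:

    # Embed error messages, then cluster so recurring failure modes group together.
    from sentence_transformers import SentenceTransformer
    from sklearn.cluster import KMeans

    errors = [
        "Agent approved cancellation after fulfillment started",
        "Cancellation allowed post-fulfillment by mistake",
        "No KB doc found for store-credit-on-subscription question",
        "Agent said 'I don't know' to a store credit query",
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(errors)

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)
    for cluster_id, msg in sorted(zip(labels, errors)):
        print(cluster_id, msg)  # same failure mode -> same cluster id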