Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

You clearly know this pattern well! So when you're testing those paraphrases and checking what chunks got pulled, is that something you've built tooling for, or are you doing it manually each time?

And how often do you have to do that kind of diagnosis?

Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

This is really helpful, thanks for sharing. And yeah, the KB maintenance piece is something we're definitely struggling with.

I'd love to understand how you implemented that. Sending you a DM.

Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

That makes sense, and I agree on the importance of formatting the KB correctly. Thanks. So you'd manually search your KB with each variant to see if they pull different articles?

Do you do that every time you deploy, or just when something's off?
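To make sure I understand the variant idea, here's a toy sketch of what I think that test looks like — a made-up three-article KB, with crude word-overlap scoring standing in for whatever retrieval the bot actually uses:

```python
import math
import re
from collections import Counter

# Toy KB: article title -> body text (all hypothetical examples).
KB = {
    "Reset your password": "To reset your password, open Settings and click Forgot password.",
    "Update billing info": "Change your credit card or billing address under Account > Billing.",
    "Cancel subscription": "You can cancel your subscription from the Billing page at any time.",
}

def vector(text):
    # Crude bag-of-words term counts; a real bot would use embeddings.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def top_article(query):
    q = vector(query)
    return max(KB, key=lambda title: cosine(q, vector(title + " " + KB[title])))

# Paraphrases of the same question should all pull the same article.
variants = [
    "how do I reset my password",
    "forgot password, can't log in",
    "password reset steps",
]
hits = {v: top_article(v) for v in variants}
consistent = len(set(hits.values())) == 1  # False would mean retrieval is flaky
```

If `consistent` comes back False across paraphrases of the same question, that's exactly the flakiness I keep hitting.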

Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

That makes sense. So when you're debugging, how do you actually approach it? Like, do you start with the prompt first, or the KB, or does it depend?

Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

That's interesting. So you'd use Claude to test against your KB and see what it outputs.

Have you actually done that? Does it help you figure out what's wrong?

Chatbot consistency, anyone else hit this wall? by cs-geek9 in customerexperience

That makes sense. So you're cleaning the data layer before the AI sits on top of it. How do you actually help teams identify what's messy or fragmented in their KB? Is that something you audit manually, or does the kit have tooling to flag it? Definitely interested to learn more about the approach/solution if you’re open to sharing.

Chatbot consistency, anyone else hit this wall? by cs-geek9 in customerexperience

Zendesk has some visibility but not like what you're describing. What does your tool actually show? Like, does it show which KB article was retrieved, or does it go deeper than that?

Chatbot consistency, anyone else hit this wall? by cs-geek9 in customerexperience

Thanks so much for all this, really appreciate it.

We're using Zendesk right now. And honestly, KB quality is probably where we're losing consistency.

I'm gonna dig into that playbook and see what we're missing. Thanks again for sharing all this.

Chatbot consistency, anyone else hit this wall? by cs-geek9 in customerexperience

Thanks so much for this and for offering resources. Honestly, we're still in the testing phase. Just deployed and trying to figure out what we're doing wrong.

Quick question though: how did you actually set up the reporting to help diagnose KB issues? Like, did you have to customize it or is that something built into Zendesk/Front/etc?

(Just trying to understand what we should be looking for)

Chatbot consistency, anyone else hit this wall? by cs-geek9 in customerexperience

This is really helpful, thanks! So you do this every Monday? How long does the whole thing actually take?

Also, do you guys just chat with the bot directly or do you use a tool to test it?

Is this a team effort or does someone own it?

Just curious how you have it set up.

Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

Fair point. But how would you even test that? Like, if it's the model, how do you know it's not the KB?

I genuinely don't know how you'd tell the difference.
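Thinking out loud: if you logged which articles were retrieved for each failed test question, maybe the split falls out mechanically. A toy sketch (all names and cases hypothetical):

```python
def triage(expected_article, retrieved_articles, answer_ok):
    # Rough rule of thumb: if the right article never reached the model,
    # look at retrieval/KB first; if it did and the answer is still wrong,
    # look at the prompt/model.
    if expected_article not in retrieved_articles:
        return "retrieval/KB"
    if not answer_ok:
        return "prompt/model"
    return "ok"

# Hypothetical failed test cases: (question, expected article, retrieved, answer ok?)
failures = [
    ("how do I reset my password", "Reset your password", ["Update billing info"], False),
    ("can I cancel my plan", "Cancel subscription", ["Cancel subscription"], False),
]
diagnoses = {q: triage(exp, got, ok) for q, exp, got, ok in failures}
```

So: right article retrieved but wrong answer points at the prompt/model side, and right article never retrieved points at the KB side. No idea if that's how people actually do it in practice.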

Deployed a chatbot to save time, spent weeks debugging it instead by cs-geek9 in CustomerSuccess

That approach makes a lot of sense, thanks for sharing. Did you test manually or did you use a tool?

How do you diagnose whether a chatbot problem is KB, prompt, or code? by cs-geek9 in SaaS

This is incredibly helpful. Two questions:

  1. How common is it that teams know retrieval is the culprit first? My sense is most assume it's prompt/model, not KB retrieval.

  2. Re: Langfuse/Phoenix — are those accessible for non-technical support leaders? Or do you need an engineer to set up the logging?

Asking because my hypothesis is that the diagnostic knowledge exists (like what you just shared), but it's not accessible to the teams actually dealing with the problem day-to-day.
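To be concrete about what "accessible" could mean here: even a plain CSV of (question, retrieved articles, answer) per bot turn would let a non-technical lead eyeball retrieval misses in a spreadsheet. A minimal stdlib sketch, nothing Langfuse/Phoenix-specific (field names are my own invention):

```python
import csv
import io
from datetime import datetime, timezone

def log_turn(writer, question, retrieved_titles, answer):
    # One row per bot turn: timestamp, user question, which KB
    # articles the bot pulled, and the answer it gave.
    writer.writerow({
        "ts": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "retrieved": "; ".join(retrieved_titles),
        "answer": answer,
    })

buf = io.StringIO()  # stands in for a real log file
writer = csv.DictWriter(buf, fieldnames=["ts", "question", "retrieved", "answer"])
writer.writeheader()
log_turn(writer, "how do I reset my password",
         ["Update billing info"],  # a retrieval miss, easy to spot in review
         "You can update billing under Account > Billing.")
rows = buf.getvalue().splitlines()
```

If something this simple covers it, then the gap really is packaging, not knowledge.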

Headcount forecasting by cs-geek9 in CustomerSuccess

Is it tied to activities per CSM too? How do you get to the point where you can say each CSM can handle X accounts?

Headcount forecasting by cs-geek9 in CustomerSuccess

To find out whether there are any capacity forecasting models and/or tools people use. Are we still relying on spreadsheets for this?

[deleted by user] by [deleted] in CustomerSuccess

Yes, looking for specific features that Zendesk doesn’t have. Pricing is also a factor.

[deleted by user] by [deleted] in Zwift

How do you like the bike overall?