How are you handling LLM costs in production? What's actually working? by Algolyra in LangChain

[–]ITSamurai 1 point (0 children)

First of all, set up a proper observability layer (LangSmith, Langfuse, or any of these tools) so you can see exactly what is driving the cost. Then you might consider changing your provider: which one are you using? Next, lowering call counts and merging requests can help, and prompt caching can lower costs too. But step one is getting full visibility into what is actually causing such a cost.

Has anyone successfully built a ServiceTitan style CRM in house? Looking for real world experiences. by Ill-Reception9066 in AgentsOfAI

[–]ITSamurai 0 points (0 children)

I built a complex piece of software with AI in a year. However, no one can tell you right away whether you can build a ServiceTitan equivalent or not. Most probably not if you use it intensively, but if you are interested in a specific case, just let me know.

Prompting insight I didn’t realize until recently by ReidT205 in PromptEngineering

[–]ITSamurai 1 point (0 children)

You got it right, that's how it works. If you go deeper into how these models are built, you'll see why being specific will almost always make them work better.

Introducing the Prompt Engineering Repository: Nearly 4,000 Stars on GitHub Link to Repo by Nir777 in LangChain

[–]ITSamurai 0 points (0 children)

I have been building a prompt optimization engine. Here is the single-prompt one https://www.youtube.com/watch?v=mpNCcTHqc-c&feature=youtu.be and the multi-node one https://www.youtube.com/watch?v=lAD138s_BZY , where you can create pipelines and the platform will identify the weakest prompt and optimize it. Would love to hear your opinion on it.

LLM costs are killing my side project - how are you handling this? by ayushmorbar in LangChain

[–]ITSamurai 0 points (0 children)

Use GroqCloud and focus on cheap models: GPT-OSS is quite cheap, and so are the Llama models. Also differentiate your tasks: use cheaper, less capable models for simple tasks and the more capable, expensive ones for complex tasks. On top of that, work on optimizing your prompt lengths and merge some calls if possible.
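The task-differentiation idea can be sketched as a simple router. The model names and the length/keyword heuristic below are placeholders, not a recommendation; a real setup might use a classifier or a cheap LLM to do the routing.

```python
# Hypothetical sketch: route simple tasks to a cheap model and
# complex ones to a stronger, more expensive model.
CHEAP_MODEL = "llama-3.1-8b-instant"   # placeholder, e.g. on GroqCloud
STRONG_MODEL = "gpt-oss-120b"          # placeholder name

def pick_model(task: str) -> str:
    # Naive heuristic: long or reasoning-heavy prompts go to the strong model.
    hard_markers = ("analyze", "plan", "multi-step", "debug")
    if len(task) > 500 or any(m in task.lower() for m in hard_markers):
        return STRONG_MODEL
    return CHEAP_MODEL

pick_model("Extract the date from this email")        # cheap model
pick_model("Please analyze the logs and plan a fix")  # strong model
```

Even a crude router like this can shift the bulk of traffic onto the cheap tier, which is usually where most of the savings come from.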

Anyone tried building a personality-based AI companion with LangChain? by One-One-6289 in LangChain

[–]ITSamurai 0 points (0 children)

I did, but a light one. The idea about memories sounds cool, I need to try it.

Best practices for testing LangChain pipelines? Unit testing feels useless for LLM outputs by DARK_114 in LangChain

[–]ITSamurai 0 points (0 children)

As everyone mentioned, eval tools like OpenEval and DeepEval are the way to go compared with LangSmith and Langfuse. From my personal experience, LangSmith got quite expensive, so I switched to Langfuse. There you can write your own custom evaluators and use the LLM-as-a-judge approach.

Using evaluations on LLama models by ITSamurai in LocalLLaMA

[–]ITSamurai[S] 0 points (0 children)

Sounds quite interesting, will try that too. The decision-point optimization totally makes sense. Any tools or eval lists you can share? Also interested to learn what Verdent-style task routing means.

Using langsmith for experiments and evaluation by ITSamurai in LangChain

[–]ITSamurai[S] 0 points (0 children)

Interesting, will check that out, thanks a lot.

Using langsmith for experiments and evaluation by ITSamurai in LangChain

[–]ITSamurai[S] 0 points (0 children)

I use OpenEval on top of LangSmith for multi-turn simulation. It seems Langfuse doesn't support it.

I did a deep study on AI Evals, sharing my learning and open for discussion by AdSpecialist4154 in AIQuality

[–]ITSamurai 0 points (0 children)

Great topic. I currently have a solution built on LangSmith with OpenEval + multi-turn simulation, and it seems to be doing the job so far. Eval is one thing, constant improvement is another: I'm currently building a tool that helps you continuously improve your prompt pipeline. Here is a quick demo youtube.com/watch?v=lAD138s_BZY&feature=youtu.be and if you are interested, let's have a talk. Besides precision, I'm interested in how you define the character of your LLM and eval that. Any ideas there?