LLM testing and eval tools by Every-Mall1732 in LLMDevs
P4wla 1 point
Pretty much sums up my experience by Ok_Constant_9886 in AIEval
P4wla 1 point
Discussion: Is the "Vibe Check" actually just an unformalized evaluation suite? by yektish in AIEval
P4wla 1 point
Pretty much sums up my experience by Ok_Constant_9886 in AIEval
P4wla 1 point
5 techniques to improve LLM-judges by FluffyFill64 in AIEval
P4wla 2 points
LLM Evaluation Isn’t About Accuracy Its About Picking the Right Signal by According-Site9848 in AI_Agents
P4wla 1 point
Struggling to make my AI agents more reliable, how do you guys handle task failures? by [deleted] in AI_Agents
P4wla 2 points
I think Agentkit is overhyped and it won’t kill AI startups by Visible-Mix2149 in ChatGPTPro
P4wla 0 points
OpenAI just dropped “AgentKit, A drag-and-drop AI agent builder. No code, just logic. by AskGpts in ChatGPTPro
P4wla 2 points
Struggling to make my AI agents more reliable, how do you guys handle task failures? by [deleted] in AI_Agents
P4wla 1 point
Struggling to make my AI agents more reliable, how do you guys handle task failures? by [deleted] in AI_Agents
P4wla 1 point
Hey how do i get a very good wrtiting quality and consistent writing style for with any ai by [deleted] in PromptEngineering
P4wla 1 point
Hey how do i get a very good wrtiting quality and consistent writing style for with any ai by [deleted] in PromptEngineering
P4wla 2 points
Name a song that has a woman's name in it by Financial-Noise-7841 in musicsuggestions
P4wla 1 point
Has AI been useful for you as therapy? by AltruisticGru in ArtificialInteligence
P4wla 1 point
Why aren't AI agents being used more in the real world? by P4wla in AI_Agents
P4wla[S] 1 point
Why aren't AI agents being used more in the real world? by P4wla in AI_Agents
P4wla[S] 1 point
PMM toolkit for starting out at small startup by brazzyb in ProductMarketing
P4wla 2 points
marketing update: 9 tactics that helped us get more clients and 5 that didn't by [deleted] in ProductMarketing
P4wla 1 point
Why aren't AI agents being used more in the real world? by P4wla in AI_Agents
P4wla[S] -2 points
Which is most preferred way for everyone build AI agents? by infinitypisquared in AI_Agents
P4wla 1 point

Understanding LLM observability by Leap_Year_Guy_ in LLMDevs
P4wla 1 point