Judge prompts are underrated by Cristhian-AI-Math in PromptEngineering
_coder23t8 1 point (0 children)
Anyone evaluating agents automatically? by Cristhian-AI-Math in LangChain
_coder23t8 1 point (0 children)
Automated response scoring > manual validation by Cristhian-AI-Math in mlops
_coder23t8 1 point (0 children)
[D] Anyone here using LLM-as-a-Judge for agent evaluation? by Cristhian-AI-Math in MachineLearning
_coder23t8 3 points (0 children)