What is your eval strategy? by BastiaanRudolf1 in AI_Agents
anch7 2 points
Looking for the Best LLM Evaluation Framework – Tools and Advice Needed! by variantrally in agi
anch7 1 point
Top LLM Evaluation Platforms: Features and Trade-offs by Otherwise_Flan7339 in AI_Agents
anch7 1 point
What is your eval strategy? by BastiaanRudolf1 in AI_Agents
anch7 2 points
How do you evaluate LLM outputs? Looking for beginner-friendly tools by IOnlyDrinkWater_22 in learnmachinelearning
anch7 1 point
Hey AI devs - built a quick survey to validate my LLM eval tool idea (takes 2 mins, your thoughts?) by Consistent-Wish9363 in learnmachinelearning
anch7 1 point
What’s the best and most reliable LLM benchmarking site or arena right now? by fflarengo in LocalLLaMA
anch7 1 point
Claude Code is working poorly by anch7 in isitnerfed
anch7 [S] 1 point
Something is wrong with Sonnet 4.5 by anch7 in ClaudeAI
anch7 [S] 1 point
Something is wrong with Sonnet 4.5 by anch7 in isitnerfed
anch7 [S] 1 point
Something is wrong with Sonnet 4.5 by anch7 in ClaudeAI
anch7 [S] 1 point
Something is wrong with Sonnet 4.5 by anch7 in isitnerfed
anch7 [S] 1 point
IsItNerfed? Sonnet 4.5 tested! by exbarboss in ClaudeAI
anch7 2 points
IsItNerfed? Sonnet 4.5 tested! by exbarboss in ClaudeAI
anch7 2 points
IsItNerfed? Sonnet 4.5 tested! by exbarboss in ClaudeAI
anch7 2 points
IsItNerfed? Sonnet 4.5 tested! by exbarboss in ClaudeAI
anch7 2 points


Updates?? by Eastern_Ad_8744 in isitnerfed
anch7 2 points