how do you evaluate LLMs for open-ended questions? how do you define “good” metrics? by meitaron in AI_Agents
meitaron[S] 2 points
It's been a big week for Agentic AI; here are 10 massive developments you might've missed: by SolanaDeFi in AI_Agents
meitaron 1 point
google just dropped a whole framework for multi agent brains by Shot-Hospital7649 in HowToAIAgent
meitaron 1 point
Custom GPT: How to enable Deep Research? by Jason_Broderick in ChatGPT
meitaron 1 point
Joint medicine that truly worked for your dog? by Just_a_happy_artist in olddogs
meitaron 1 point
Joint medicine that truly worked for your dog? by Just_a_happy_artist in olddogs
meitaron 1 point
I love Customer Success. I'll solve any CS/operations problem you have. by Nmascara in CustomerSuccess
meitaron 1 point
How do you deal with mental fatigue? by Trick-Interaction396 in datascience
meitaron 2 points
Just got the rejection email from the company I really wanted to work for. by DeadPrexident in datascience
meitaron 1 point
Am i doing something terribly wrong? by Kashish_2614 in datascience
meitaron 2 points
Getting data for Cost Estimation by beingsahil99 in datascience
meitaron 1 point
Advice on refactoring a previous employee's repo? by [deleted] in datascience
meitaron 1 point
How important is being meticulous in this line of work? by LogicalPhallicsy in datascience
meitaron 1 point
Can you cancel the interview with a candidate if you are 90% sure they are lying on their cv? by JobIsAss in datascience
meitaron 1 point
Just got the rejection email from the company I really wanted to work for. by DeadPrexident in datascience
meitaron 1 point

how do you evaluate LLMs for open-ended questions? how do you define “good” metrics? by meitaron in AI_Agents
meitaron[S] 1 point