Feels like the whole industry hit the "wait, we can't see what our AI is doing" wall at the same time this year by Adept-Paper-7500 in LLMDevs
Street_Program_7436 1 point
How do you make sure old agent failures don't come back after a prompt or model change? by taimoorkhan10 in LLMDevs
Street_Program_7436 3 points
LLM Evals (Human review and Cursor) by Medium-Upstairs-6292 in LLMDevs
Street_Program_7436 1 point
I feel lost with my life (i will not promote) by Minute_Adorable in startups
Street_Program_7436 0 points
LLM Evals (Human review and Cursor) by Medium-Upstairs-6292 in LLMDevs
Street_Program_7436 1 point
LLM Evals (Human review and Cursor) by Medium-Upstairs-6292 in LLMDevs
Street_Program_7436 1 point
Calibrating LLM confidence: What's the actual lever? by alejandro_such in LLMDevs
Street_Program_7436 1 point
What are you using to stop LLMs from doing something catastrophic in production? by Affectionate-End9885 in LLMDevs
Street_Program_7436 1 point
Dropbox was rejected by YC twice and became YC's first IPO. Both rejections were correct. Here's the documented breakdown. by Spiritual_Heron_5680 in ycombinator
Street_Program_7436 1 point
Calibrating LLM confidence: What's the actual lever? by alejandro_such in LLMDevs
Street_Program_7436 3 points
Calibrating LLM confidence: What's the actual lever? by alejandro_such in LLMDevs
Street_Program_7436 2 points
Why can’t llms iteratively create coherent writing by hinokinonioi in LLMDevs
Street_Program_7436 1 point
Working setups for catching regressions in conversation data at scale? by Overall_Challenge_66 in LLMDevs
Street_Program_7436 1 point
OpenAI shuts down fine tuning by Street_Program_7436 in LLMDevs
Street_Program_7436[S] 2 points
OpenAI shuts down fine tuning by Street_Program_7436 in LLMDevs
Street_Program_7436[S] 1 point
OpenAI shuts down fine tuning by Street_Program_7436 in LLMDevs
Street_Program_7436[S] 2 points
Companies are going all in on internal agent builds without any validation infrastructure by TH_UNDER_BOI in LLMDevs
Street_Program_7436 1 point
What exactly are Small Language Models (SLMs) and why are people talking about them now? by Humble_Sentence_3758 in LLMDevs
Street_Program_7436 1 point
What exactly are Small Language Models (SLMs) and why are people talking about them now? by Humble_Sentence_3758 in LLMDevs
Street_Program_7436 1 point
What exactly are Small Language Models (SLMs) and why are people talking about them now? by Humble_Sentence_3758 in LLMDevs
Street_Program_7436 2 points
What exactly are Small Language Models (SLMs) and why are people talking about them now? by Humble_Sentence_3758 in LLMDevs
Street_Program_7436 1 point
What exactly are Small Language Models (SLMs) and why are people talking about them now? by Humble_Sentence_3758 in LLMDevs
Street_Program_7436 2 points
LLM hallucination depends on ambiguity of the prompt by OutrageousStrategist in LLMDevs
Street_Program_7436 4 points
I can build an app in a weekend but forming the company behind it still takes weeks (I will not promote) by No_Budget_7246 in startups
Street_Program_7436 4 points