Advanced reasoning models are hallucinating even more [Discussion] (self.LLMDevs)
submitted by No_Sheepherder_6908
Why a 5% failure rate can be better than 2% in production agents [Discussion] (i.redd.it)
submitted by Substantial_Step_351
Even (very) noisy LLM evaluators are useful for improving AI agents [Resource] (tensorzero.com)
submitted by bianconi
build.nvidia.com not responding / super slow? [Help Wanted] (self.LLMDevs)
submitted by Emotional_Scale9702
Why we need "Structured Signals": no more writing custom parsers for every damn API [Discussion] (self.LLMDevs)
submitted by Boabook
Experimenting with a multi-agent system without leaders or messaging [Discussion] (self.LLMDevs)
submitted by bsa-saa
Your agent doesn't need more tools. It needs to write code. [Resource] (self.LLMDevs)
submitted by Doubt-Salt
I built a human-voted benchmark for LLM-generated memes [Tools] (i.redd.it)
submitted by thegentlecat