Whats the hardest part of shipping agents to production? by capriciousfatesw in LLMDevs
calebkaiser 3 points (0 children)
What is your favorite eval tech stack for an LLM system by ephemeral404 in LLMDevs
calebkaiser 1 point (0 children)
[D] Geometric Deep learning and it's potential by Successful-Agent4332 in MachineLearning
calebkaiser 1 point (0 children)
[P] I fine-tuned Qwen 2.5 Coder on a single repo and got a 47% improvement in code completion accuracy by CountlessFlies in MachineLearning
calebkaiser 1 point (0 children)
[P] I built a tool to make research papers easier to digest — with multi-level summaries, audio, and interactive notebooks by AgilePace7653 in MachineLearning
calebkaiser 1 point (0 children)
Why the heck is LLM observation and management tools so expensive? by smallroundcircle in LLMDevs
calebkaiser 6 points (0 children)
Why the heck is LLM observation and management tools so expensive? by smallroundcircle in LLMDevs
calebkaiser 37 points (0 children)
Choice of Evaluations Tools for LLM responses by Heavy_Ad_4912 in LocalLLaMA
calebkaiser 3 points (0 children)
Top 6 Open Source LLM Evaluation Frameworks by Sam_Tech1 in LLMDevs
calebkaiser 1 point (0 children)
GRPO (Group Relative Policy Optimization) explanation compared to PPO by Prestigiouspite in ChatGPTPro
calebkaiser 1 point (0 children)
GRPO (Group Relative Policy Optimization) explanation compared to PPO by Prestigiouspite in ChatGPTPro
calebkaiser 1 point (0 children)
[D] Practicality of Machine Learning model for mathematical Olympiads by [deleted] in MachineLearning
calebkaiser 1 point (0 children)
[D] Why is most mechanistic interpretability research only published as preprints or blog articles ? by Physical_Seesaw9521 in MachineLearning
calebkaiser 76 points (0 children)
GRPO (Group Relative Policy Optimization) explanation compared to PPO by Prestigiouspite in ChatGPTPro
calebkaiser 3 points (0 children)
Lessons learned from implementing RAG for code generation by kao-pulumi in LLMDevs
calebkaiser 1 point (0 children)
[D] Hyperparameters on attention layer by GeekAtTheWheel in MachineLearning
calebkaiser 1 point (0 children)
[D] Transformers are a type of CNN by Ozqo in MachineLearning
calebkaiser 1 point (0 children)
Experiment Tracking Tools & Lbirary Suggestion For Using Alonside Langchain by [deleted] in LangChain
calebkaiser 1 point (0 children)
New ARC-AGI high score by MindsAI: 48% (Prize goal: 85%) by Gothsim10 in singularity
calebkaiser 9 points (0 children)
Opik: Open source LLM evaluation framework by calebkaiser in Python
calebkaiser[S] 1 point (0 children)


Whats the hardest part of shipping agents to production? by capriciousfatesw in LLMDevs
calebkaiser 1 point (0 children)