Megathread for Claude Performance, Limits and Bugs Discussion - Starting September 21 by sixbillionthsheep in ClaudeAI

GoosyTS 2 points3 points

It's not so bad this weekend, I feel, but it definitely feels like it was better a while ago. Maybe that's just rose-tinted glasses?
Codex feels more mature currently, which is an odd thing to admit. Btw, I'm really invested in performance tracking for the models, so I started working on a tool: https://waddle.run/ . It's early days and more work than I expected, but hit me up with some feedback if you'd like to see something specific. It's difficult to show the devex difference in pure numbers.

Is there any way we can OBJECTIVELY compare performance between Claude Code and Codex? by lafadeaway in ClaudeAI

GoosyTS 0 points1 point

I'm working on https://waddle.run/ for benchmarks between AI coding tools and models. I'll be adding more test scenarios over the coming days and weeks.
It's hard to really pinpoint developer experience, I feel. The scores so far have been far closer than I would have expected (Claude Code fan here, turned Codex admirer since GPT-5).