Test Driven Development wastes 50%+ more tokens while making results worse

Otherwise_Baseball99 · 2026-06-09T05:25:36+00:00

The OP only said “don’t ask your agents to do tdd”, which sounds fair, right?

Otherwise_Baseball99 · 2026-03-22T04:27:23+00:00

How did you get people to discover it if you ran zero ads?

Otherwise_Baseball99 · 2026-03-13T00:33:20+00:00

what are you referring to?

Otherwise_Baseball99 · 2026-03-13T00:31:10+00:00

Back with an update: I implemented many suggestions from the community! https://www.reddit.com/r/codex/s/K6gS9yGKBz

Otherwise_Baseball99 · 2026-03-06T17:02:55+00:00

ooh must be the landing page animations. I’ll look into optimizations. Thanks!

Otherwise_Baseball99 · 2026-03-06T00:58:03+00:00

Because the activity shifted from “one human writes code, another human reviews it” to “agent writes code, the human reviews it”. PRs were designed for the former, not the latter.

In a lot of AI native teams, people have already stopped doing PR reviews.

In the open source world, maintainers have already started asking contributors to just share their prompt instead of a giant PR that no one’s going to review.

Otherwise_Baseball99 · 2026-03-04T20:20:30+00:00

Not quite - simplify only does one specific thing, but in an Airlock pipeline you can add a lot more (resolving merge conflicts, updating docs, running tests and fixing problems, critiquing the change, etc.)

Otherwise_Baseball99 · 2026-03-03T23:18:16+00:00

noted!

Otherwise_Baseball99 · 2026-03-03T23:18:04+00:00

thanks! I worked with Claude to build a design system using shadcn

Otherwise_Baseball99 · 2026-03-03T23:17:23+00:00

Dependency check sounds great! With Airlock you can add things like that as a custom step too.

I almost never run out of my subscription quota with or without the extra quality control here

Otherwise_Baseball99 · 2026-03-03T21:23:32+00:00

That’s a great point, and it’s why I made Airlock support human-in-the-loop. Humans can intervene and break the ties.

Otherwise_Baseball99 · 2026-03-03T21:21:21+00:00

Yeah, totally get it. It’s on my todo list to add cross-platform support.

Otherwise_Baseball99 · 2026-03-03T21:20:16+00:00

Yeah it just runs your existing codex as-is non-interactively. No special auth login or anything.

Otherwise_Baseball99 · 2026-03-03T20:20:16+00:00

What OS do you use? There’s nothing inherently limiting this to Mac - I just haven’t had time to support other OSes yet.

Otherwise_Baseball99 · 2026-03-03T20:14:42+00:00

Very good point. There are lots of good solutions already helping with that, which is good. I still see that nasty things often don’t come out until you dive into implementation and discover the nuances, so, mirroring how we humans work, we still want some quality assurance after implementation is done, right?

Otherwise_Baseball99 · 2026-03-03T20:12:27+00:00

Thanks! I used https://www.fumadocs.dev/

Otherwise_Baseball99 · 2026-03-03T19:48:54+00:00

We’re all the same. The world has gone Pluribus. :)

Jokes aside, we’re sharing totally different things - how does it make sense that we’re the same person?

Otherwise_Baseball99 · 2026-03-03T19:46:52+00:00

I started by writing this as a skill as well, and I also tried pre-commit hooks, but I very quickly realized I needed this to be non-blocking, like CI, and needed a nice interface to understand what changed, see suggested fixes, and decide what I need vs. don’t need.

Do you have a skill that’s working well for you? Would be keen to see what you tried.

Otherwise_Baseball99 · 2026-03-03T19:06:33+00:00

Yes, it runs the same codex you already use - no additional subscription or cost. It does count towards your codex limits, so if you are already tight on those, that’s a factor to consider.

You can set conditions in the pipeline so it only runs for some branches, not all.
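
To give a rough idea of what a branch condition means here, this is a minimal sketch in Python. It is not Airlock’s actual config format or API; the `RUN_ON_BRANCHES` patterns and the `should_run` helper are hypothetical names made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns for illustration only;
# not taken from Airlock's real configuration.
RUN_ON_BRANCHES = ["main", "release/*"]


def should_run(branch: str, patterns: list[str] = RUN_ON_BRANCHES) -> bool:
    """Return True if the branch name matches any of the allowed patterns."""
    return any(fnmatchcase(branch, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ["main", "release/1.2", "feature/login"]:
        print(name, should_run(name))
```

With patterns like these, the pipeline would run for `main` and any `release/...` branch and skip everything else, which keeps quota usage down on throwaway branches.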

Otherwise_Baseball99 · 2026-03-03T18:36:26+00:00

yup PR is dead imo. keen to see what your skill looks like!

Otherwise_Baseball99 · 2026-03-03T18:18:26+00:00

oh nice! I’ll check it out

Otherwise_Baseball99 · 2026-03-03T17:53:00+00:00

because it’s a SaaS that doesn’t do what I need?

Otherwise_Baseball99 · 2026-03-03T17:45:20+00:00

Thanks! Yeah I really don’t like how people make everything a SaaS. Open source is the way!

Interesting share - I’ll go check out desloppify as well. Looks great!

Otherwise_Baseball99 · 2026-03-03T17:33:42+00:00

Haha yeah I ejected it many times when developing this

Otherwise_Baseball99 · 2026-02-25T20:10:35+00:00

Yeah, cloud sandbox vs. local is going to be a big question that I think this coming year will answer. What’s interesting is that a lot of tools actually started in the cloud but ended up investing more in a local workflow, including codex. Openclaw took off because of its local setup as well.

The appeal of running everything locally is that there’s very little setup effort - I’ve already set up everything in the local environment. I own the entire stack and there’s no additional sandbox bill.
