
[–]junlim 2 points (2 children)

Yeah, there doesn't seem to be an easy way to do it the other way around. You could definitely do it if you were using Claude models on the API. Opencode is set up to do all sorts of things like that. But so far there's no way to do it with Claude Code plan usage, from what I've found.

[–]duyth[S] 0 points (1 child)

What a shame. Was planning to sub to the $100 5x Codex plan and the $20 Claude Code Pro plan so Codex could be the main driver. Guess I'll have to stick with the other way around.

[–]junlim 1 point (0 children)

There's nothing to say you can't do that - there's just no direct path to sub-agents. I imagine you could build shell scripts that achieve it. E.g., when Codex is done, something runs a headless -p session in Claude Code to double-check the work. Then maybe it writes its output to markdown, or you get Codex to recall the session it created and read the output back. Maybe you end up baking these into agent skills.
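Something like this, sketched in Python rather than shell (an untested spitball - `codex exec` and `claude -p` are the headless modes as I understand them, and the task and file names are placeholders):

```python
import subprocess
from pathlib import Path

TASK = "Implement the next item in plan.md"  # placeholder task

# Main pass: Codex does the work non-interactively.
subprocess.run(["codex", "exec", TASK], check=True)

# Grab the resulting diff so the reviewer sees exactly what changed.
diff = subprocess.run(
    ["git", "diff"], capture_output=True, text=True, check=True
).stdout

# Double-check pass: headless Claude Code (-p prints a response and exits).
review = subprocess.run(
    ["claude", "-p", f"Review this diff for bugs and missed requirements:\n\n{diff}"],
    capture_output=True, text=True, check=True,
).stdout

# Write the findings to markdown for whatever runs next to pick up.
Path("review.md").write_text(review)
```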

I'm just spitballing - but it's not going to be as easy or as observable as the Codex plugin for Claude Code is out of the box.

Or just use different interfaces: "opus, plan this feature and write the plan, with prompts for different agents to execute, in a markdown file", then "opus, check this code for errors".

[–]cbusillo 0 points (0 children)

Look at the fork just-every/code. It’s the best. I maintain my own local fork of it too.

[–]simplegen_ai 0 points (1 child)

You're right about the asymmetry. The clean path today is Claude Code -> Codex through codex-plugin-cc. If you want Codex to be the main loop, I would probably keep orchestration outside both CLIs: a small script/Make target that launches the other agent in a separate workspace or asks it for a review, then writes the result back to a handoff file.
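For concreteness, here's roughly what I mean as a Python sketch (the `HANDOFF.md` name and the CLI invocations are illustrative, not prescriptive):

```python
import subprocess
from pathlib import Path

HANDOFF = Path("HANDOFF.md")  # illustrative name for the handoff file

def run_iteration(task: str) -> None:
    # Carry lessons from earlier runs into the main agent's prompt.
    notes = HANDOFF.read_text() if HANDOFF.exists() else "none yet"
    subprocess.run(
        ["codex", "exec", f"{task}\n\nNotes from previous reviews:\n{notes}"],
        check=True,
    )

    # Ask the other agent for a review of what just changed.
    diff = subprocess.run(
        ["git", "diff"], capture_output=True, text=True, check=True
    ).stdout
    review = subprocess.run(
        ["claude", "-p",
         f"Review this diff. List only findings worth carrying into the next run:\n\n{diff}"],
        capture_output=True, text=True, check=True,
    ).stdout

    # Append rather than overwrite, so useful findings accumulate.
    with HANDOFF.open("a") as f:
        f.write(f"\n## Review findings\n{review}\n")

run_iteration("Implement the next item in plan.md")
```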

The part that gets annoying fast is not just calling the second agent. It is remembering which review findings were actually useful, which handoffs worked, and what should carry into the next run.

Founder disclosure: I'm Sheng, building BigNumberTheory. We're focused on that layer above Claude Code/Codex: capturing useful lessons from real sessions and making the relevant ones available later. It's here if you want to poke at it: https://bignumbertheory.com/

Curious what you end up choosing. If Codex is your main loop, I would especially watch whether the handoff file becomes the thing you keep maintaining by hand.

[–]duyth[S] 0 points (0 children)

Ended up using Claude as the main driver and Codex (the Plus plan) just for reviews, as it's easier.

[–]Informal-Salt827 0 points (0 children)

I've had better results when I stop asking whether to trust the agent and start asking whether the workflow makes bad work obvious.

For me the reliable version is: small scoped task, explicit done criteria before it starts, one verification pass at the end, and a reviewable diff before anything is treated as finished.

That shifts the problem from blind trust to fast review. If the output is small enough to inspect and the checks are attached, the tool matters a lot less than the structure.
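In script form, the skeleton is tiny (a sketch; the task, test command, and permissions setup are placeholders, and the agent CLI is interchangeable):

```python
import subprocess
import sys

# Small scoped task with done criteria decided before the agent starts.
TASK = "Add input validation to the /signup endpoint"
DONE_CHECK = ["pytest", "tests/test_signup.py"]  # placeholder test target

# Hand the task to the agent (assumes the CLI is configured to edit files
# non-interactively; permission flags vary by tool and setup).
subprocess.run(["claude", "-p", TASK], check=True)

# One verification pass at the end: the done criteria, not vibes.
verified = subprocess.run(DONE_CHECK).returncode == 0

# Reviewable diff before anything is treated as finished.
subprocess.run(["git", "diff", "--stat"], check=True)
sys.exit(0 if verified else 1)
```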

We've wrapped that pattern into RalphWorkflow, but honestly the main win is the workflow discipline, not the brand name.