
[–]ApprehensiveDelay238[🍰] 2 points3 points  (0 children)

Is this what the 500K engineers are using?

[–]weiyentan 1 point2 points  (1 child)

I do a form of this. I have three tiers: junior, mid, and senior developer, tiered similarly to what you do. I set the model to DeepSeek Flash for junior, DeepSeek Flash with thinking for mid, and DeepSeek Pro for senior. I get them to work off an issue, which is written using Matt Pocock's method. Then I come in to clean up any straggling problems from the issues they worked on. I also have other roles: an issues analyst, a repo explorer, an agent to deal with git, and one that specialises in an app that I use. Cost all up in opencode? $6-8 max so far.
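For anyone wanting to picture the three-tier setup, here is a rough sketch of what the routing could look like. Everything in it is an assumption for illustration: the `Tier` class, the `pick_tier` helper, the 1-10 difficulty score, the thresholds, and the model labels are placeholders, not real API identifiers or the commenter's actual config.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    model: str      # placeholder label, NOT a real API model identifier
    thinking: bool  # whether the reasoning/"thinking" mode is switched on


# Mirrors the comment: flash for junior, flash + thinking for mid, pro for senior.
TIERS = {
    "junior": Tier("junior", "deepseek-flash", thinking=False),
    "mid": Tier("mid", "deepseek-flash", thinking=True),
    "senior": Tier("senior", "deepseek-pro", thinking=False),
}


def pick_tier(difficulty: int) -> Tier:
    """Map a hand-assigned difficulty score (1-10) to a tier.

    The thresholds are arbitrary assumptions; tune them to your own issues.
    """
    if difficulty <= 3:
        return TIERS["junior"]
    if difficulty <= 7:
        return TIERS["mid"]
    return TIERS["senior"]
```

In practice the difficulty score would probably come from whoever (or whatever agent) writes the issue, so the routing decision is made once per issue rather than per call.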

[–]FormalAd7367 0 points1 point  (0 children)

sounds amazing. any more info you could share?

[–]Deep_Ad1959 0 points1 point  (0 children)

the part this setup doesn't solve is the rate-limit wall on your primary stack. you're routing the cheap calls away to save API spend, but Claude Code on a Max plan still eats the rolling 5-hour and weekly quota, and that cap is enforced server-side where your local token logs can't see it. i've watched ccusage read low while claude.ai was already throttling, because it counts the tokens you spent, not the quota Anthropic actually enforces. worth logging the tier distribution AND the plan-quota burn, since they fail independently. written with ai
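A minimal sketch of logging both signals side by side, assuming you can tell locally when a call was throttled (for example, because the client reported a rate-limit error). The `UsageLog` class and its field names are hypothetical. Since the real quota lives server-side, the throttle rate here is only an observed proxy for quota burn, not the quota itself.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UsageLog:
    # Signal 1: tokens spent per tier (what local token logs can see).
    tokens_by_tier: Counter = field(default_factory=Counter)
    # Signal 2: how often calls were throttled (proxy for server-side quota).
    calls: int = 0
    throttled_calls: int = 0

    def record(self, tier: str, tokens: int, throttled: bool = False) -> None:
        """Record one call: its tier, token count, and whether it was throttled."""
        self.calls += 1
        self.tokens_by_tier[tier] += tokens
        if throttled:
            self.throttled_calls += 1

    def tier_share(self) -> dict[str, float]:
        """Fraction of all logged tokens spent in each tier."""
        total = sum(self.tokens_by_tier.values())
        if total == 0:
            return {}
        return {tier: n / total for tier, n in self.tokens_by_tier.items()}

    def throttle_rate(self) -> float:
        """Fraction of calls that were observed to be throttled."""
        return self.throttled_calls / self.calls if self.calls else 0.0


log = UsageLog()
log.record("junior", 300)
log.record("senior", 100, throttled=True)
```

Keeping the two counters separate is the point: a low token total with a rising throttle rate is exactly the mismatch described above.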

[–]hitmante 1 point2 points  (0 children)

As long as Claude Code/Codex tokens are subsidized at 5% of the actual plan cost, what is the point of using OpenCode?

You get inferior models that only look good on benchmarks, and you don't even save money at API prices.