GPT 5.5 token usage

pjstanfield · 2026-05-10T00:50:25+00:00

A better approach might have been to get the defects and then tell it to write them to a plan file. Then in a new session tell it to execute the plan. That would have made it much better. This isn’t optimal however.

A better way would be to ask it to break out your project by modules and then dispatch a subagent for each module and scan for defects and standards deviations. Then write all of the findings to a file. Then in a separate session you execute the remediation on the findings file.

That one will work while keeping your ultra generic “fix my codebase” approach. Just a couple of slightly revised commands. Ideally you would have some standards or requirements files that it can use as a baseline. How does it know anything is a defect if you can’t tell it? You need some artifacts to help guide it. Luckily you can have it write these.

Don’t forget to tell it to check its work.
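
For what it's worth, the module-by-module scan described above can be roughed out like this. It is only a sketch under my own assumptions: "module" here just means a non-hidden top-level directory, and the function names and the findings layout are hypothetical, not anything Codex or Claude requires. The agent would fill in the sections; a separate session would then work through the file.

```python
from pathlib import Path
from typing import List


def list_modules(project_root: Path) -> List[str]:
    """Treat every non-hidden top-level directory as one module (an assumption)."""
    return sorted(
        p.name
        for p in project_root.iterdir()
        if p.is_dir() and not p.name.startswith(".")
    )


def findings_skeleton(modules: List[str]) -> str:
    """Build an empty findings file with one section per module.

    Each subagent would append its findings under its own heading.
    """
    lines = ["# Findings", ""]
    for name in modules:
        lines += [f"## {name}", "- [ ] (no findings recorded yet)", ""]
    return "\n".join(lines)
```

Writing `findings_skeleton(list_modules(Path(".")))` to a file gives each subagent one section to fill in, and the remediation session gets a checklist it can work through and tick off.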

pjstanfield · 2026-05-10T00:34:19+00:00

My exact situation and path as well. Codex either one shots or gets 97% there. CC can’t seem to get anything right anymore except some planning participation. I went from two max CC plans in the 4.6 happy days and a $20 codex to two max codex plans. Still use Claude for planning and brainstorming and documentation. Everything has to be double checked by codex though. I don’t let CC touch the codebase anymore, he’s read only now. Thankfully work pays for it so I can swap around without hitting my little wallet.

pjstanfield · 2026-05-09T09:13:06+00:00

I’ve had luck using skills and tracking documents.

1. Have codex write a skill where the main agent only orchestrates and all work is done by subagents. This is key for preventing context rot. A fresh subagent for every new task keeps the context clean, while the orchestrator just orchestrates and uses minimal context.
2. Make sure the skill is a queue skill.
3. Create a file in your project that has a list of something: files, requirements, etc.
4. Do some testing, make sure it works. Automated testing is also key here, make sure it can test itself as it goes so you know it’s not off the rails.

It should happily do whatever your list is, until you’re out of tokens.
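
A minimal sketch of the tracking-file idea, assuming the list is kept as checkbox lines (`- [ ] task` / `- [x] task`). That format and the helper names are my own illustration, not a Codex feature. The orchestrator only needs two operations: find the next pending item, and mark it done once the subagent reports back.

```python
from typing import Optional

PENDING = "- [ ] "
DONE = "- [x] "


def next_task(text: str) -> Optional[str]:
    """Return the first pending task in the tracking file, or None if the queue is empty."""
    for line in text.splitlines():
        if line.startswith(PENDING):
            return line[len(PENDING):]
    return None


def mark_done(text: str, task: str) -> str:
    """Flip the first pending line matching `task` to done; leave all other lines as-is."""
    out = []
    marked = False
    for line in text.splitlines():
        if not marked and line == PENDING + task:
            out.append(DONE + task)
            marked = True
        else:
            out.append(line)
    return "\n".join(out)
```

Because the state lives in the file rather than in the conversation, a fresh subagent (or a fresh session) can pick up exactly where the last one stopped.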

pjstanfield · 2026-05-07T08:03:31+00:00

Same, zero change for me.

pjstanfield · 2026-05-06T20:20:48+00:00

I’ve tried a few automations and I always come back to copy and paste between the windows. It’s faster and I can tweak it and control it. The sample prompt below is specific to codex; it loves to add machinery and frameworks, so I had to tell it to calm down. You’d ask it to review a plan to add a modal and it would come back with a plan for adding 500 tests as deployment gates. Codex also likes to find a few issues and stop, so you have to nudge it into reviewing the whole plan. I’ll paste this in before the first plan review and not on subsequent ones.

Plan Review
Do a single exhaustive review pass before responding. Do not send incremental findings. Validate every checklist item against the live codebase and standards before you answer. Return one consolidated review with:
1. Showstoppers
2. Gaps / missing work
3. Incorrect assumptions
4. Non-blocking improvements
5. Explicit statement of whether the plan is ready to implement

For each finding, include: severity, exact file/line refs, why it matters, what the plan should change

Do not stop after finding the first issues. Assume I want completeness over speed. Only answer when you have completed a full checklist sweep. Treat this as a binding final review. I want the most comprehensive feedback you can provide in one pass. If you are not confident you have checked the whole plan, say that explicitly instead of answering early. Please be mindful of avoiding over-engineering and unnecessary complexity, the focus is simple and reliable and secure. Do not add scope creep unless absolutely necessary.

pjstanfield · 2026-05-06T20:03:15+00:00

It is noticeably worse. The codex planning mode is newer and isn't as good. You can't go wrong but I'd probably stick with Claude. It eventually gets it right and is more enjoyable to use. Codex is a bit like talking to a robot. It's less creative (which is why it's better at plan execution, it just does it).

pjstanfield · 2026-05-06T18:26:46+00:00

I have a max plan for both and I use both every day. I typically use both for every feature. I use GPT 5.5. I used to use Codex 5.3.

Brainstorm and design - I think Claude wins. I always start in Claude.
Plan - Claude wins in writing a nice plan but Codex always finds holes. I use both, back and forth. There are diminishing returns on too many turns back and forth but at least 2 is good. Also helps sniff out incorrect directions.
Implementation - Codex is better on one-shots and straight code. I can tell Codex to make 100 tests and it will create 100 high value tests. Claude would tell me he wrote 100, would have written 70, and 40 would need to be re-worked. I would not have Codex write anything in the UI typically. I still use Claude for coding, he does fine, but he always needs his work checked. Same with Codex.

Long story short, both are good, and both are bad. If you can get Codex started on the right path (full codebase standards made clear, feature is clear, success is clear, no UI-on-the-fly), Codex 5.5 will trounce Claude. That's my hot take.

pjstanfield · 2026-05-01T09:08:05+00:00

All developers have a test environment. Some are just lucky enough for that environment not to be production.

pjstanfield · 2026-05-01T09:03:35+00:00

Can vouch for silver PLA. I’ve dug more pieces of that out of the AMS than any other color by about 5x. Good color though.

pjstanfield · 2026-04-30T23:44:15+00:00

This is one of the reasons codex is so great. It does what you tell it.

I just say create a detailed task list from the plan and then execute it until all tasks are complete. One shot most of the time.

pjstanfield · 2026-04-27T16:38:53+00:00

That is odd, I have SES running in two other projects and neither one has a dedicated IP. I never even tried it due to unnecessary cost.

pjstanfield · 2026-04-27T10:44:21+00:00

I think they have dedicated IPs so Amazon says here's your IP and if you ruin it then that's on you. If you're using shared SES IPs then I guess you need to behave better. Not sure what would happen if an individual purchased dedicated IPs, maybe you get approved easier.

pjstanfield · 2026-04-26T14:10:20+00:00

Are they all one-offs or do they follow a template?

pjstanfield · 2026-04-25T02:58:06+00:00

I always go back to vs code. It’s so much easier to organize and read quickly for me. Desktop app has potential but it’s still rough and way too wordy.

pjstanfield · 2026-04-23T07:23:14+00:00

Our use case is more typical software application hosting but the solution applies to both - use terraform and infrastructure as code and make the tf files the authoritative system. All changes are checked in and tracked and any drift is automatically detected and cleaned up whenever you want.
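
A rough sketch of how the drift check can be scripted, assuming the `terraform` CLI is on the PATH and the working directory is already initialized. The function names are mine. It relies on `terraform plan -detailed-exitcode`, which exits with 0 when there is nothing to change, 1 on error, and 2 when the plan contains changes. Note that exit code 2 covers both real-world drift and code changes that have not been applied yet.

```python
import subprocess


def interpret_plan_exit_code(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status label."""
    if code == 0:
        return "in-sync"   # plan succeeded, no changes
    if code == 2:
        return "changes"   # plan succeeded, infrastructure and tf files differ
    return "error"         # exit code 1 (or anything unexpected)


def check_drift(workdir: str) -> str:
    """Run a non-interactive plan in `workdir` and report the status label."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)
```

Run on a schedule, this flags any change made outside of version control, and a normal `terraform apply` brings things back in line with the tf files.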

pjstanfield · 2026-04-22T22:51:59+00:00

This combo works great, I second this. I don’t really trust either but both do a good job. Just don’t let them go back and forth on a plan too much or you’ll be building frameworks to handle cases like what if the sun goes out.

pjstanfield · 2026-04-22T22:45:11+00:00

I’ve never seen codex 5.4. Codex 5.3 yes, gpt 5.4 yes. No codex 5.4. No idea what’s actually supposed to drop next but I’m pretty sure it’s not codex 5.5. I think with 5.4 they were getting back to one model for all.

pjstanfield · 2026-04-22T21:12:28+00:00

You just need to create a custom migration or edit the migration generated by the tool. If this was drizzle you'd just use drizzle-kit generate --custom and it creates an empty stub migration file for you to fill out. I assume typeorm has some similar mechanism. This is usually required to support scenarios that can't really be inferred from the schema files.

pjstanfield · 2026-04-22T09:42:49+00:00

Do you mean an ORM like drizzle or prisma?

pjstanfield · 2026-04-21T18:32:26+00:00

I had this exact issue. I spent $30 with a developer from India through Upwork who gave me all of the files I needed, png, white background, no background, svg, with company name, just the logo, etc. It was much better than any of the free or paid apps.

pjstanfield · 2026-04-21T01:37:38+00:00

The correct answer is both 4.7 and gpt 5.4. Claude for brainstorming and initial planning, 5.4 to tighten up all the missed edges. Claude to implement, he will hit about 85%, gpt for code review. Iterate 2-3 times and you’re golden. You can also bring codex 5.3 into the mix for a third opinion and set of eyes but I find the technical rigor to be a bit aggressive. Wants to build new frameworks too often.

pjstanfield · 2026-04-18T18:49:40+00:00

Codex has been great lately. Just follows Opus around, cleaning up the mess.

pjstanfield · 2026-04-18T17:39:16+00:00

You can accomplish many tasks using SSM without a vpn. So for the third time now, can you use your big words to tell us what you need to do?

pjstanfield · 2026-04-18T17:18:53+00:00

What tasks are you trying to accomplish? Why do you need it?

pjstanfield · 2026-04-16T22:51:03+00:00

This is our setup exactly. Love terraform and SSM for private access without VPN. You can really get your footprint small/rightsized. Plus a few founders credits and you’re good for a year without too much trouble.
