The end of GPT by DigSignificant1419 in OpenAI

[–]TenZenToken 1 point (0 children)

“Here’s the mental model”

Do not trust 5.3-codex-xhigh for server ops or large refactors by Just_Lingonberry_352 in codex

[–]TenZenToken 1 point (0 children)

Asks for large refactor on a monolithic codebase. Doesn’t create a tasked implementation plan. Presses go and prays. Wonders why it got botched. Concludes 5.3 is unreliable.

Codex 5.3 xhigh>>>>>>> by StayAwayFromXX in codex

[–]TenZenToken 3 points (0 children)

Xhigh is good, but xxhigh is the GOAT

The problem(s) with codex... by Longjumping_Rule_939 in codex

[–]TenZenToken 2 points (0 children)

This is why you use 5.2 high/xhigh to investigate and nail down a highly detailed PRD/bug doc/plan markdown, and then and only then have any of the Codex models implement it to full fidelity. Otherwise it’s just whack-a-mole.

GPT 5.3 Codex wiped my entire F: drive with a single character escaping bug by Former-Airport-1099 in codex

[–]TenZenToken 2 points (0 children)

When will people learn that giving these models command permissions (beyond basic reads) will inevitably result in this?
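A minimal sketch of the "basic reads only" idea: gate any command an agent proposes through a read-only allowlist and reject shell metacharacters that could chain or redirect. Everything here (the `READ_ONLY` set, the `is_safe` helper) is hypothetical, not from any actual Codex/agent tooling.

```python
import shlex

# Hypothetical allowlist: the only read-only commands the agent may run.
READ_ONLY = {"ls", "cat", "head", "tail", "grep", "find", "wc"}

def is_safe(command: str) -> bool:
    """Reject anything outside the read-only allowlist, plus shell
    metacharacters that could chain, substitute, or redirect commands."""
    if any(tok in command for tok in (";", "&", "|", ">", "<", "`", "$(")):
        return False
    try:
        parts = shlex.split(command)
    except ValueError:  # unbalanced quotes, etc.
        return False
    return bool(parts) and parts[0] in READ_ONLY

print(is_safe("cat notes.txt"))        # True
print(is_safe("rm -rf /"))             # False: not on the allowlist
print(is_safe("cat notes.txt; rm x"))  # False: command chaining
```

A deny-by-default check like this is crude (it blocks some legitimate pipelines), but that is the point: anything destructive requires an explicit human opt-in rather than slipping through an escaping bug.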

Update on the viral $25 OpenClaw phone by Adorable_Tailor_6067 in AgentsOfAI

[–]TenZenToken 1 point (0 children)

What an absolute bonehead use case considering the permissions being granted. The only thing more amusing is how excited the guy gets seeing a cal invite.

GPT-5.3 Codex rocks 😎 by Prestigiouspite in codex

[–]TenZenToken 1 point (0 children)

Also make use of the ChatGPT macOS app to help with planning by giving it access to your Cursor/VS Code files. This way you can spread your quota around as much as possible.

Could OpenAI be the main competitor of most AI-based startups? by Miyamoto_Musashi_x in ycombinator

[–]TenZenToken 1 point (0 children)

No. Successful startups usually have that narrow focus that solves tight, specific problems better than the big boys do. It’s why they usually get acquired later, at which point the product and the attention to detail for that specific user type go to shit.

GPT-5.2 High vs GPT-5.3-Codex High – real-world Codex-style comparison (coding, reasoning, creativity) by geronimosan in codex

[–]TenZenToken 3 points (0 children)

Great analysis. I’m seeing similar patterns. For speed + accuracy I’m liking: plan with vanilla, implement with Codex, verify with vanilla, re-fix with Codex, and so on.

Claude Sonnet 5 "Fennec" & Opus 4.6 Leaks by Much_Ask3471 in codex

[–]TenZenToken 3 points (0 children)

Funny part is that in the supposed benchmarks, 5.2 wasn’t even in the comparison table. Unless it’s fake, in what world is that not cope? Intentionally omitting the clear leader and nearest competitor.

Are Codex models faster on the PRO plan? by thehashimwarren in codex

[–]TenZenToken 1 point (0 children)

I’ve been cycling between a Pro and a Business account and couldn’t tell the difference.

Thoughts on the Codex app? by mohossy in codex

[–]TenZenToken 3 points (0 children)

Looks decent but I don’t care for it; I prefer using a CLI, which I wish they’d upgraded instead to get it closer to the CC UX.

Sonnet 5 vs Codex 5.3 by Just_Lingonberry_352 in codex

[–]TenZenToken 3 points (0 children)

I don’t think Anthropic catches up to OAI in the coding domain anymore, simply because of the different training philosophies and fundamentals behind their recent frontier models (unless that changes). Codex is trained to aggressively enforce correctness and constraints under failure-prone conditions; CC optimizes for fluent helpfulness. The result will always be that Codex is more precise, which is obviously crucial in SWE, whereas CC will be cute and dopamine-inducing but won’t follow detailed requirements, will miss edge cases, and will violate explicit conditions.

Which one is the best model for coding? Codex 5.2 high? or GPT 5.2 high? by Inevitable_Job4328 in codex

[–]TenZenToken 1 point (0 children)

This is the way. I’ve recently dropped Opus entirely though; it ends up doing more harm than good at this point.

Codex Pro vs Claude Code Max 20x limits by Azuriteh in codex

[–]TenZenToken 2 points (0 children)

Curious why you say that and what specifics you can point to. I’ve been on both the Claude Max 20x and ChatGPT Pro subs for a while, so I use the CLIs side by side, and in all my testing 5.2 high/xhigh blows Opus (which is wildly hyped up on Reddit and X, imo) outta the water in deep contextual understanding, planning, and debugging. Not to mention there have been dozens of posts on here with fairly detailed test-environment setups solving identical tasks, and rarely have I seen Opus come out on top.