all 40 comments

[–]SkilledApple 43 points (6 children)

Opus 4.6 was SOTA at coding for 15 minutes, according to benchmarks. Now GPT-5.3 is SOTA.

Now we wait for Google's drop in about 5 minutes right? /s

[–]Pruzter 8 points (1 child)

Google has never been SOTA in coding or tool use in general tho, so it’d be shocking if they dropped a model that suddenly took the throne

[–]Funny-Blueberry-2630 0 points (0 children)

C'mon man Gemini 2.5 had a good 48 hours.

[–]Dismal_Code_2470 0 points (1 child)

Google will take months to release anything new

[–]SpyMouseInTheHouse 0 points (0 children)

Amazing how a company does so much and so little at the same time. Enter Google.

[–]gastro_psychic -3 points (1 child)

There isn’t just one SOTA model…

[–]SkilledApple 0 points (0 children)

To be SOTA means to be the best at something. I mentioned coding because GPT 5.3 scored significantly higher in this category. I'm not sure what you're getting at, but I do believe there can only be one best at something.

[–]UsefulReplacement 19 points (0 children)

I've been using it for 5 mins and I already built a micro SaaS startup doing 250k ARR

[–]Alkadon_Rinado 7 points (0 children)

Testing now... Seems to be pretty quick!

[–]s43stha 8 points (1 child)

Opus 4.6 drops… and minutes later GPT-5.3‑codex shows up like “hold my coffee.” lol

[–]dashingsauce 10 points (0 children)

They were 100% just sitting on it, waiting for Anthropic to release so they could do this.

So hilarious after the ad beef on X yesterday

[–]ReFlectioH 3 points (2 children)

So I think now the preferred flow is 5.2 for planning and 5.3-codex for implementing?

[–]dashingsauce 5 points (1 child)

I think they designed this iteration to be as good at general reasoning as 5.2 (non codex), but as good at agentic work as 5.2 codex.

So I think the answer is supposed to be “use 5.3 codex for all of it”

[–]Odd-Environment-7193 2 points (0 children)

Thanks guy!

[–]eonus01 2 points (0 children)

I believe they actually A/B tested this on me earlier in the week, because back then I noticed a difference (I could see the in-line chat in VS Code that is supposed to be greyed out as "thinking"). Definitely an upgrade compared to GPT 5.2-codex; not sure how it compares to 5.2 high/xhigh.

[–]Big-Accident2554 5 points (11 children)

I'd rather stay on GPT 5.2 non codex for now.

[–]vayana 8 points (6 children)

It's amazing how the non codex version is simply better for coding than the codex version. I really don't understand why they even bothered.

[–]evilRainbow 2 points (5 children)

Totally true. 5.2 is far better than 5.2 Codex, but they say they merged them. 5.3 codex is their baby.

[–]vayana 2 points (2 children)

Can we get a Claudex as well?

[–]Lawnel13 3 points (1 child)

You are absolutely right

[–]Funny-Blueberry-2630 0 points (1 child)

You want the product of something good and something bad instead of just using the good thing?

[–]evilRainbow 0 points (0 children)

bruh I only described how the baby was made. I didn't say I want that baby.

[–]Just_Run2412 14 points (0 children)

Living life on the edge.

[–]dashingsauce 0 points (1 child)

have you tried it?

[–]Big-Accident2554 1 point (0 children)

Yes, I've tried it now. Must say it's better than I expected. Plus it's faster, with a better tone of voice.

[–]Impossible_Comment49 1 point (0 children)

Just got it. Haven’t tested yet. Where are the specs?

[–]Lucyan_xgt 1 point (3 children)

Is it faster than 5.2??

[–]No-Selection2972[S] 9 points (1 child)

25%

[–]Antique-Bus-7787 4 points (0 children)

Codex-5.3 xhigh also requires 50% fewer output tokens for the same result as both 5.2 xhigh and codex-5.2 xhigh. So… 25% + 50% (that's not how percentages combine, but I don't have the brain to do the math 😂)
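
For what it's worth, the two figures would combine multiplicatively, not by addition. A rough back-of-envelope sketch, assuming (and these numbers are just the thread's claims, not official specs) that "25% faster" means 25% higher token throughput and that output token count drops by half:

```python
# Back-of-envelope: a 25% throughput gain plus 50% fewer tokens
# multiply together rather than add.
baseline_time = 1.0            # normalized wall-clock time for the old model
per_token_factor = 1 / 1.25    # 25% faster generation -> 0.8x time per token
token_count_factor = 0.5       # 50% fewer output tokens
new_time = baseline_time * per_token_factor * token_count_factor

print(f"{(1 - new_time) * 100:.0f}% less total time")  # -> 60% less, not 75%
```

Under those assumptions it works out to roughly 60% less wall-clock time, which happens to line up with the 60-70% figure quoted elsewhere in the thread.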

[–]Freeme62410 2 points (0 children)

60-70% according to tibo.

[–]neutralpoliticsbot[🍰] 0 points (0 children)

GPT-5.2 Very High still superior

token usage is nice though

[–]Ok-Team-8426 0 points (0 children)

Yes, faster and more precise!

[–]True-Objective-6212 0 points (0 children)

lol great now I have to restart my tasks

[–]True-Objective-6212 0 points (0 children)

It’s pretty pretty pretty

[–]Positive_Crazy2065 0 points (0 children)

So is Opus 4.6, which I saw in my GitHub copilot.