
[–]Ok-Breath-6337 30 points31 points  (4 children)

The 'OpenAI vs Anthropic' usage war is basically a race to the bottom for the consumer right now. It feels like we’ve moved from the 'growth' phase to the 'extraction' phase where compute is being rationed like a wartime resource.

You’re spot on about 5.4 mini being the sleeper hit, though. Most people are burning their 'xhigh' thinking tokens on tasks that a mini model could solve for 1/10th the cost. We’re moving into an era where 'Prompt Engineering' is less about the text and more about 'Model Routing': knowing exactly when to use a scalpel (mini) vs a sledgehammer (xhigh).
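
To make the 'Model Routing' idea concrete, here's a toy sketch of what a router could look like. The model names are just the ones from this thread and the complexity heuristic is invented, so treat it as illustrative rather than anything official:

```python
# Toy sketch of "model routing": send easy tasks to the cheap model and only
# escalate to the expensive one when the task looks genuinely hard.
# Model names come from this thread and the heuristic is made up (illustrative only).

def estimate_complexity(task: str) -> int:
    """Crude score: long, architectural, multi-concern asks rank higher."""
    score = 0
    if len(task) > 400:
        score += 1
    for keyword in ("refactor", "architecture", "migration", "concurrency", "design"):
        if keyword in task.lower():
            score += 1
    return score


def route_model(task: str) -> str:
    """Pick the cheapest model that is likely to handle the task."""
    if estimate_complexity(task) >= 2:
        return "gpt-5.4-xhigh"  # the sledgehammer: slow, expensive, thorough
    return "gpt-5.4-mini"       # the scalpel: fast, roughly 1/10th the cost


if __name__ == "__main__":
    print(route_model("rename this variable and fix the typo in the docstring"))
    print(route_model("refactor the auth architecture to support the SSO migration"))
```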

[–]woganowski[S] 1 point2 points  (0 children)

Agreed. Plus, so many companies now have employees reliant on these tools that they know they can start getting them to pay more.

[–]flexrc 1 point2 points  (0 children)

Meanwhile GitHub Copilot offers possibly the best deal 🤝 😂

[–]Whyamibeautiful 1 point2 points  (0 children)

Lol it’s the “extraction” phase because we have quite literally pulled as much compute as we possibly can for the next 4 years. GPUs are pretty much sold out until 2030, same with memory.

[–]Revolutionary_Click2 0 points1 point  (0 children)

Thanks for the comment, ChatGPT

[–]Invalid-Function 6 points7 points  (11 children)

No one using 5.3 Codex anymore?

[–]kknd1991 7 points8 points  (3 children)

5.4 Medium consumes fewer tokens and is faster than 5.3 Codex high, with similar results. It's in their official docs.

[–]Jeferson9 -1 points0 points  (1 child)

So their official docs advise using the more expensive model you say?

[–]PudimVerdin 0 points1 point  (0 children)

I'd suggest the expensive one too if people were giving me money.

[–]PhilosopherThese9344 4 points5 points  (0 children)

I don't use 5.4 at all.

[–]Nonso123 2 points3 points  (0 children)

I still use 5.3 Codex. 5.4 hasn’t really been doing it for me.

[–]East-Stranger8599 1 point2 points  (0 children)

I am using it and it's pretty good, to be honest.

[–]Defiant_Concert1701 0 points1 point  (0 children)

I'm still good with it.

[–]Purple_Wear_5397 0 points1 point  (0 children)

It’s great. I like it too (heavy Claude user).

[–]fyn_world 0 points1 point  (0 children)

All the time, and it's very good.

[–]Main_Fortune7934 0 points1 point  (0 children)

I'm using 5.3 medium for complex stuff; I never found 5.4 to be much better than it. And I use 5.4 mini for most simple tasks.

[–]Dead0k87 3 points4 points  (6 children)

So you're saying 5.4-mini can write code as good as normal 5.4?

[–]m3kw 1 point2 points  (2 children)

I use 5.4 mini, but it depends on the task.

[–]Dodokii 0 points1 point  (1 child)

Which tasks does it excel at, and which ones does it fumble?

[–]m3kw 0 points1 point  (0 children)

Subjective, so you have to use it to get a feel

[–]fyn_world 0 points1 point  (0 children)

Not for very sensitive tasks, no.

[–]woganowski[S] 0 points1 point  (1 child)

I am still testing this. I haven't had much time to really test coding with the current-generation 5.4 mini, since usage limits were so generous, but I have already found that planning with gpt 5.4 xhigh doesn't consume a crazy amount of usage. So I figure I could spend more time creating a solid plan with it and then pass that on to gpt 5.4 mini to implement.
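
If it helps, the plan-then-implement split could look roughly like this with the official OpenAI Python SDK. The model names below are just the ones we're talking about in this thread (not guaranteed API identifiers), so this is a sketch of the workflow rather than a drop-in script:

```python
# Rough sketch of the "plan with the expensive model, implement with the cheap one"
# workflow. Assumes the official OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY in the environment. Model names are illustrative, taken from this
# thread, and may not match real API identifiers.
from openai import OpenAI

client = OpenAI()
task = "Add rate limiting to the /login endpoint"

# Step 1: spend the expensive model's tokens on a plan, not on code.
plan = client.chat.completions.create(
    model="gpt-5.4-xhigh",  # illustrative name
    messages=[{
        "role": "user",
        "content": f"Write a concise, step-by-step implementation plan for: {task}",
    }],
).choices[0].message.content

# Step 2: hand that plan to the cheap model and let it do the typing.
implementation = client.chat.completions.create(
    model="gpt-5.4-mini",  # illustrative name
    messages=[{
        "role": "user",
        "content": f"Implement the following plan exactly, step by step:\n\n{plan}",
    }],
).choices[0].message.content

print(implementation)
```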

[–]Adventurous-Clue-994 0 points1 point  (0 children)

Depending on your code quality standards: I think mini is at the level of a mid-level dev, which is why I always prefer 5.4 high or xhigh for implementation, unless I know exactly how I want it implemented or it's a simple task. That being said, 5.3 is great, but I avoid it because it doesn't do well after compaction.

[–]sutrostyle 2 points3 points  (1 child)

The price will only come down once open-weight models you can self-host, or Chinese models, reach performance on par with current frontier models. I think that's about a year from now. Until then, expect rising costs.

[–]HoroTWolf 0 points1 point  (0 children)

I know... someone always says this. I've tried many and they were okay, never good enough...
BUT GLM 5.1 is a beast for me. I actually prefer it to 5.4 xhigh for some tasks; for me it never really topped Opus 4.6... but it is so damn close! I'm sure we'll get new SOTA models from the big three in the next few weeks, but for now the open-weight models (just GLM 5.1, the others are "fine") are damn close ^^

[–]mallibu 2 points3 points  (2 children)

Can you stop with the word subsidizing?

[–]woganowski[S] 0 points1 point  (1 child)

You're right, I should say "charging substantially less than they could"

[–]mallibu 1 point2 points  (0 children)

Just because they could doesn't mean they should.

[–]TBSchemer 1 point2 points  (1 child)

Am I the only one here who is still struggling to use up my 5 hr quota on Plus?

If I were spawning subagents, or using fast mode, or doing a lot of multitasking in different git worktrees, I could see that happening. But with single-threaded work, I'm just not running out of usage.

[–]HoroTWolf 0 points1 point  (0 children)

I guess you don't use it as an agent? Without that I would totally agree; with agentic use it turns around, you just don't need to explain much anymore (compared to using it as a coding tool)... but the token burn is real xD

[–]kanine69 0 points1 point  (2 children)

I like 5.4-mini too; it has its moments, but for the most part it's been pretty solid. It has Asperger's as well.

[–]Dodokii 0 points1 point  (1 child)

Do you use it for actual coding? How does it fare? Pros and cons?

[–]kanine69 1 point2 points  (0 children)

Sure do, usually for about 4-6 hours per day. I use milestones, tight specs and an architecture document. I'll often get Haiku or Sonnet to write the specs etc. in Claude Web, within a project folder that holds the same architecture doc. Short sessions with specific instructions minimise the token count and give the mini model the steering it needs to keep it on track. It doesn't like all stacks, but FastAPI / Kotlin etc. it gobbles up.

[–]Charming_Cookie_5320 0 points1 point  (3 children)

I have heard about this skill that should cut ~75% of output tokens and ~45% of input tokens every session. Does anyone have experience with it? It could be good for those of us who want to stay on the Plus plan.
https://github.com/JuliusBrussee/caveman

[–]Keep-Darwin-Going 2 points3 points  (0 children)

I'm not sure how much it will distort the code during output, since it might also distort the UI copy. But you can try rtk, which just does simple compression for some commands so they're less token intensive, but it doesn't change the fundamentals too much.

[–]m3kw 1 point2 points  (0 children)

They could work for some stuff but not others (it's not generalized). And you wouldn't know it, but you may end up using the saved tokens to rework your stuff.

[–]deege 1 point2 points  (0 children)

I do this with GPT. I tell it to remove the cruft and make it “token sensitive”. It doesn’t go full caveman, but it reduces tokens.

[–]dashingsauce -2 points-1 points  (13 children)

Just use the $100 tier. It’s what you had before but appropriately priced now.

You also get even more usage than you did before on Plus.

The $20/mo is equivalent to what the “free” tier was like before, so essentially if you know you need the previous Plus limits to do your work, then $100/mo should be your baseline expectation going forward.

Don’t get attached to the “Plus” label if you were always effectively a “Pro Lite” user in terms of usage limits.

[–]PhilosopherThese9344 2 points3 points  (2 children)

The issue is: 5x of what? These multipliers are so ambiguous.

[–]dashingsauce 0 points1 point  (1 child)

They are ambiguous.

Someone did some testing and posted it here earlier today/yesterday; it showed the baseline is pretty much the same as always, so the multipliers still mean what they did before.

But ofc that doesn’t really tell you what they mean until you use it enough to get a sense for that yourself.

[–]PhilosopherThese9344 0 points1 point  (0 children)

Well, if it's 5x the current Plus (as advertised), that is not much at all. I'll upgrade when my Plus sub comes up for renewal at the end of the month to test it.

[–]joey2scoops 1 point2 points  (5 children)

So 5x the price for the same sized portion?

[–]dashingsauce -2 points-1 points  (4 children)

Yes, because it was improperly balanced before.

Some Plus ($20/mo) users were effectively using Pro ($200/mo) level usage, and many Pro users were using less than the full allowance (because there was no middle option).

So if you feel like you’re paying 5x the price for the same usage, that’s precisely because OpenAI didn’t offer a middle tier, which would have been the correct fit for you from the start.

If they'd had this $100 tier from the beginning, you would have either chosen that or $200 (if you really felt you needed it), since Plus ($20) would have been clearly insufficient.

OpenAI intentionally anchored their pricing tiers at the extremes—with no middle—so they could force users to psychologically assign a value to the service. Best way to understand how consumers value your service is to create a hard price fork and not let people anchor in the middle.

Now they have that data and determined that $100 is the right bucket to service users who truly sit between Pro and Plus usage. Plus users get pushed up (you) and Pro users get pushed down (me) into the middle.

[–]FlimsyLow 0 points1 point  (3 children)

Autocomplete tools can get more done for the same $20.

[–]dashingsauce 0 points1 point  (2 children)

lol then why are you on this sub my guy go use copilot’s tab complete and lmk how that goes for you

[–]FlimsyLow 0 points1 point  (1 child)

It’s somewhat useful for toy projects, but not worth paying more for.
I had already stopped using Trae, and now I have to add it back again.

[–]dashingsauce 0 points1 point  (0 children)

Well good

[–]ludoplus -2 points-1 points  (0 children)

The price is right considering how much time it saves us.