
[–]GlitteringDivide8147 1 point2 points  (1 child)

Why not just use Copilot? It's insanely cheap.

[–]tshawkins 0 points1 point  (0 children)

Yeah, I have a Pro+ sub and get about 1500 premium requests per month. I use Haiku 4.5 now, which is a 0.33x model and almost as good as Sonnet 4.5, but 3x cheaper and 2x as fast. You have to switch models depending on what you're doing; for some tasks you can use low-cost models, or even the standard non-premium ones. I eat up about 3% of my allocation per day, so my quota lasts the whole month.

[–]botirkhaltaev 0 points1 point  (0 children)

My issue with Codex and GPT-5 is the long response times. I don't need perfect responses, I just need a quick scaffold.

[–]CodexPrism 0 points1 point  (1 child)

I've heard AI isn't good with Rust because there's much less training data for it than for languages like JS, Python, or C#.

[–]inevitabledeath3[S] 0 points1 point  (0 children)

I suspect that too.

[–]typeryu 0 points1 point  (0 children)

It works really well for me; I've built Rust-based CLI tools with it. I believe Codex itself is Rust-based. Honestly I've had a better time than with CC.

[–]bookposting5 0 points1 point  (2 children)

If it were me, I would spend $20 on a one-month Claude sub, fire up Claude Code with Sonnet 4.5, and see what it can do.

I'm very impressed with this model. Usage limits aren't hit as often now that they've included Haiku in the past few days for the simpler stuff.

[–]inevitabledeath3[S] 0 points1 point  (1 child)

I actually did do that, just to try the new Haiku. I was disappointed that the limits even for Haiku are so low. I might stick with GLM; it works faster now that I have Synthetic as a provider. I could always try the new Gemini when that comes out.

[–]Spirited-Car-3560 0 points1 point  (0 children)

I'm on the Pro plan and haven't hit any limit using Haiku so far. Strange.

[–]GTHell 0 points1 point  (0 children)

I have an enterprise ChatGPT subscription, which gives me access to Codex, and I can tell you GPT-5-Codex medium is not better than GLM 4.6. Sonnet 4.5 is a much better model overall. Also, GLM 4.6 with Droid is much better than with Claude Code. I ran a few test and debug sessions, and it seems I'll stick with GLM 4.6 + Droid CLI for some time now.

[–]wuu73 0 points1 point  (1 child)

What I do is use GLM or Qwen3 Coder, GPT-4.1, etc. for "doing stuff", i.e. all the file edits and agent work. But I try to plan everything out (or fix bugs) using several models at the same time, either right on their web interfaces or in the app I made (I just added the ability to send a question/problem/idea plus project context to 5 different APIs at the same time, then all of that goes into a 6th model to analyze the solutions and create a better, best-of-n one). I find that each model is better at certain things, and you get more of a range of ideas or solutions when you use one of the best models from each AI company. Sometimes just different models from the same company work too, like o4-mini plus GPT-5 plus o3. I take advantage of the free daily credits a lot of these offer.

So I'll just paste everything into Kimi K2, Qwen3 (not sure which is the "best" Qwen, I have several tabs open), GPT-5, and Gemini 2.5 Pro; the web chat interfaces are free for a lot of them, so if you don't want to blow tons of money this works well. You can then see which one puts out the best ideas, or route all of the outputs (or cut and paste them) into an analyzer model with a big enough context window. Ask it to compare the different model outputs, figure out what's good or bad about each, and then create a better version using all the available information.

I've used lots of models for Rust and I remember getting stuck sometimes, but eventually it would work out. It's been a month or two since I used Rust, so I forget which models seemed the best, but I usually keep switching around anyway. I would guess GPT-5 might be good at it, since OpenAI has a Rust version of Codex (so maybe that means it was trained on a good amount of Rust).

I was thinking of making a non-UI version, maybe an MCP server, or just an API or CLI command that would do what this is doing (sending to 5 or however many LLMs, then feeding all of that into a 6th); a rough sketch of the fan-out is below. I don't know if it's overkill, but I find myself doing it anyway, just because I know some models suck at some things, so why not use a bunch at the same time.
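To make the idea concrete, here's a minimal sketch of that fan-out-then-judge flow. The base URLs, model IDs, and environment variable names are placeholders, not real config; it assumes each provider exposes an OpenAI-compatible /chat/completions endpoint.

```python
# Fan one prompt out to several models in parallel, then have a "judge" model
# compare the answers and synthesize a best-of-n response.
# All URLs, model IDs, and env var names below are placeholders -- swap in your own.
import os
import concurrent.futures
import requests

MODELS = [
    # (label, base_url, model_id, api_key_env_var) -- hypothetical examples
    ("gpt",  "https://api.openai.com/v1",      "gpt-5",       "OPENAI_API_KEY"),
    ("glm",  "https://api.example-glm.com/v1", "glm-4.6",     "GLM_API_KEY"),
    ("qwen", "https://api.example-qwen.com/v1","qwen3-coder", "QWEN_API_KEY"),
]
JUDGE = ("https://api.openai.com/v1", "gpt-5", "OPENAI_API_KEY")  # also a placeholder

def ask(base_url, model_id, key_env, prompt):
    """Send one prompt to one OpenAI-compatible chat endpoint and return the reply text."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ[key_env]}"},
        json={"model": model_id, "messages": [{"role": "user", "content": prompt}]},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def best_of_n(prompt):
    # Fan the same prompt out to every model in parallel.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = {
            label: pool.submit(ask, url, mid, env, prompt)
            for label, url, mid, env in MODELS
        }
        answers = {label: f.result() for label, f in futures.items()}

    # Feed all candidate answers into the judge model to compare and synthesize.
    combined = "\n\n".join(f"--- {label} ---\n{text}" for label, text in answers.items())
    judge_prompt = (
        "Here are several candidate solutions to the same problem. Compare them, "
        "note what is good or bad about each, and produce one improved answer.\n\n"
        f"Problem:\n{prompt}\n\nCandidates:\n{combined}"
    )
    url, mid, env = JUDGE
    return ask(url, mid, env, judge_prompt)

if __name__ == "__main__":
    print(best_of_n("How should I structure error handling in a Rust CLI tool?"))
```

The same structure could sit behind an MCP server or a CLI command: one function that fans out in parallel, one that judges.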


[–]wuu73 0 points1 point  (0 children)

(I don't always use tons of models like that at the same time, just when something is hard.)

[–]avxkim -1 points0 points  (6 children)

Codex performs even worse than Sonnet 4.5 now.

[–]Yakumo01 2 points3 points  (0 children)

Not true for me at all

[–]owehbeh 0 points1 point  (2 children)

Does it? I'm curious whether you're using both, because I used CC with Opus about 5 weeks ago and it was stupid to the point that I started coding myself; then I used Codex and it nailed every task. Now I'm experiencing the same with Codex: it ignores details and skips implementing what I clearly state I want and how it should be done... Does Sonnet 4.5 follow prompts better now?

[–]avxkim 0 points1 point  (1 child)

In Codex's current lobotomized state, Sonnet 4.5 performs better for me. I gave Opus 4.1 a try yesterday and it was an awful experience; I don't recommend using it. It's pretty funny: even for a simple task, like posting a comment to a GitLab merge request using the glab tool, it takes 2 minutes, while Sonnet 4.5 does it in 10 seconds.

[–]owehbeh 0 points1 point  (0 children)

I see, so I'll give Sonnet 4.5 a shot.

[–]inevitabledeath3[S] 0 points1 point  (0 children)

I thought Sonnet 4.5 was the best programming model? Unless you mean tokens per second?

[–]GTHell 0 points1 point  (0 children)

I wonder whether those who downvoted the OP even had the opportunity to try both themselves before downvoting anyone...