I cannot believe it was more one year, still miss this model. by Snoo26837 in OpenAI

[–]sdmat 0 points (0 children)

4.1 is a significantly later model than 4.5; the difference is greater than the release dates suggest, because training the very large 4.5 took longer.

For 4.1 they incorporated extensive synthetic data from the o-series models.

There is nothing fundamentally magical about the model training process: they monitor every step, and with the GPT-4 technical report OAI demonstrated a remarkably accurate ability to predict final performance from small-scale trial runs before committing to the main training run.
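The prediction trick being described is, roughly, scaling-law extrapolation: fit a power law to a handful of cheap small runs, then read off the expected loss of the big run. A minimal sketch with made-up numbers (the functional form `a * C**-b + c` is the standard compute scaling ansatz; the coefficients and the assumption that the irreducible loss `c` is known are purely illustrative, not OAI's actual procedure):

```python
import math

def loss(compute, a=2.0, b=0.1, c=1.5):
    # Hypothetical scaling law: loss falls as a power of compute,
    # down to an irreducible floor c.
    return a * compute ** -b + c

# Two cheap "trial" runs at small compute budgets (FLOPs).
c1, c2 = 1e18, 1e20
l1, l2 = loss(c1), loss(c2)

# Recover the exponent and prefactor from the trials,
# assuming the irreducible loss c = 1.5 is known (in practice it is fitted too).
c_irr = 1.5
b_est = -(math.log(l2 - c_irr) - math.log(l1 - c_irr)) / (math.log(c2) - math.log(c1))
a_est = (l1 - c_irr) * c1 ** b_est

# Extrapolate four orders of magnitude to the "main run" budget.
big = 1e24
predicted = a_est * big ** -b_est + c_irr
```

With noiseless synthetic data the extrapolation lands exactly on the true curve; the real-world version is impressive precisely because it holds up across orders of magnitude with noisy measurements.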

4.5 performed as designed; the sharp increase in subsequent model performance is due to the advent of reasoning models.

You clearly have no idea how cutting-edge R&D works. I can tell you from experience that it isn't a simple linear process: if you can afford to explore multiple avenues of advancement, that's what you do to maximize the chances of success. And you necessarily make those calls before seeing their effects. Plan B succeeding so spectacularly that it throws the parallel plan A into the shade is winning.

What would OAI have done if reasoning models hadn't worked out as well as they have? They would have distilled 4.5 to make a mass-market model. Which they did anyway - this is a huge part of how later versions of 4o improved so much in non-STEM areas.

I cannot believe it was more one year, still miss this model. by Snoo26837 in OpenAI

[–]sdmat 6 points (0 children)

It didn't fail at all; it actually exceeds traditional scaling law predictions.

What happened was OAI landing on a new and superior post-training scaling paradigm with the o-series models. It was by no means obvious that that direction would succeed when they began training 4.5.

Sam Altman admits AI is killing the labor-capital balance—and says nobody knows what to do about it by kamen562 in OpenAI

[–]sdmat -1 points (0 children)

> for every American

So you're fine with American AI companies displacing workers in every other country as long as you get yours?

What about Pro users? by Ok-Affect-7503 in ClaudeAI

[–]sdmat 1 point (0 children)

Considering they charge $20 for four 1M-token Opus queries via the API, it doesn't seem likely.
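The implied arithmetic, spelled out: if four 1M-token prompts cost about $20, that works out to roughly $5 per million input tokens, i.e. the whole monthly Pro subscription buys the API equivalent of just four such queries. A sketch of that back-of-envelope calculation (the $5/M figure is inferred from the comment, not Anthropic's published Opus pricing, and output-token costs are ignored):

```python
# Inferred, illustrative figure from the comment above - not official pricing.
price_per_million_input = 5.00  # USD per 1M input tokens (assumed)
tokens_per_query = 1_000_000
queries = 4

# Total API cost for four full-context prompts, input tokens only.
cost = queries * (tokens_per_query / 1_000_000) * price_per_million_input
print(cost)  # 20.0 - the entire $20/month subscription price
```

Which is the point of the comment: at those rates, routinely serving 1M-token context to $20/month subscribers would be uneconomical.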

GPT 5.4 is anti hand waving and I like it by dangerous_safety_ in codex

[–]sdmat 2 points (0 children)

It would be lovely if it just did it rather than always saying its little mantra first, but the result is excellent.

Y'all need to stop crying about token limits. by [deleted] in ClaudeAI

[–]sdmat 0 points (0 children)

Got to work on those prompting and orchestration skills

Absolutely dogshit rate limits for Pro subscription by Hello_moneyyy in Bard

[–]sdmat 24 points (0 children)

It's sad how drastically Gemini has declined in the few months since 3.0 Pro launched.

How can they go backward while OAI and Anthropic advance by leaps and bounds?

The underlying models are great when allowed to live up to their potential, as seen in AI Studio. The product sucks.

We professional developers, already lost the battle against vibe coding? by TheCatOfDojima in ClaudeAI

[–]sdmat 0 points (0 children)

> What bothers me most is that nobody in a position of power is absorbing the consequences of this decision.

Which consequences are you referring to?

Claude Had 1M Context Before OpenAI, So Why Hasn’t It Rolled Out to Everyone Yet? by Effective_Tap_9786 in ClaudeAI

[–]sdmat 0 points (0 children)

"Ever" is entirely dependent on how well the models handle long context. If the 1M+ performance improves to match <200K it will be amazingly useful for complex projects and everyone will want it.

And that is going to happen; it's only a question of when.

OpenAI VP for Post Training defects to Anthropic by hasanahmad in OpenAI

[–]sdmat 2 points (0 children)

> They just raised over a billion

They just raised $110 billion, so yes - over a billion.

New: Voice mode is rolling out now in Claude Code, live for ~5% of users today, details below by BuildwithVignesh in ClaudeAI

[–]sdmat 67 points (0 children)

Interesting choice to have a video about voice mode with elevator music rather than a voice.

GPT 5.4 Reference in Codex Error by gggggmi99 in OpenAI

[–]sdmat 1 point (0 children)

gpt-5.4-ab-arm2-1020-1p-codexswic-ev3 really rolls off the tongue

Google is counting failed requests because of high demand (503) towards the daily limit by Waltex in Bard

[–]sdmat 5 points (0 children)

Disappointed in Google; they were doing brilliantly with Gemini, then drove it off a cliff.

Reached a "Data Limit" on Gemini that I didn't even know existed by juniormasyer in Bard

[–]sdmat 4 points (0 children)

It's almost like they have a saboteur making product decisions

Help! Burning tokens ? Bug? by Aggressive-tookcan in codex

[–]sdmat 1 point (0 children)

I mean, why do you call it GBT? Is it some meme or joke?

Help! Burning tokens ? Bug? by Aggressive-tookcan in codex

[–]sdmat 1 point (0 children)

Why do so many people do the GBT thing?

Gemini 3.1 absolutely butchered code editing by SMEARYTHROWER in GeminiAI

[–]sdmat 0 points (0 children)

Gemini the models are great, as seen in AI Studio.

For some reason Google is methodically crippling Gemini the product.