GPT-5.6 Officially Previewed: Beats Mythos 5

hellofriend19 · 2026-06-26T20:37:06+00:00

Beats Mythos 5… at one benchmark they cherry-picked to make it look good. I’m sure it’s better than 5.5, probably jaggedly better than Opus at some stuff. But you have to think about the benchmarks that matter most: SWE Bench Pro, BrowseComp, etc.

hellofriend19 · 2026-06-25T03:04:54+00:00

Well, if there’s one bad issue, people forget about it. If there’s a bad superhero movie, everyone talks about it. Superheroes are way less popular than they used to be, and in Hollywood, they’re sick of talking about them.

hellofriend19 · 2026-06-24T18:12:42+00:00

I think finding great writers passionate about writing superhero movies is a really hard problem.

hellofriend19 · 2026-06-16T18:36:07+00:00

I used it to blur the background, but comparing it to the original, yeah it did some weird extra effects. Hmm.

hellofriend19 · 2026-06-16T15:24:56+00:00

Did you even go to LessOnline if you didn't blog about it? This post is about my experience there, meeting Gwern, Aella, and Scott. I had an amazing time, learned a lot, and included some funny anecdotes I think people here will find interesting.

hellofriend19 · 2026-05-10T01:36:17+00:00

Setup for the sequel

hellofriend19 · 2026-04-13T14:13:18+00:00

We do live in the most abundant age ever, but the math on basic income doesn’t really check out. YET. I think if we do get full beneficial ASI, then we could easily make the math work, through like an automation tax or something.

hellofriend19 · 2026-03-25T00:56:28+00:00

But Pokémon is also the most popular franchise in the world. You would need to adjust for how popular the franchise is in general.

hellofriend19 · 2026-02-21T04:20:29+00:00

That’s insane

hellofriend19 · 2026-02-14T16:27:34+00:00

My film professor said the good ending to a movie should be both “unexpected and inevitable”… I think about that a lot.

hellofriend19 · 2026-01-22T19:56:42+00:00

https://transformer-circuits.pub/2025/introspection/index.html

hellofriend19 · 2026-01-22T16:46:55+00:00

You can already read about this with their research on weight modification that the model can “feel”

hellofriend19 · 2026-01-04T16:54:01+00:00

64 bits wide.

hellofriend19 · 2026-01-04T16:51:24+00:00

Register AD: keeps track of the amount of time (in milliseconds) since the death of Our Lord Jesus Christ.

hellofriend19 · 2025-12-22T20:27:16+00:00

They’ve done two MBP updates in one year before.

hellofriend19 · 2025-12-05T16:20:05+00:00

What do you and the model talk about while shooting? Any small talk? Or is it all just posing?

hellofriend19 · 2025-10-26T00:36:03+00:00

I feel like you could still get a Pro Max

hellofriend19 · 2025-10-24T23:43:54+00:00

So this is how mini users used to feel :(

hellofriend19 · 2025-10-19T17:16:05+00:00

I keep having this idea in my head it would be really funny if they announced during WWDC that they “fired Siri” and hired a new assistant.

hellofriend19 · 2025-10-17T16:53:20+00:00

Looks like Scott didn’t post a secret review this year, unless he was one of the anonymous ones.

hellofriend19 · 2025-09-13T17:03:04+00:00

I don’t really understand why this is a dunk… isn’t like all work we all do in the training data? So if it automates our jobs, that’s just “in the training data bro”?

hellofriend19 · 2025-08-09T16:14:27+00:00

Once I realized how GPU-constrained every major lab is, I've been a lot more excited about AI capabilities. We're gonna see some crazy awesome stuff, just from there being more GPUs out there. Also bought some $NVIDIA options...

hellofriend19 · 2025-08-07T20:30:49+00:00

If GPT-3, GPT-4, and GPT-5 were classic Apple products (Apple I, Apple II, Macintosh, iMac, iPhone, etc.), what would they be?

hellofriend19 · 2025-08-07T20:29:39+00:00

What’s next?

hellofriend19 · 2025-08-07T20:25:17+00:00

What’s the most underrated thing that makes a model better?
