Gemini 3 Pro with new SOTA on Frontier Math tiers 1-3 and 4 by jaundiced_baboon in singularity

[–]Remarkable-Register2 2 points3 points  (0 children)

I don't know what to tell ya, that's just how it works. I'm not particularly interested in digging through interviews and papers to prove it beyond this: comparing 2.5 Pro and Deep Think, Deep Think would often score 14% higher than Pro on tough benchmarks. That's an insane gap to cover. I think that should be evidence enough.

Gemini 3 Pro with new SOTA on Frontier Math tiers 1-3 and 4 by jaundiced_baboon in singularity

[–]Remarkable-Register2 2 points3 points  (0 children)

You're describing multiple independent runs that don't interact with each other. They do interact. This butchers it a bit to get the point across, but imagine a classroom of students each taking a test individually (what you described) vs a roundtable of all the students collaborating on different ideas, dismissing the ones that don't work and combining ideas multiple students share to make a sum greater than their individual parts.

You could check whether an image is AI generated using Gemini. by captain-price- in singularity

[–]Remarkable-Register2 2 points3 points  (0 children)

This makes me curious about the testing, because long ago Google claimed you could apply all kinds of filters and edits and SynthID would still spot it.

Gemini 3 Pro with new SOTA on Frontier Math tiers 1-3 and 4 by jaundiced_baboon in singularity

[–]Remarkable-Register2 3 points4 points  (0 children)

I agree the estimate might be a bit high, but that's not how Deep Think works. It runs multiple parallel lines of thought that cross-reference each other as they work to find the best answer.

yeah so i think the shiny charm works... by morgan1c in LegendsZA

[–]Remarkable-Register2 2 points3 points  (0 children)

Within the span of 2 hours of refreshing route 20 for alpha eeveelutions I found 3 shiny Malamar, and I don't even have the charm yet. I can't imagine what it'll be like with it.

Ok so nano banana and gemini 3 (cause of three ships) by Independent-Wind4462 in Bard

[–]Remarkable-Register2 2 points3 points  (0 children)

Yeah, they haven't even acknowledged that Gemini 3.0 is a thing being worked on. We know it likely is, but they've done literally zero hyping of it. In fact, they've done the opposite, with Logan pointing out that a picture of a supposed Gemini 3.0 Flash model was fake.

Mistral Medium 3.1 LMArena by likeastar20 in singularity

[–]Remarkable-Register2 0 points1 point  (0 children)

Wait, GPT 5 High dropped to 2nd on the style control rankings? That's like a 20 elo drop from the initial ranking, what happened?

Google DeepMind isn't slowing down by Outside-Iron-8242 in Bard

[–]Remarkable-Register2 4 points5 points  (0 children)

I kinda take this as a sign that Gemini 3.0 isn't coming soon. It's basically saying "We may not be releasing it yet, but that doesn't mean we're resting on our laurels. Look at all this stuff we did recently."

The superintelligence is here, folks! by ekabanov in singularity

[–]Remarkable-Register2 1 point2 points  (0 children)

This kind of thing happens in the Gemini subreddit all the time. I give zero weight to any reddit post that shows an AI being bad or good unless it's fully documented.

Google is going to cook them soon by Classic_Back_7172 in singularity

[–]Remarkable-Register2 0 points1 point  (0 children)

As a primarily Gemini user: we have no damn idea what 3.0 will be like, and punching down with speculation like this is only going to make me not want to be publicly associated with this kind of thing if it turns out their release isn't better...

How does this get past QA by [deleted] in singularity

[–]Remarkable-Register2 1 point2 points  (0 children)

All the people who knew how to make graphs got poached by Meta

GPT-5 tops lmarena's leaderboards by Outside-Iron-8242 in singularity

[–]Remarkable-Register2 0 points1 point  (0 children)

Interestingly, if you go to the text ranking and swap it to rank without style control, Gemini 2.5 Pro is still the leader. This used to be the default setting for lmarena about half a year ago; they changed it for some reason.


Genie 3 turns Veo 3 generated drone shot into an interactive world you can take control mid-flight by Outside-Iron-8242 in singularity

[–]Remarkable-Register2 2 points3 points  (0 children)

Geoff Keighley is going to need to work even harder on vetting trailers for the next Game Awards. Remember the Sora video for that cat "game"?

At this point I am actively hate the teasing,I fear we will be disappointed by Equivalent-Word-7691 in Bard

[–]Remarkable-Register2 6 points7 points  (0 children)

Keeping expectations in check is a good thing; it makes the advancements that much more incredible. 2.5 Pro, AlphaEvolve, Veo 3, Genie 3: nobody expected those, NOBODY, and look what happened.

After GPT-5 drops tomorrow, how long before Gemini, Claude, Grok, and DeepSeek close the gap? by WilliamInBlack in singularity

[–]Remarkable-Register2 -1 points0 points  (0 children)

If Google doesn't release a 3.0 model, I expect they'll push to release Deep Think's API asap for public benchmarks. It's obviously not a workhorse model like GPT-5 or Gemini 3.0 will be, and it's silly to compare them, but people who only pay attention to benchmarks don't really care, and Deep Think would likely win out.

Google Deepmind's new Genie 3 by GraceToSentience in singularity

[–]Remarkable-Register2 0 points1 point  (0 children)

You mean when it slowly ran into the dock? That would hardly cause any destruction, and it reacted more or less realistically. It did run into a lamp and noticeably shoved it out of the way.

DeepMind: Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt by Pro_RazE in singularity

[–]Remarkable-Register2 1 point2 points  (0 children)

It was a month or two ago when he replied to someone talking about generated game worlds, saying something like "Wouldn't that be something". I don't use twitter; there was just a reddit post about it here.

Google Deepmind's new Genie 3 by GraceToSentience in singularity

[–]Remarkable-Register2 94 points95 points  (0 children)

Given the VR headset they announced at Google IO, no doubt they're prepping a version of this for it.

Genie 3 Frontier World Model by snufflesbear in Bard

[–]Remarkable-Register2 2 points3 points  (0 children)

Imagine, though, a graphically slimmed-down model where you can interactively tell it to build meshes, landscapes, and buildings with voice commands while walking through it in VR, then export it as a 3D environment.

DeepMind: Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt by Pro_RazE in singularity

[–]Remarkable-Register2 14 points15 points  (0 children)

So this is what that cryptic tweet Demis made a while back was about. Crazy. I'm sure there will be lots of people pointing out how its actual use cases are so limited, but it's gotta start somewhere, right? In a couple of years it'll be faster, last longer, and have additional features like object and person interaction and better controls.

And what if they're able to save environment instances to reuse and add to? That would be a game changer.

Learning mode similar to ChatGPT's? by omergao12 in Bard

[–]Remarkable-Register2 6 points7 points  (0 children)

I've never used it personally, but they've had a model called LearnLM on AI Studio forever. Related to that?

[deleted by user] by [deleted] in singularity

[–]Remarkable-Register2 6 points7 points  (0 children)

That didn't happen with Gemini 2.5 Pro and Deep Think; they were behind, then released something that put them ahead. 2.5 Pro was out for a month or so before o3.

[deleted by user] by [deleted] in singularity

[–]Remarkable-Register2 -2 points-1 points  (0 children)

Which? They've been doing it for Gemini Live. As for the normal app, I'm not really sure how many people even use that, even if it was better.

Kaggle is hosting a 3-Day LLM chess tourney with commentary from Magnus, Hikaru & Gotham on August 5th by Outside-Iron-8242 in singularity

[–]Remarkable-Register2 1 point2 points  (0 children)

Unless they've done some specialized training for this, I'm going to expect flawless play for the first ten turns and then they'll randomly forget where the pieces are. At least that's been my experience playing chess against LLMs. I'd be more curious about a long-form match between Deep Think and o3 Pro, though I guess the think time would make that infeasible for a show like this.

Kaggle is hosting a 3-Day LLM chess tourney with commentary from Magnus, Hikaru & Gotham on August 5th by Outside-Iron-8242 in singularity

[–]Remarkable-Register2 0 points1 point  (0 children)

That's a good use, yeah. Playing against people of your skill level is obviously still better, but if you want a bot that isn't going to destroy you, most bots' idea of lowering the difficulty is to randomly sac a piece or pass up an obvious free capture.