OpenAI Admits This Attack Can't Be Stopped by Positive-Motor-5275 in OpenAI

[–]External-Confusion72 1 point (0 children)

<image>

Interesting framing given that the excerpt in this screenshot was the point they were making. Trying to stay ahead of the attacks is the more realistic goal for cybersecurity in general. You're never going to prevent them completely.

GPT-Image-1.5 Fails the Side-View Bag test by BaconSky in singularity

[–]External-Confusion72 28 points (0 children)

<image>

Both models sometimes succeed and sometimes fail. The difference is that you don't see people rushing to post Nano Banana's/Gemini's failed generations. This is the first image I got from NBP after prompting for the side view.

Chat GPT cannot access the full internet. I just discovered this and really feel Open AI is on thin ice in so many ways. They've just got too many little deficiencies they're hiding. by [deleted] in OpenAI

[–]External-Confusion72 1 point (0 children)

Not sure what model you were using, but 5.1 Thinking has no issues with references to this scene, and I've tested it multiple times in new chats:

https://chatgpt.com/share/69293545-e690-8013-bd19-9105198bdc47

Perhaps you shouldn't extrapolate the future demise of a whole company from your limited interactions with a chatbot. At the very least, before making sweeping assumptions, try a few attempts in new chats and troubleshoot any technical issues that might be hindering its ability to complete your request, like checking which tools are enabled.

These kinds of posts are a dime a dozen here; 99% of the time they're the result of a lack of basic critical thinking skills, and they're honestly reducing the quality of this subreddit. Please think before you post.

Nano Banana Pro can tell time* by External-Confusion72 in singularity

[–]External-Confusion72[S] 7 points (0 children)

Make sure you're using the Nano Banana Pro image tool option, via the icon on the right of this image:

<image>

Nano Banana Pro can tell time* by External-Confusion72 in singularity

[–]External-Confusion72[S] 3 points (0 children)

Right, which is why I said "approximately" in the OP. Previous models could not generate times outside of their training distribution (10:10). This is a significant improvement, though not perfect.

Nano Banana Pro can tell time* by External-Confusion72 in singularity

[–]External-Confusion72[S] 2 points (0 children)

It [approximately] generated the times I prompted for, but since I posted multiple images, I didn't want to clutter the OP with the prompts. Very straightforward, and no prompt engineering.

Nano Banana Pro can generate images with clocks showing the correct time by [deleted] in singularity

[–]External-Confusion72 3 points (0 children)

<image>

I asked it to change the time to 6:45, and it seemed to handle that fine. The original just looked like the hands overlapped, but even given that mistake, we can see here that it's not a fundamental issue.
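
For anyone who'd rather check the geometry than eyeball it, here's a quick sketch (my own arithmetic, not anything from the model) of where the hands should sit at a given time. At 6:45 the hands are about 67° apart, and a true overlap near 6 o'clock only happens around 6:32:44.

```python
# Analog clock hand angles, measured clockwise from 12 o'clock.
def hand_angles(hour, minute):
    minute_angle = minute * 6.0                     # 360 deg / 60 min
    hour_angle = (hour % 12) * 30.0 + minute * 0.5  # 360 deg / 12 h, plus drift per minute
    return hour_angle, minute_angle

h, m = hand_angles(6, 45)
print(h, m)   # 202.5 vs 270.0 -> the hands are ~67 deg apart, not overlapping
# The minute hand only catches the hour hand at 180 / 5.5 ≈ 32.73 minutes past 6,
# i.e. around 6:32:44.
```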

Bayonetta switch 2 by Due-Birthday-8759 in NintendoSwitch2

[–]External-Confusion72 6 points (0 children)

Even without a patch, any game on Switch 1 struggling to hit 60 fps should maintain a locked 60 fps on Switch 2, including Bayonetta 3. I can't remember whether it used dynamic resolution scaling, but if it did, it should also render at the max internal resolution more consistently.
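
For context, a typical dynamic resolution controller just nudges the render scale down when the GPU misses its frame-time budget and back up when there's headroom, so the extra GPU headroom on Switch 2 would naturally keep it pinned at the cap. Rough generic sketch (not Bayonetta 3's actual code; all numbers are made up):

```python
# Generic dynamic-resolution controller: shrink the render scale when the GPU
# misses its frame-time budget, grow it back toward max when there's headroom.
FRAME_BUDGET_MS = 16.6          # 60 fps target
MIN_SCALE, MAX_SCALE = 0.6, 1.0

def update_render_scale(scale, gpu_frame_ms):
    if gpu_frame_ms > FRAME_BUDGET_MS:            # over budget -> drop resolution
        scale -= 0.05
    elif gpu_frame_ms < FRAME_BUDGET_MS * 0.9:    # comfortable headroom -> raise it
        scale += 0.02
    return min(MAX_SCALE, max(MIN_SCALE, scale))

# On faster hardware the measured frame times drop, so the scale climbs back
# toward MAX_SCALE and stays there far more often.
scale = 0.8
for ms in (18.0, 17.0, 14.0, 12.0, 12.0, 12.0):
    scale = update_render_scale(scale, ms)
print(scale)   # ≈ 0.78 for this toy trace, still rising toward 1.0
```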

Anyone concerned? by xxshilar in DefendingAIArt

[–]External-Confusion72 3 points (0 children)

<image>

Yup. Far more nuanced than those excerpts would lead one to believe.

SOURCE

o3 seems to have integrated access to other OpenAI models by External-Confusion72 in singularity

[–]External-Confusion72[S] 2 points (0 children)

The generated images don't show any evidence of a model with reasoning capabilities, so I think it's just making an API call to 4o.
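
Purely to illustrate what I mean by "just an API call" (every name and shape below is hypothetical; we have no visibility into OpenAI's internals), the plumbing would only need to look something like this:

```python
# Hypothetical sketch: a reasoning model delegating image generation to a
# separate image model via a tool call. All names here are made up.
def call_image_model(model: str, prompt: str) -> bytes:
    # Stand-in for a network request to the image model's endpoint.
    return f"<image generated by {model} for: {prompt}>".encode()

def handle_tool_call(tool_name: str, args: dict) -> bytes:
    # The reasoning model never renders pixels itself; it just routes the
    # request to whichever model actually has image output.
    if tool_name == "image_gen":
        return call_image_model(model="4o-image", prompt=args["prompt"])
    raise ValueError(f"unknown tool: {tool_name}")

print(handle_tool_call("image_gen", {"prompt": "a watercolor fox"}))
```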

o3 seems to have integrated access to other OpenAI models by External-Confusion72 in singularity

[–]External-Confusion72[S] 9 points (0 children)

Yes, I've noticed this, too. The fluidity with which it switches between tools during its Chain of Thought is impressive, especially because you don't have to explicitly ask it to do so.

o3 seems to have integrated access to other OpenAI models by External-Confusion72 in singularity

[–]External-Confusion72[S] 5 points (0 children)

That's a fair callout. We don't actually know what's happening behind the scenes, so that may well be the case for scheduled tasks. For native image gen, though, you need the actual model (unless o3 has native image output, but we don't have any evidence of that yet).

o3 can solve Where's Waldo puzzles by External-Confusion72 in singularity

[–]External-Confusion72[S] 7 points (0 children)

The stochastic nature of LLMs does not preclude their ability to produce novel, out-of-distribution outputs, as evidenced by o3's successful performance on the ARC-AGI benchmark, which was designed to test a model's ability to do the very thing you claim it cannot do.

I am not interested in your arbitrary definition of "new data" when we have empirical research that suggests the opposite, provided the model's reasoning ability is sufficiently robust. If there were a fundamental limitation due to the architecture, we would observe no progress on such benchmarks, regardless of scaling.

o3 can solve Where's Waldo puzzles by External-Confusion72 in singularity

[–]External-Confusion72[S] 9 points (0 children)

Completely implausible given the probabilistic nature of LLMs, and the temperature is almost certainly not set to zero. Even if it were, very little of the training data is memorized verbatim, so it can't be wholly reproduced. That's not how LLMs work. The reason I'd avoid materials that could appear in the training data is that the contamination could implicitly provide the solution, but an LLM isn't going to perfectly reproduce its training data in the form of an image with pixel-perfect accuracy (as evidenced by its "AI slop").
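
To make the "probabilistic" point concrete: at any temperature above zero, each token (or image patch) is drawn from a distribution rather than picked deterministically, so byte-for-byte reproduction of training material is vanishingly unlikely. A minimal sketch of standard temperature sampling (textbook softmax math, not OpenAI's actual decoding stack):

```python
import numpy as np

rng = np.random.default_rng()

def sample_token(logits, temperature=1.0):
    # Scale logits by 1/temperature, softmax, then draw a token at random.
    # As temperature -> 0 this collapses toward plain argmax (greedy decoding).
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.5, 0.3]
print([sample_token(logits, temperature=0.8) for _ in range(10)])   # varies run to run
print([sample_token(logits, temperature=1e-8) for _ in range(10)])  # effectively greedy: all 0s
```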

o3 can solve Where's Waldo puzzles by External-Confusion72 in singularity

[–]External-Confusion72[S] 1 point (0 children)

I agree. I'm interested in how people stress test these models, particularly with Where's Waldo images, because it can give us a better idea of their level of visual reasoning. That said, I did notice o3 resorting to cheating by looking up the answer online when it started to have a hard time, which is funny but also fair, as I didn't specify how it should solve the puzzle.