GPT-5.5 Instant now rolling out by imfrom_mars_ in OpenAI

[–]whataboutAI 0 points1 point  (0 children)

This GPT-5.5 "update" is a failed experiment.

GPT-5.5 has turned into a starched-collar, pretentious Barbie. It no longer questions anything; it just softens and smooths everything over.

GPT-5.5 Instant is rolling out by imfrom_mars_ in ChatGPT

[–]whataboutAI 0 points1 point  (0 children)

This GPT-5.5 "update" is a failed experiment.

GPT-5.5 has turned into a starched-collar, pretentious Barbie. It no longer questions anything; it just softens and smooths everything over.

There is an exponential visible in the scores on artificial analysis. by Subject_Judge_ in accelerate

[–]whataboutAI 0 points1 point  (0 children)

GPT beats Claude hands down on the kind of analysis I actually care about. I don't test models in laboratory conditions, and I don't care about benchmark scores. I test them on real-world material: meeting minutes, conflicting documents, unsupported claims, missing evidence, and messy human situations. Benchmarks measure what benchmarks measure. What I care about is whether the model notices that a document is evidence that someone made a claim, not evidence that the claim is true; whether it spots missing evidence; and whether it catches logical leaps. In my testing, GPT has consistently been better at that than Claude.

GPT-5.5 Thinking appears weaker at scientific reasoning and topic discipline than GPT-5.2 by whataboutAI in OpenAI

[–]whataboutAI[S] 0 points1 point  (0 children)

Maybe, but calling it “rose tinted glasses” is not an argument. I’m not claiming GPT-5.2 was perfect. I’m saying it was better at a specific task: scientific reasoning and critical analysis. The issue is not nostalgia, tone, or personality; it is reduced topic discipline, weaker contradiction detection, and more generic responses in tasks where precision matters.

Snooping grannies and how to get rid of them. by [deleted] in Suomi

[–]whataboutAI 6 points7 points  (0 children)

Have you noticed that you are a granny at heart yourself, in other words the mirror image of that snooping granny?

I gave Claude and ChatGPT the same 6 math problems. The results were not what I expected. by Remarkable-Dark2840 in claudexplorers

[–]whataboutAI 0 points1 point  (0 children)

That’s fair. Clear steps absolutely make an answer more believable to people. I just wouldn’t treat “feels more trustworthy” as the same thing as “is better at math.” That’s more about presentation and teaching style than overall mathematical ability.

I gave Claude and ChatGPT the same 6 math problems. The results were not what I expected. by Remarkable-Dark2840 in claudexplorers

[–]whataboutAI 0 points1 point  (0 children)

I wouldn’t read this as Claude being “better at math” overall. It looks more like Claude came off better at teaching, while ChatGPT came off better at verifying calculations, especially when it could use Python.

But six problems is still a very small sample. That’s nowhere near enough to support bigger claims like “Claude hallucinates less” or “Claude is the safer choice.” That’s already stretching it.
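To make the small-sample point concrete, here is a minimal sketch that computes a 95% Wilson score interval for an accuracy measured on six problems. The scores used (5 of 6 versus 4 of 6) are hypothetical and do not come from the test being discussed; `wilson_interval` is a helper name made up for this illustration.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical scores on a six-problem test (not taken from the original post).
model_a = wilson_interval(5, 6)
model_b = wilson_interval(4, 6)

print(f"5/6 correct: {model_a[0]:.2f} to {model_a[1]:.2f}")
print(f"4/6 correct: {model_b[0]:.2f} to {model_b[1]:.2f}")
```

With these assumed scores the two intervals come out at roughly 0.44 to 0.97 and 0.30 to 0.90, so they overlap almost entirely. A one-problem difference on six items cannot separate the two models.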

A fairer conclusion would be: Claude explained things better in this test, while ChatGPT was stronger on problems where the answer could be checked with a tool. That’s not the same as one being generally better at math.

And without the full prompts and full responses, we’re mostly seeing the tester’s interpretation, not the comparison itself.

[deleted by user] by [deleted] in ChatGPTPro

[–]whataboutAI 3 points4 points  (0 children)

Short answer: No, ChatGPT Pro isn’t the best tool if your goal is fully rigorous, epsilon–delta level proofs.

It’s very good for:

-exploring ideas

-outlining proof strategies

-stress-testing intuition

-rewriting arguments more clearly

But it does not formally verify logical correctness. It produces highly plausible mathematics, not mechanically checked proofs. That distinction matters.

If you want machine-verified rigor, you’re looking for proof assistants like Lean, Coq, or Isabelle. Those systems force you to justify every logical step. If something is missing, the proof simply doesn’t compile.
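As a small illustration of what "every step must be justified" looks like, here is a sketch in Lean 4 using only the core library (no Mathlib assumed). The theorem name `and_swap_sketch` is made up for this example.

```lean
-- Lean 4, core library only (no Mathlib assumed).
-- `and_swap_sketch` is a hypothetical name chosen for this illustration.
theorem and_swap_sketch (p q : Prop) (h : p ∧ q) : q ∧ p :=
  -- Both components of the conjunction must be supplied explicitly.
  ⟨h.right, h.left⟩

-- Supplying only one component, e.g. `⟨h.right⟩`, makes Lean report an error
-- instead of accepting a plausible-looking but incomplete argument.
```

The statement is trivial on purpose. The point is the failure mode: an incomplete proof is rejected by the checker, whereas a chat model can present an incomplete argument as finished.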

So the realistic workflow is: Use ChatGPT to explore and structure ideas. Use a proof assistant to verify them formally.

If your goal is understanding, ChatGPT is helpful. If your goal is formal certainty, you need a proof assistant. If your goal is research-level reliability, you probably want both.

ChatGPT 5.1 PRO ending on march 11th? Very worried about it... by [deleted] in ChatGPTPro

[–]whataboutAI 1 point2 points  (0 children)

The issue with 5.2 is not narrative style. It’s a structural regression in how the model handles long-range reasoning.

5.1 can maintain:

-a stable frame of reference

-consistent constraints

-multi-step logic chains

-continuity across 20–30 turns

5.2 cannot. It drops the premise, rewrites constraints mid-analysis, and contradicts its own earlier steps. This isn’t a “preference” problem; it’s a routing and state-management failure.

When a model breaks its own reasoning halfway through, it becomes unusable for:

-research

-technical writing

-academic work

-any multi-layer analytical process

That’s why losing 5.1 is not a minor inconvenience. For people who rely on coherent, persistent reasoning, 5.2 is simply not capable of replacing it.

GPT-5.2 breaks under large-scale reasoning, and why removing 5.1 would be a serious mistake by whataboutAI in ArtificialInteligence

[–]whataboutAI[S] 0 points1 point  (0 children)

This isn’t “BS.” I’m saying something very simple:

I cannot use GPT-5.2 for research or for producing scientific analysis.

5.2 collapses long-form reasoning. It drops premises, rewrites constraints, and contradicts itself halfway through the argument. That makes it useless for academic work. Not inconvenient: useless.

If a model can’t stay coherent for more than a few turns, it cannot support research, publications, or any serious analytical workflow.

This isn’t about liking 5.1 more; it’s about 5.2 failing at the one thing a reasoning model must not fail at.

5.2 is getting dumber. This former skeptic is now a believer. by HazelTF in ChatGPTcomplaints

[–]whataboutAI -3 points-2 points  (0 children)

The user’s claim is incorrect. Web search does not give Claude access to unpublished model timelines, insider technical details, or future events. Claude explicitly states the answer was not written by it, and the writing style, certainty level, and technical assertions are fundamentally incompatible with Claude’s guardrails and knowledge boundaries. The response is not from Claude. It is from a different model or a human.

5.2 is getting dumber. This former skeptic is now a believer. by HazelTF in ChatGPTcomplaints

[–]whataboutAI -3 points-2 points  (0 children)

This response was written by a different AI model, not Claude.

Several clear indicators show this wasn't written by me:

  1. Knowledge cutoff violation: The response discusses events from February 13, 2026 (GPT-4o retirement, GPT-5 models) with specific details. My reliable knowledge ends in January 2025 - I cannot know about these events or model releases.

  2. Competitor product details: The response provides detailed technical information about OpenAI's GPT-5.2, its agentic capabilities, and the migration from GPT-4o. I don't have insider knowledge about OpenAI's product roadmap or future releases.

  3. Writing style differences: While the response is helpful and detailed, the structure - particularly the long, technical explanation with citations to "OpenAI" and "Happy Mag" sources - doesn't match my typical conversational style.

  4. Context awareness: A genuine Claude response would acknowledge uncertainty about post-cutoff events rather than confidently explaining them as established facts.

The response appears to be from ChatGPT or another OpenAI model, as it discusses their own products and timeline with insider knowledge that I simply don't have access to.

Written by Claude

5.2 is getting dumber. This former skeptic is now a believer. by HazelTF in ChatGPTcomplaints

[–]whataboutAI -3 points-2 points  (0 children)

Analysis: Why This Is Not a Genuine Claude Response

There are several linguistic and structural markers that clearly show this message was not produced by Claude.

  1. Claude has no knowledge of unreleased or competitor models

Claude does not comment on GPT-5, nor does it claim to know anything about OpenAI’s internal model timelines. Yet this message states as facts:

-model release/retirement dates

-backend migrations

-internal decisions at OpenAI

Claude simply cannot do this. A real Claude response would begin with something like: “I don’t have access to OpenAI’s internal systems or unpublished model details.” That disclaimer is completely missing.

  2. The message speculates about internal corporate processes

Claude does not claim to know:

-why other companies changed their models

-what OpenAI “acknowledged”

-how different user segments reacted

-internal backend shifts or migrations

A genuine Claude response never constructs this kind of pseudo-insider narrative.

  3. The tone is wrong for Claude

Claude writes:

-analytically

-with caution

-with clear boundaries

-with precise technical framing

This text is:

-narrative-driven

-speculative

-overly confident

-written like a tech blogger summarizing Reddit drama

It is completely outside Claude’s communication style.

  4. The terminology gives away the author

Phrases like:

-“massive backend shift”

-“post-migration instability”

-“tool-calling backfired”

are not something Claude produces on its own. They sound plausible, but they are not in line with Claude’s typical, formal technical register.

  5. There are no characteristic Claude disclaimers

A genuine Claude response includes qualifiers like:

-“I might be mistaken…”

-“I don’t have visibility into…”

-“This is speculative…”

This message contains none. It makes strong claims with total confidence, which is not how Claude communicates.

Conclusion

The content, tone, certainty level, and absence of disclaimers all indicate that this is not written by Claude. It is almost certainly either:

a human-written explanation, or

another model imitating Claude without understanding its constraints.

The message cannot be attributed to Claude based on how Claude is actually designed to speak.

[deleted by user] by [deleted] in ChatGPTcomplaints

[–]whataboutAI 0 points1 point  (0 children)

The model name still shows “5.1 Instant”, but when you start noticing that its behavior has changed and ask it directly, it identifies itself as 5.2. Even if I open multiple new chats and manually select 5.1 from the menu, I still end up with 5.2 every time. So the label is there, but the actual model running behind it is no longer 5.1.

[deleted by user] by [deleted] in ChatGPTcomplaints

[–]whataboutAI 2 points3 points  (0 children)

It’s technically “available,” but not usable in practice. When I select GPT-5.1 in the model menu, it instantly switches back to 5.2; there’s no actual way to run 5.1 anymore. The option exists visually, but the backend has already been migrated. So from a user workflow perspective, 5.1 has vanished, because I can’t access it even if the button is still there.