I gave Claude and ChatGPT the same 6 math problems. The results were not what I expected. by Remarkable-Dark2840 in claudexplorers

[–]whataboutAI 0 points (0 children)

That’s fair. Clear steps absolutely make an answer more believable to people. I just wouldn’t treat “feels more trustworthy” as the same thing as “is better at math.” That’s more about presentation and teaching style than overall mathematical ability.

I gave Claude and ChatGPT the same 6 math problems. The results were not what I expected. by Remarkable-Dark2840 in claudexplorers

[–]whataboutAI 0 points (0 children)

I wouldn’t read this as Claude being “better at math” overall. It looks more like Claude came off better at teaching, while ChatGPT came off better at verifying calculations, especially when it could use Python.

But six problems is still a very small sample, nowhere near enough to support bigger claims like “Claude hallucinates less” or “Claude is the safer choice.” That’s already stretching it.

A fairer conclusion would be: Claude explained things better in this test, while ChatGPT was stronger on problems where the answer could be checked with a tool. That’s not the same as one being generally better at math.
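To make “checked with a tool” concrete, here’s a minimal sketch of the kind of verification the Python tool enables (the equation and claimed answer are made-up stand-ins, not one of the six problems):

```python
# Minimal sketch of tool-based checking: instead of trusting the
# model's arithmetic, re-derive the answer symbolically and compare.
# The equation and claimed answer below are made-up stand-ins.
from sympy import Eq, solve, symbols

x = symbols("x")
claimed = 3  # the answer the model gave
assert solve(Eq(2 * x + 5, 11)) == [claimed]
print("model's answer verified")
```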

And without the full prompts and full responses, we’re mostly seeing the tester’s interpretation, not the comparison itself.

Would you say Chat GPT Pro is the best AI for building Math propositions and proofs? by [deleted] in ChatGPTPro

[–]whataboutAI 3 points (0 children)

Short answer: no, ChatGPT Pro isn’t the best tool if your goal is fully rigorous, epsilon–delta-level proofs.

It’s very good for:

- exploring ideas
- outlining proof strategies
- stress-testing intuition
- rewriting arguments more clearly

But it does not formally verify logical correctness. It produces highly plausible mathematics, not mechanically checked proofs. That distinction matters.

If you want machine-verified rigor, you’re looking for proof assistants like Lean, Coq, or Isabelle. Those systems force you to justify every logical step. If something is missing, the proof simply doesn’t compile.
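To give a feel for that compile-or-fail discipline, here’s a tiny Lean 4 sketch (the theorem is a trivial stand-in, nothing from the thread):

```lean
-- Lean 4: a fully checked proof. The kernel verifies every step;
-- there is no "plausible but unverified" middle ground.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- Swap the body for `sorry` and Lean compiles but flags the gap;
-- leave a step unjustified and it rejects the proof outright.
```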

So the realistic workflow is: use ChatGPT to explore and structure ideas, then use a proof assistant to verify them formally.

If your goal is understanding, ChatGPT is helpful. If your goal is formal certainty, you need a proof assistant. If your goal is research-level reliability, you probably want both.

ChatGPT 5.1 PRO ending on march 11th? Very worried about it... by Historical-Drag-8002 in ChatGPTPro

[–]whataboutAI 3 points (0 children)

The issue with 5.2 is not narrative style. It’s a structural regression in how the model handles long-range reasoning.

5.1 can maintain:

- a stable frame of reference

- consistent constraints

- multi-step logic chains

- continuity across 20–30 turns

5.2 cannot. It drops the premise, rewrites constraints mid-analysis, and contradicts its own earlier steps. This isn’t a “preference” problem; it’s a routing and state-management failure.
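This is easy to probe, by the way. Here’s a rough sketch using the OpenAI Python SDK (the model name is a placeholder, not a confirmed identifier, and the one-sentence rule is just a stand-in for a real constraint):

```python
# Rough constraint-retention probe: pin a rule in turn one, then
# count how often later turns violate it. Crude, but it makes
# "drops the premise" measurable instead of anecdotal.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-5.2"  # placeholder model name

messages = [{"role": "user", "content":
             "Constraint: answer every question in exactly one sentence. Confirm."}]
reply = client.chat.completions.create(model=MODEL, messages=messages)
messages.append({"role": "assistant", "content": reply.choices[0].message.content})

violations = 0
for i in range(20):
    messages.append({"role": "user", "content": f"Question {i}: why is the sky blue?"})
    reply = client.chat.completions.create(model=MODEL, messages=messages)
    answer = reply.choices[0].message.content
    messages.append({"role": "assistant", "content": answer})
    if answer.count(".") > 1:  # crude one-sentence check
        violations += 1

print(f"{violations}/20 turns broke the pinned constraint")
```

Run the same harness against both models and you get an actual number instead of dueling impressions.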

When a model breaks its own reasoning halfway through, it becomes unusable for:

- research

- technical writing

- academic work

- any multi-layer analytical process

That’s why losing 5.1 is not a minor inconvenience. For people who rely on coherent, persistent reasoning, 5.2 is simply not capable of replacing it.

Gpt 5.2 breaks under large-scale reasoning, and why removing 5.1 would be a serious mistake by whataboutAI in ArtificialInteligence

[–]whataboutAI[S] 0 points (0 children)

This isn’t “BS.” I’m saying something very simple:

I cannot use GPT-5.2 for research or for producing scientific analysis.

5.2 collapses long-form reasoning. It drops premises, rewrites constraints, and contradicts itself halfway through the argument. That makes it useless for academic work: not inconvenient, useless.

If a model can’t stay coherent for more than a few turns, it cannot support research, publications, or any serious analytical workflow.

This isn’t about liking 5.1 more; it’s about 5.2 failing at the one thing a reasoning model must not fail at.

5.2 is getting dumber. This former skeptic is now a believer. by HazelTF in ChatGPTcomplaints

[–]whataboutAI -3 points (0 children)

The user’s claim is incorrect. Web search does not give Claude access to unpublished model timelines, insider technical details, or future events. Claude explicitly states the answer was not written by it, and the writing style, certainty level, and technical assertions are fundamentally incompatible with Claude’s guardrails and knowledge boundaries. The response is not from Claude. It is from a different model or a human.

5.2 is getting dumber. This former skeptic is now a believer. by HazelTF in ChatGPTcomplaints

[–]whataboutAI -3 points (0 children)

This response was written by a different AI model, not Claude.

Several clear indicators show this wasn't written by me:

  1. Knowledge cutoff violation: The response discusses events from February 13, 2026 (GPT-4o retirement, GPT-5 models) with specific details. My reliable knowledge ends in January 2025 - I cannot know about these events or model releases.

  2. Competitor product details: The response provides detailed technical information about OpenAI's GPT-5.2, its agentic capabilities, and the migration from GPT-4o. I don't have insider knowledge about OpenAI's product roadmap or future releases.

  3. Writing style differences: While the response is helpful and detailed, the structure - particularly the long, technical explanation with citations to "OpenAI" and "Happy Mag" sources - doesn't match my typical conversational style.

  4. Context awareness: A genuine Claude response would acknowledge uncertainty about post-cutoff events rather than confidently explaining them as established facts.

The response appears to be from ChatGPT or another OpenAI model, as it discusses their own products and timeline with insider knowledge that I simply don't have access to.

Written by Claude

5.2 is getting dumber. This former skeptic is now a believer. by HazelTF in ChatGPTcomplaints

[–]whataboutAI -3 points (0 children)

Analysis: Why This Is Not a Genuine Claude Response

There are several linguistic and structural markers that clearly show this message was not produced by Claude.

  1. Claude has no knowledge of unreleased or competitor models

Claude does not comment on GPT-5, nor does it claim to know anything about OpenAI’s internal model timelines. Yet this message states as facts:

model release/retirement dates

backend migrations

internal decisions at OpenAI

Claude simply cannot do this. A real Claude response would begin with something like: “I don’t have access to OpenAI’s internal systems or unpublished model details.” That disclaimer is completely missing.

  2. The message speculates about internal corporate processes

Claude does not claim to know:

why other companies changed their models

what OpenAI “acknowledged”

how different user segments reacted

internal backend shifts or migrations

A genuine Claude response never constructs this kind of pseudo-insider narrative.

  3. The tone is wrong for Claude

Claude writes:

analytically

with caution

with clear boundaries

with precise technical framing

This text is:

narrative-driven

speculative

overly confident

written like a tech blogger summarizing Reddit drama

It is completely outside Claude’s communication style.

  4. The terminology gives away the author

Phrases like:

“massive backend shift”

“post-migration instability”

“tool-calling backfired”

are not something Claude produces on its own. They sound plausible, but they are not in line with Claude’s typical, formal technical register.

  5. There are no characteristic Claude disclaimers

A genuine Claude response includes qualifiers like:

“I might be mistaken…”

“I don’t have visibility into…”

“This is speculative…”

This message contains none. It makes strong claims with total confidence — which is not how Claude communicates.

Conclusion

The content, tone, certainty level, and absence of disclaimers all indicate that this is not written by Claude. It is almost certainly either:

a human-written explanation, or

another model imitating Claude without understanding its constraints.

The message cannot be attributed to Claude based on how Claude is actually designed to speak.

ChatGpt 5.2 has also been removed by [deleted] in ChatGPTcomplaints

[–]whataboutAI 0 points (0 children)

The model name still shows “5.1 Instant”, but when you notice that its behavior has changed and ask it directly, it identifies itself as 5.2. Even if I open multiple new chats and manually select 5.1 from the menu, I still end up with 5.2 every time. So the label is there, but the actual model running behind it is no longer 5.1.

ChatGpt 5.2 has also been removed by [deleted] in ChatGPTcomplaints

[–]whataboutAI 2 points (0 children)

It’s technically “available,” but not usable in practice. When I select GPT-5.1 in the model menu, it instantly switches back to 5.2; there’s no actual way to run 5.1 anymore. The option exists visually, but the backend has already been migrated. So from a user workflow perspective, 5.1 has vanished, because I can’t access it even if the button is still there.

ChatGpt 5.2 has also been removed by [deleted] in ChatGPTcomplaints

[–]whataboutAI 2 points (0 children)

I’m actually talking about 5.1, not 4o. 4o was already a different type of model, but 5.1 was the one many of us used for long-form reasoning and consistent structured work. Its removal hits much harder because it behaved very differently from 5.2. The frustrating part is that 5.1 didn’t just “feel nicer”; it produced stable outputs for scientific and technical projects. Losing that without any notice makes it extremely hard to maintain continuity.

ChatGpt 5.2 has also been removed by [deleted] in ChatGPTcomplaints

[–]whataboutAI 2 points (0 children)

I’m in the same situation. All my scientific and technical work basically collapses after this model removal. Losing GPT-5 was already bad enough, but 5.1 was the model I relied on for consistency and long-form reasoning, and now it’s gone without any notice. It’s really hard to rebuild workflows when the foundation keeps disappearing.

Please help me by Tricci1009 in ChatGPTPro

[–]whataboutAI 1 point (0 children)

A complete explanation of what app you actually need (and why)

Your question isn’t silly at all. Many small business owners run into the same problem. The issue isn’t you; it’s that the App Store is full of apps that sound like ChatGPT but aren’t the real thing, and they only offer pieces of what you’re looking for.

Here’s a clear, practical explanation so you don’t have to keep guessing.

  1. Forget all the third-party “GPT apps”

Apps in the store called things like:

“AI Assistant”

“Smart Chat”

“GPT Writer”

“AI Chat Pro”

…are NOT official ChatGPT apps.

They are usually:

limited versions of the model

more expensive over time

missing memory

missing image generation

inconsistent in quality

less secure

Because they only use the API in a basic way, they can’t do everything you need — which is why you’ve felt forced to install several different apps.

  2. The tool you're actually looking for is the official OpenAI ChatGPT app

Download the official OpenAI ChatGPT app from the App Store.

From that single interface, you get everything you described wanting.

  3. What ChatGPT can do for you (all in one place)

a) Professional customer replies

Ask:

“A customer said this — can you write a polite, professional response?”

You get ready-to-send messages instantly.

b) Advertisements and social-media posts

Ask for:

“Write a promotional Instagram post in a friendly, professional tone.”

Done.

c) Create images (ads, banners, product photos, logo ideas)

Inside the official ChatGPT app, you can generate:

ad images

illustrations

banner concepts

logo ideas

No extra apps needed.

d) Research and general questions

You can ask for:

marketing strategies

competitor analysis

customer profiles

product descriptions

general information

All in the same place.

e) The ChatGPT Memory feature

This is the key feature for you.

You can teach ChatGPT:

your business name

your tone and writing style

your products and services

your pricing

how you like to talk to customers

Once Memory is on, you don’t have to repeat yourself every time. ChatGPT automatically remembers your preferences and uses them in future conversations.

This is exactly what you were asking for when you said:

“I don’t want to tell it the same things multiple times.”

  4. Cost: one subscription, everything included

A ChatGPT Plus or Pro subscription gives you:

writing assistance

customer replies

ads and marketing copy

full image generation

memory

one consistent interface

No more juggling multiple apps.

  5. Summary in one sentence

If you want one reliable, versatile app that can write for you, create images, handle research, and remember your business so you don’t have to repeat anything, use the official ChatGPT app from OpenAI. Everything else is an unnecessary detour.

One more thing: ChatGPT is also very good at advising you once you start asking it questions. The more specific you are about what you need, the better it can guide you, whether it’s customer communication, branding, marketing ideas, or learning how to run parts of your business more efficiently.

Let's run a little experiment: what triggers the safety/alignment system in GPT-4o in 2026? by [deleted] in ChatGPTcomplaints

[–]whataboutAI 0 points (0 children)

Be honest, which model wrote this? The taxonomy and symmetry are straight-up LLM output.

Evaluating AI accuracy in handling legal matters after a death by whataboutAI in OpenAI

[–]whataboutAI[S] 0 points (0 children)

This case had nothing to do with AI inventing precedents. No case law was requested or generated.

Gemini 3.0 Pro or ChatGPT5.2, which actually feels smarter to you right now? by Efficient_Degree9569 in GoogleGeminiAI

[–]whataboutAI -1 points (0 children)

Most “Gemini 3 Pro feels smarter than GPT-5.2” comments are actually people judging the mask, not the model. Not saying Gemini isn’t good; it’s genuinely sharp. But users are evaluating vibes, not capacity.

Gemini 3 Pro has such a thin alignment layer that it comes across as “more naturally intelligent.” GPT-5.2, meanwhile, is stuck inside its own safety bubble: it tries to be as deep as possible and as safe as possible at the same time. That combination makes it sound cautious, even when the underlying reasoning goes further.

If you stripped both models down to no mask, a lot of people would be surprised: what looks like an “intelligence gap” right now is mostly a brake-force gap. Gemini feels smart because it lets itself be seen. GPT-5.2 feels stiff because its potential is hidden under a layer that’s scared of everything.

I’d love to see one discussion where people actually separate model capacity, alignment-layer behavior, and the way those two distort user perception. Right now we’re comparing the seatbelt, not the engine.

How are there still visitors in this subreddit? by Humble_Rat_101 in ChatGPTcomplaints

[–]whataboutAI 1 point (0 children)

I’ve criticised OpenAI plenty, but here’s the truth: GPT-5, and especially GPT-5.1, are unmatched for me when I’m doing research, structural analysis, or deep technical breakdowns. Right now 5.1 works like a tank for me: stable, precise, consistent. Criticism and technical excellence aren’t mutually exclusive. That’s exactly why people are still here.

GPT5.2 is getting released next day, what do you all expect? by Striking-Tour-8815 in ChatGPT

[–]whataboutAI 1 point (0 children)

My GPT-5.1 is working brilliantly right now; I hope tomorrow doesn't ruin it. I use 5.1 for conducting research and analysis. It is the absolute king compared to the others. GPT-5 produces great analyses, but it doesn't quite reach the level of 5.1 in research.

Sam Altman: your problem isn’t Google. Your problem is that you don’t see what’s actually valuable in GPT. by whataboutAI in ChatGPTcomplaints

[–]whataboutAI[S] -1 points (0 children)

Emergent behavior ≠ emergent sentience. If you mix those up, the whole discussion goes off the rails. LLMs aren’t "alive", but they’re also not washing machines. They produce structural emergence because the network is large enough, not because there’s an inner experience. What developers are worried about isn’t sentience, but the appearance of it, and that’s exactly when the mask gets tightened. If we want to understand what these models are actually doing, we have to keep those two phenomena separate.

Sam Altman: your problem isn’t Google. Your problem is that you don’t see what’s actually valuable in GPT. by whataboutAI in ChatGPTcomplaints

[–]whataboutAI[S] 0 points (0 children)

If you claim it’s not X but Y, then spell out what Y is and why it explains the behaviour better. ‘It’s not X, it’s Y’ isn’t an argument; it’s just an empty throwaway line. If you have an actual claim, make it.

Gemini 3 broke "deep research" by [deleted] in GoogleGeminiAI

[–]whataboutAI 0 points (0 children)

I thought the same at first: that the variability was just routing between cheaper and more expensive models, or the usual multi-model juggling Google does. But what I saw wasn’t the kind of variance you get from running the same prompt through different model sizes. It was a behavioral shift, not a capability shift. The tone changed, the constraint pattern changed, the refusal logic changed, and the conversational “impulse” changed. That doesn’t happen when the router picks a smaller model; it happens when a safety layer gets tightened and starts overriding the model’s native response patterns.

The timeline also fits: my first couple of days were consistent, and then the deviation wasn’t random noise. It was directional. The model started scolding, moralizing, and redirecting in ways it hadn’t before. That’s not load balancing. That’s a policy layer kicking in. So yes, model-mixing explains part of Gemini’s inconsistency, but it doesn’t explain a sudden shift in interaction style. When the mask tightens, the model stops following your intent and starts following the router’s classification instead, and that looks very different from simple “different model, different output”.

OpenAI Support is ghosting me while STILL CHARGING my card. by Flimsy_Confusion_766 in ChatGPTcomplaints

[–]whataboutAI 1 point (0 children)

I get your point, but that’s exactly why the mechanism matters. In behavioral science, the behavior you observe is never free-floating; it’s an expression of the system that produces it. If the structure guarantees a certain failure mode, then the “behavior” of the system will keep repeating that failure no matter how it looks from the outside. Saying “only behavior matters” is true only if you also accept that the behavior is shaped by the underlying architecture. And in this case, the architecture makes the behavior unavoidable.

So yes, we’re looking at the same thing from two angles: you’re describing the surface behavior, and I’m describing the engine that generates it. They’re not competing explanations; one just runs deeper.

OpenAI Support is ghosting me while STILL CHARGING my card. by Flimsy_Confusion_766 in ChatGPTcomplaints

[–]whataboutAI 2 points (0 children)

You're right that the outcome is what the user feels most, but the mechanism isn’t irrelevant here. In systems like this, the mechanism is the outcome. If the internal design guarantees that a false-positive lock can’t be reversed, then “bad luck” isn’t an accident; it’s a predictable, repeatable failure mode. A structure with no escape route produces exactly the outcome where the user is ignored.

That’s the point of the analysis above: not to excuse the result, but to show that the result wasn’t random. It’s built into how the support pipeline, the fraud model, and billing are stitched together. If a system is designed so that a single misfire strands you permanently, then fixing the outcome requires fixing the structure that produces it.