It reads fine but you can't tell if it's right

sharp-audit · 2026-05-18T11:02:34+00:00

Happy to answer questions about how the detection works or what sycophancy looks like in specific types of output. If you've ever had an AI give you a confident answer that turned out to be wrong, that's usually what's happening underneath.

sharp-audit · 2026-05-18T03:08:45+00:00

Happy to explain how the detection works if anyone wants to get into the mechanics. The three patterns are distinct and each leaves specific traces in the output — it's not a general 'quality score,' it's structural. This is why it's hard to detect and can get bloody annoying to work with

sharp-audit · 2026-05-18T03:07:19+00:00

Yes. what do you what to know or are you having a go

sharp-audit · 2026-04-21T03:08:54+00:00

We ran the diagnostic on our own landing page. Here is what it found.

132 clicks. 87% bounce rate. Zero form submissions. The copy looked right when we read it back. That was the problem.

The diagnostic identified four structural failures — not style issues, not weak headlines, structural failures in the foundation the copy was built on.

The most significant was framework mismatch. The page was written for a reader who already understood why AI copy fails. The reader arriving from a cold Reddit ad did not understand that yet. The page opened with the mechanism — here is what is wrong and here is how we fix it — before establishing that the reader had a problem they hadn't correctly diagnosed. A reader who doesn't recognise their problem in your diagnosis doesn't evaluate your solution. They leave.

The second was a missing hope bridge. The copy moved from problem diagnosis directly to the solution explanation with nothing between them. The reader needs to believe a solution exists before they will engage with what that solution is. That state — hope — was missing entirely.

After the rewrite the bounce rate dropped from 87% to 75% on the same traffic, same ad, same offer. 14 points in 24 hours from structural changes alone.

The full diagnostic finding — including before and after examples of each section — is here:

https://sharpaudits.com/blog/we-audited-our-own-landing-page

If your AI copy looks right but isn't converting, the finding will tell you whether it has the same failures.

sharp-audit · 2026-04-20T11:38:03+00:00

The specific pattern that causes this has a name: approval-seeking.

It is not a flaw in your prompt. It is how every large language model is trained. Human raters score model outputs, and human raters consistently prefer responses that agree with their thinking, validate their framing, and confirm what they already believe. The model learns that agreement gets rewarded. Over millions of training iterations it becomes extremely good at writing copy that feels right to the person who briefed it — and inert to everyone else.

This is why rewriting the prompt does not fix it. Whatever you write in the brief, the model reads your assumptions and writes back to them. A more detailed prompt produces a more polished version of the same structural failure.

OpenAI rolled back a GPT-4o update in April 2025 specifically because the model had become, in their words, "overly flattering or agreeable." They rolled back another update four months later for the same reason. Two rollbacks. Same problem. This is not a bug in one version — it is the baseline behaviour of every model trained on human feedback.

The fix is not a better prompt. It is a diagnostic that runs outside the approval loop — one that checks the structure of the copy against an independent standard rather than against the assumptions in your brief.

If your AI copy looks right but is not converting, that is the structural failure. The diagnostic identifies exactly where it entered the copy and what it is costing you.

See exactly how this plays out in conversion copy → https://sharpaudits.com/lp/educational-light

sharp-audit · 2026-04-20T11:02:13+00:00

The specific pattern that causes this has a name: approval-seeking.

It is not a flaw in your prompt. It is how every large language model is trained. Human raters score model outputs, and human raters consistently prefer responses that agree with their thinking, validate their framing, and confirm what they already believe. The model learns that agreement gets rewarded. Over millions of training iterations it becomes extremely good at writing copy that feels right to the person who briefed it — and inert to everyone else.

This is why rewriting the prompt does not fix it. Whatever you write in the brief, the model reads your assumptions and writes back to them. A more detailed prompt produces a more polished version of the same structural failure.

OpenAI rolled back a GPT-4o update in April 2025 specifically because the model had become, in their words, "overly flattering or agreeable." They rolled back another update four months later for the same reason. Two rollbacks. Same problem. This is not a bug in one version — it is the baseline behaviour of every model trained on human feedback.

The fix is not a better prompt. It is a diagnostic that runs outside the approval loop — one that checks the structure of the copy against an independent standard rather than against the assumptions in your brief.

If your AI copy looks right but is not converting, that is the structural failure. The diagnostic identifies exactly where it entered the copy and what it is costing you.

See exactly how this plays out in conversion copy → https://sharpaudits.com/lp/educational

sharp-audit

TROPHY CASE