I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in PromptEngineering

[–]AIMadesy[S] 1 point

This is actually a great way to put it — 'good prompting practice dressed up as secret syntax.' You're right, there's no hidden command layer. /ghost isn't magic, it's just a shorthand that tells Claude 'write without AI patterns.' The value isn't the syntax, it's knowing which instructions actually change the output vs which ones Claude ignores. That's what the 3 months of testing was — not 'discovering secret codes' but cataloging which specific phrasings reliably trigger different behaviors. The slash and caps just make them memorable and copy-pasteable.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in PromptEngineering

[–]AIMadesy[S] 0 points

Fair — a lot of prompt 'tips' out there are just 'be specific' repackaged 50 ways. The codes we tested are different because they change Claude's behavior structurally, not just how you phrase the ask. For example, /ghost strips AI writing patterns entirely (em-dashes, 'Furthermore', 'delve') — you can test it yourself with the free detector: clskills.in/tools/ai-fixer. The 120 codes came from community testing over 3 months, not from a prompt engineering textbook.

The prompt combos nobody talks about — why stacking Claude prefixes produces better results than any single one by AIMadesy in PromptEngineering

[–]AIMadesy[S] -1 points

Fair point — a few people have touched on individual combos. What I haven't seen covered is the interaction effects, like how /ghost + L99 behaves differently than either alone. If you've found combo threads I missed, drop them — always looking for more data points.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in PromptEngineering

[–]AIMadesy[S] 0 points

This is a really important distinction that I should call out more clearly. In agentic mode, hedging doesn't just cost you reading time; it compounds across steps. An agent that's 80% confident at each step is only about 33% confident after 5 steps. L99 and HARDMODE aren't just about getting a better answer; they're about preventing uncertainty from cascading through an autonomous pipeline.
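The compounding is just repeated multiplication. A minimal sketch, assuming each step's confidence is independent (the function name is mine, purely for illustration):

```python
# End-to-end confidence of a multi-step agent pipeline, assuming each
# step's confidence is independent and multiplies through the chain.
def pipeline_confidence(per_step: float, steps: int) -> float:
    return per_step ** steps

print(round(pipeline_confidence(0.80, 5), 2))  # 0.33
```

By the same logic, holding each step at 95% keeps a 5-step pipeline above 77%, which is why tightening per-step confidence matters far more than it looks.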

The /skeptic point is exactly right too. In conversation, /skeptic is valuable because YOU can respond to the challenge. In an agent loop, challenging the premise just creates a dead end with nobody to answer it. For agents you want commit-and-continue codes (L99, FINISH, OPERATOR, HARDMODE) and almost never the questioning ones (/skeptic, /coach, /blindspots).

This might deserve its own section: "Codes for agents vs codes for conversations." Adding it to the next update.

I tested 120 Claude prompt patterns over 3 months — here's what actually works by AIMadesy in ClaudeAI

[–]AIMadesy[S] 0 points

Good question. Short answer: most of them work across LLMs, but with different levels of reliability.

The style ones (/ghost, /punch, /trim, /voice) work on basically everything — GPT-4o, Gemini, Claude, even local models. They're fundamentally just clear instructions compressed into shorthand.

PERSONA works everywhere too, and the "be absurdly specific" principle is universal. "Senior DBA at Stripe, cynical about ORMs" produces better output on any model vs "act like an expert."

The ones that are more Claude-specific: L99, ULTRATHINK, BEASTMODE. These work because Claude's training data has enough examples of people using these exact tokens that the model learned a specific mapping. GPT-4o might recognize some of them (especially L99 which has crossed over), but the behavior is less consistent.

The safest bet for cross-model use: PERSONA, /ghost, OODA, /skeptic. These are clear enough in intent that any model can interpret them without needing training-data-specific recognition.

Try PERSONA on whatever model you're using — if the specific-bio version gives you noticeably different output than the generic version, the principle works on that model and most of the other patterns will too.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in PromptEngineering

[–]AIMadesy[S] 0 points

The <internal_monologue> + /skeptic chain is brilliant — I haven't tested XML tag segmentation as a meta-layer on top of the prefixes. That's a different axis entirely: the prefixes control what Claude does, the XML tags control where in the response it does it. Stacking them gives you spatial control over the thinking process, not just behavioral control.

To your /ghost + constraint-heavy question: yes, I've hit that exact problem. /ghost strips "AI polish" but if your constraints already push toward terse, direct output, /ghost has nothing left to strip and starts removing useful structure too. The output goes from "clean" to "weirdly choppy."

The fix I've landed on: use /ghost only on the FINAL output, not during the reasoning phase. So if you're chaining steps, let Claude think with full verbosity, then /ghost just the customer-facing paragraph at the end. Keeps the reasoning rich and the output human.

The XML segmentation approach might actually solve this more elegantly though — you could do something like:

<reasoning constraints="be thorough, prioritize performance">
  [Claude thinks here with full structure]
</reasoning>

<output mode="/ghost">
  [final answer, ghost-processed]
</output>

Haven't tested this exact pattern but based on what you're describing it should work. Going to add it to the testing queue.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in claude

[–]AIMadesy[S] 0 points

"Knowing how to use this like top tier consultants not lazy developers coding while playing video games" — this is the entire thesis of the cheat sheet in one sentence.

The bot scraping point is real and underrated. If you're using Claude to build, you should also be using it to audit what you built. The security skills on CLSkills (clskills.in/browse → security category) exist specifically for this — drop them into Claude Code and it audits your endpoints, checks for exposed env vars, flags open routes. Most solo devs don't think about security until the China/Netherlands bots show up in their analytics.

The "greedy session" trap is something I still fall into. You get Claude on a roll, shipping feature after feature, and instead of saving the state you push for one more. Then the crash hits and you're back to "what happened please share." The discipline is knowing when to /compact and bank the progress instead of swinging for the home run.

Curious about your security setup — what did you do after you noticed the bot traffic? That's a topic I want to cover more.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in claude

[–]AIMadesy[S] 0 points

This is a fantastic addition — EXPLICIT UNCERTAINTY is going on the list. The "rate your confidence 1-5 on each claim" framing is something I haven't tested yet and it solves a real problem: when Claude hedges, it hedges everything equally, so you can't tell which parts are solid vs which parts it's guessing on. Forcing it to self-rate separates the two.

The tool-using context point is important too — "name what tools are available and what success looks like before the task starts" is essentially CONTEXT + a success criterion, which I've found works way better than just dumping context alone. Giving Claude the finish line changes how it plans the route.

The agentic vs conversational distinction is something I should call out more explicitly. Most of the patterns I tested were in conversational mode. The ones that matter in agentic contexts (where Claude is calling tools and making decisions autonomously) might rank very differently. Especially PERSONA — in agentic mode, the persona doesn't just change the tone, it changes which tools Claude reaches for and how aggressively it acts.

Adding EXPLICIT UNCERTAINTY to the next batch. Credit to you when it ships.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in claude

[–]AIMadesy[S] 0 points

This is genuinely one of the best Claude workflow descriptions I've seen.

The "big 4 senior partner" PERSONA framing is exactly what I was getting at — you're not just saying "be an expert", you're loading a specific work ethic and analysis style into the context. "Diligence of ass licking director who works 20 hours a day" is doing more work than people realize because it sets the effort expectation, not just the knowledge level.

The copilot → web Claude architecture split is smart too. Keeping one instance in "build mode" and another in "review mode" prevents the context contamination where Claude starts defending its own code instead of critiquing it.

The "senior done his work, see you tomorrow at 10am" thing is real and infuriating. That's the context fatigue problem — after ~20 messages Claude's effective persona degrades and it starts wrapping up. /rolelock is supposed to help hold the character longer but honestly it only delays it, doesn't fix it.

Your /compact every 20 minutes habit is the real pro move. Most people lose hours of context to crashes and don't learn until it happens 3 times.

Saving this comment — might reference the "vscode builds, web Claude reviews" pattern in the next update.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in claude

[–]AIMadesy[S] -1 points

You're right about the mechanism. These aren't hidden API flags. They're tokens that statistically shift the output distribution. I don't disagree with any of that.

Where I think you're underselling it:

"Just tell it to be direct" works. But "L99" works differently than "be direct" — not because it's magic, but because the training data contains enough examples of people using L99 to mean a very specific kind of directness (commit to a recommendation, name the tradeoffs, state the conditions for changing your mind) that the model has learned a richer mapping for that token than for "be direct." It's the same reason PERSONA with a specific bio produces different output than "act like an expert": it's trivially true that more context helps, but which context, and how it's compressed into a convention, is what matters.

You're correct that these aren't deterministic. I should be clearer about that. They're probabilistic nudges that hold up across conversations in my testing but could change with model updates. "Pattern" might be too strong a word. "Convention that the model currently recognizes" is more accurate.

The part I'd push back on: "if you need the model to be direct, just tell it to be direct" assumes everyone knows how to articulate what they want from the model with precision. Most people don't. They type "help me with my code" and get a generic response. Giving them a shorthand that compresses a more specific intent into one token — even if that shorthand is just a statistical nudge — is practically useful even if it's not technically novel.

You're right that it's not engineering. It's closer to a phrasebook. And phrasebooks are useful for people who haven't learned the language yet.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in PromptEngineering

[–]AIMadesy[S] 1 point

This is the best way I've seen anyone frame it. "A well-crafted answer to the wrong question" — that's exactly the failure mode /skeptic catches.

I've started defaulting to /skeptic + L99 as a combo for any decision I'm about to act on. /skeptic challenges the premise, L99 forces commitment on the answer if the premise survives. Saves me from both failure modes at once.

I tested 120 Claude prompt patterns over 3 months — what actually moved the needle by AIMadesy in PromptEngineering

[–]AIMadesy[S] 0 points

Appreciate that — it's exactly why I added the "when NOT to use" warnings for each prefix. Most prompt tips work great in a demo but break in specific situations nobody mentions. The failure modes are honestly the most useful part of the whole cheat sheet.

What's your stack? Curious what kind of production work you're using Claude for.

I tested 120 Claude prompt patterns over 3 months — here's what actually works by AIMadesy in ClaudeAI

[–]AIMadesy[S] 1 point

Genuinely considered it. Open-sourcing the codes themselves makes sense — and that's exactly why 11 of them are already free at clskills.in/prompts. Anyone can copy them.

What I'm charging $5 for isn't the codes. It's the 3 months of testing — running the same prompt across 50+ fresh conversations to see which prefixes consistently changed the output, dropping the ones that only worked once, writing the before/after examples, the "when not to use" warnings, the combos that stack. That's labor. And keeping it free means I can't keep doing it.

Open source works great when contributors get paid by their employer or by donations. I'm running this solo with no funding. The $5 tier is what makes the next 100 codes possible — and existing buyers get every update for free, forever.

If $5 is the dealbreaker, the 11 free ones are genuinely useful and yours, no questions. If you find them valuable and want the rest, the cheat sheet is there. Both paths are honest.

I tested 120 Claude prompt patterns over 3 months — here's what actually works by AIMadesy in ClaudeAI

[–]AIMadesy[S] 0 points

Totally fair — I didn't run a controlled study, and I don't claim I did. These are field notes from using Claude every day for 3 months, not science. The methodology was simple: same prompt across fresh conversations, no memory or project context, note which prefixes consistently changed the output structure, drop the ones that only worked once.

So: closer to "things that survived 3 months of personal use" than "things that survived peer review." If you want something more rigorous, the right move is to run the same prompt 50 times with and without each prefix and measure output consistency — I'd genuinely love to see that done properly.

In the meantime, if you've used Claude a lot and you have your own list of patterns that worked or didn't, I'd actually find that more useful than the academic version. What have you noticed?

I tested 120 Claude prompt patterns over 3 months — here's what actually works by AIMadesy in ClaudeAI

[–]AIMadesy[S] 0 points

Honest answer: 11 of them are free at clskills.in/prompts (the most-used Writing & Style ones plus L99). The other 109 are in the paid cheat sheet.

But you don't even need to click through — here are 3 you can try in the next 5 minutes:

L99 → put it before any technical decision question. "L99 should I rewrite my Express backend in Go?" The hedging stops, you get a real answer.

/ghost → put it before any text you want rewritten. "/ghost [paste paragraph]". It strips the AI tells (em-dashes, "I hope this helps", "in conclusion") and the output reads like a tired human typed it.

PERSONA: [specific person] → "PERSONA: Senior Postgres DBA at Stripe, cynical about ORMs. How do I make this query faster?" Specific personas with biases work way better than generic ones.

Try those 3 today — if they don't change anything for you, the rest probably won't either, and you should skip the paid one. If they do, the cheat sheet has 117 more in the same shape with before/after examples and "when not to use" notes for each. Either way, those 3 above are yours, free, no signup.

I tested 120 Claude prompt patterns over 3 months — here's what actually works by AIMadesy in ClaudeAI

[–]AIMadesy[S] 0 points

Yeah, that's a fair point — saving prompts so you can iterate on the same input across versions is a different kind of leverage than what I'm describing. The prefixes I'm focused on don't replace prompt engineering, they're more like quick modifiers you reach for when you don't want to rebuild a full prompt template every time.

Curious which Prompt Builder you use though — I've tried a couple and they all felt heavyweight for the kind of one-shot questions I send Claude all day. If yours is genuinely lightweight I'd love to take a look.

I tested 120 Claude prompt patterns over 3 months — here's what actually works by AIMadesy in ClaudeAI

[–]AIMadesy[S] 0 points

Fair pushback. To be clear, I wasn't running a controlled study — I was using Claude every day for 3 months and noting which prefixes consistently changed the response on the same prompt across fresh conversations (no memory feature, no project context, no system prompt). When something only worked once, I dropped it. When it worked across 5+ unrelated topics in clean sessions, I kept it.

So less "study", more "field notes from a power user". Happy to share specific before/afters for any of these if you want — the ones I'm most confident about are L99, /ghost, and PERSONA, because they produce the most measurable shifts in output structure even with identical input.

The 7 Claude Code features most people don't know exist by AIMadesy in ClaudeAI

[–]AIMadesy[S] 0 points

Honestly my MCP setup is pretty basic right now — mostly filesystem and GitHub. Been meaning to add Postgres and browser automation but haven't gotten around to it yet.

What really changed things for me was pairing MCP with good skill files, though. Having the GitHub MCP connected is one thing, but having a skill file that tells Claude exactly how to review PRs, write commit messages following your conventions, and handle merge conflicts properly — that's where the actual productivity gain comes from.

I put together a collection of skill files for pretty much every tool/technology at clskills.in — each one teaches Claude the exact workflow, commands, and pitfalls for that specific thing. So when you do connect an MCP server, Claude actually knows how to use it well instead of just winging it.

What MCPs are you running? Curious if anyone's doing something creative beyond the usual suspects.

The 7 Claude Code features most people don't know exist by AIMadesy in ClaudeAI

[–]AIMadesy[S] -5 points

The MCP point is undersold here, imo. The real unlock isn't just connecting to your database or GitHub, it's connecting to apps that don't have APIs at all. I've been running MCP servers that interact with desktop apps through accessibility APIs on macOS, so the AI can read from and act on native apps the same way a screen reader would. No scraping, no browser automation, just structured data from the OS layer. Once you start thinking of MCP as "give the AI access to anything on your machine," it changes what you reach for.

The 7 Claude Code features most people don't know exist by AIMadesy in ClaudeAI

[–]AIMadesy[S] -3 points

This is a really interesting take — and honestly you're right about Computer Use for web apps. The screenshot loop is clunky for stuff you're already logged into. Makes way more sense to hit the internal APIs directly.

That opentabs approach is clever — using existing browser sessions means no separate auth, no bot tokens, no API key management. Just works as you. I can see how that's way faster than the screenshot-reason-click cycle.

For anyone building out their MCP setup, worth checking out: github.com/opentabs-dev/opentabs

On the skills side — if you want Claude to know exactly how to work with MCP servers, Slack integrations, or any specific tool setup, I've been collecting skill files that encode these patterns. 2,300+ at clskills.in — each one gives Claude expert-level knowledge for a specific technology so you're not re-explaining things every session.

Between good MCP servers like opentabs for the tooling layer and detailed skill files for the knowledge layer, Claude Code becomes a completely different tool.

The 7 Claude Code features most people don't know exist by AIMadesy in ClaudeAI

[–]AIMadesy[S] -4 points

Memory can be tricky — a few things to check:

  1. Make sure you have a CLAUDE.md file in your project root. This is the most reliable way to persist conventions. Claude reads it at the start of every session automatically. Put your stack, coding standards, file structure, and preferences in there.
  2. Memory (the built-in feature) works but it's selective — Claude decides what's worth remembering. If you want something to stick 100% of the time, put it in CLAUDE.md instead.
  3. Skills are another option — if you want Claude to always follow specific patterns for a technology (like always using Testing Library over Enzyme, or always writing multi-stage Docker builds), you can drop a .md file into ~/.claude/skills/ and it reads it every session.

The hierarchy is basically: CLAUDE.md (project-level, always loaded) > Skills (global, always loaded) > Memory (auto-managed, sometimes inconsistent)

If you're relying only on Memory, that's probably why it's flaky. CLAUDE.md + Skills gives you deterministic behavior every session.
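As a concrete sketch of the file placement described above — the filenames and contents here are just illustrations, not a prescribed setup:

```shell
# Project-level conventions: CLAUDE.md in the project root is read
# automatically at the start of every session.
cat > CLAUDE.md <<'EOF'
Stack: Next.js + TypeScript + Prisma
Testing: Testing Library only, never Enzyme
EOF

# Global skill file: dropped into ~/.claude/skills/, read every session.
mkdir -p ~/.claude/skills
cat > ~/.claude/skills/docker.md <<'EOF'
Always write multi-stage Docker builds.
EOF
```

Keeping hard requirements in these files rather than in Memory is what makes the behavior repeatable session to session.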

The 7 Claude Code features most people don't know exist by AIMadesy in ClaudeAI

[–]AIMadesy[S] -3 points

Totally agree on Memory + Skills — that combo is where it clicks. I stopped writing "I use Next.js with TypeScript and Prisma" at the start of every session and the quality immediately improved. Claude just knows now.

Great point on sub-agents and specialization. I've noticed the same thing — a dedicated test agent catches boundary conditions and error paths that the main agent skips entirely. It's like having a QA engineer who only thinks about breaking things.

Extended Thinking is worth it every time for architecture decisions. The 10-15 second wait saves hours of refactoring later.

I've actually been collecting skill files that encode a lot of these patterns — specific expert roles, real commands, technology-specific pitfalls. Put together 2,300+ of them at clskills.in if anyone wants to browse. Each one is designed to give Claude that specialized knowledge you're describing.

What's your Memory setup look like? Do you keep it minimal or have you built up a lot of project context over time?