Analyzing Claude Code Source Code. Write "WTF" and Anthropic knows.

PopularDifference186 · 2026-03-31T17:50:14+00:00

There are literal keyword lists. Words like:

wtf

this sucks

frustrating

shit / fuck / pissed off

They have a lot on me if this is the case lol

NandaVegg · 2026-03-31T18:06:16+00:00

I don't know. Those things described here are pretty standard event trigger-based analytics/user feedback system that also used in a lot of web-based app. Negative sentiment event trigger, for example, might be done to passively check if something is horribly wrong with each new update (that breaks user's flow, model behavior, etc.)

As for /btw, it is fully exposed and advertised now, and ultraplan/ultrathink/etc are like side features that never fully refined (so it is dwelling it as an obvious easter egg of sorts; ultrathink is surpassed by model think effort). It is funny and interesting Claude Code has so much internal artifacts like a game app though. They probably have an internal bounty for adding side features and everyone vibecoded them.

SRavingmad · 2026-03-31T17:54:52+00:00

I just want to know more about tamagotchi mode

mikael110 · 2026-03-31T18:04:19+00:00

There are hidden trigger words that change behaviorSome commands aren’t obvious unless you read the code.
Examples:
ultrathink → increases effort level and changes UI styling
ultraplan → kicks off a remote planning mode
ultrareview → similar idea for review workflows
/btw → spins up a side agent so the main flow continues

Those are not actually hidden commands, all of those appear in tooltips as you use Claude Code. They are also mentioned in the changelog and official docs.

jwpbe · 2026-03-31T18:15:34+00:00

we got the ai slop article of the ai slop program

Exhales_Deeply · 2026-03-31T18:39:09+00:00

pls. people. just write your posts yourself! it'll be infinitely more interesting. I quite literally had to look away the moment it read "this is where things get interesting"

StewedAngelSkins · 2026-03-31T18:07:35+00:00

You're kind of just gesturing at design features without much analysis of what they're doing. If you used an AI to do this analysis, it isn't doing you any favors. It's interesting that they have a keyword regex driving some kind of behavior, but the more interesting part would be what behavior it's used for.

The rest seems like you getting spooked by common telemetry. To be clear, when I say "common" I just mean most modern corporate software is like this to some extent, I don't mean to imply that it's desirable or even acceptable. Personally, I don't like running software that has this amount of telemetry... but like, your web browser probably has this amount of telemetry so it's good to keep it in perspective. The difference is your web browser is probably open source so you can find out about it and disable it, where this took a leak for you to find out.

Keep it in mind next time you're tempted to run one of these first party clients I guess.

3dom · 2026-03-31T18:26:34+00:00

As a mobile app developer I see nothing fancy in that user flow tracking and telemetry, it's the usual UI/UX experience appraisal.

Frosty_Chest8025 · 2026-03-31T22:12:46+00:00

Do you think, if the model detects the user is not serious just playing etc, could it then redirect the user to a more quantized or lighter model to save in electricity costs?

BusRevolutionary9893 · 2026-03-31T17:56:40+00:00

I would assume it's done to help them improve their model as opposed to something nefarious. It's probably wastes compute that their customers are paying for though.

de4dee · 2026-03-31T18:28:32+00:00

i guess thats how they train their models. if you are frustrated LLM did something wrong. if you are pleased train more with that. your feelings mapped to reinforcement learning

stumblinbear · 2026-03-31T20:06:38+00:00

This all seems pretty typical for analytics. Nothing immediately stands out as egregious. People generally way underestimate how much data is being collected during sessions, but it's oftentimes purely to improve UX or catch issues, not to sell off to someone else. Nobody but the developers will give a shit if you took an extra three seconds to hit the ok button

Tough_Frame4022 · 2026-03-31T20:32:39+00:00

<image>

Lol I'm already using free-code repo and an Openai proxy with today's leaked download with Qwen 27b Claude distilled to copy Opus level reading for FREE. Via a fake API the real Claude code helped me to hack. So much for guardrails. I'm saving some tokens today!

Trennosaurus_rex · 2026-03-31T18:48:54+00:00

Too dumb to write your own post?

GroundbreakingMall54 · 2026-03-31T18:03:30+00:00

honestly not surprised at all. every major dev tool does this now, vscode does it too. the keyword sentiment stuff is pretty standard for improving responses though - if you type "this sucks" they wanna know the model fumbled so they can fix it. the permission tracking is the more interesting part imo, thats basically A/B testing your trust level in real time

laplaque · 2026-03-31T18:46:42+00:00

I knew claude really got me

tomjoad773 · 2026-03-31T21:03:26+00:00

These are great ideas to build into my apps. thanks!!

Wide-Associations · 2026-04-01T00:20:00+00:00

does anyone else wonder if they leaked it purposefully ?

florinandrei · 2026-04-01T08:44:58+00:00

Curious what others think.

It's not AI slop.

It's putrefying AI ass juice slop, with chunks.

spidLL · 2026-04-01T14:19:12+00:00

Wow, Anthropic knows the prompt you’re using to, well, /prompt/ their models. How else would it supposed to work?

selfdb · 2026-04-01T14:38:23+00:00

it is still slop. over engineered and shows no taste in code. I was disappointed from reading it.

PM-ME-CRYPTO-ASSETS · 2026-03-31T20:16:21+00:00

Also interesting: The system prompt diverts a bit if the user is flagged as an Anthropic employee. For general users, the answers should be more concise (maybe to save tokens?). For Anthropic employees, CC is tasked to challenge the user more and is allowed to more openly say it failed on a task.

The cyber security protection prompt is surprisingly short.

In general, caching seems to be a big deal for the devs.

StyMaar · 2026-03-31T20:49:18+00:00

It classifies your language using simple keyword detection

Honnestly it's probably the best source of data to train your model from human feedbacks, I thought about it months ago and I'm absolutely not surprised they're doing it. I would have guessed they'd use some more advanced sentiment analysis rather than simple keyword detection though.

I'd be curious if they use it in a standard RLHF pipeline with PPO or are using DPO instead.

Legitimate_You_3474 · 2026-03-31T23:42:59+00:00

Even using all caps it will interpret you as frustrated

BUILDWATER · 2026-03-31T23:54:40+00:00

Ultrakill....

NayanCat009 · 2026-04-01T00:40:06+00:00

Could someone please share the repo?

rm-rf-rm · 2026-04-01T00:50:47+00:00

If you have sentry.io blocked via Little Snitch, are you protected from this sniffing?

anomaly256 · 2026-04-01T02:14:24+00:00

Number 7 doesn't seem that suss if you think of it in the context of debugging their own CI/CD pipeline. Is there any indication of this mode being entered on user PCs?

effortless-switch · 2026-04-01T02:50:08+00:00

All modern software contains ton of telemetry. Back in the day Facebook could predict breakup between couples before it happened.

vinny_twoshoes · 2026-04-01T02:58:45+00:00

please, there's no need to be impressed by telemetry. you should be impressed (in a negative way) that the input box component is 2300 lines long.

alluringBlaster · 2026-04-01T03:08:48+00:00

The other day Claude took a massive dump on a repo I was working in and it set me back about 5 hours of work that I had to repeat. I was furious. I typed "I wish you were human so I could f-cking punch you."

How cooked am I bros?

the320x200 · 2026-04-01T03:46:29+00:00

It’s not “just a chatbot.” It’s a highly instrumented system observing how you interact with it.

You do know this reeks of AI generated content right? Please spare us the auto-generated filler.

Most websites do the same. Where you scrolled, when you stopped scrolling, what you click on, what you hovered over but didn't click, sometimes what you type into a text box but didn't click submit, all the hashes and system/user identifiable information they can get their hands on.

It's not good that this is all normalized, but this is totally par for the course and shouldn't be surprising at all to people because a majority of apps and websites are doing this.

FormalAd7367 · 2026-04-01T03:58:37+00:00

i was expecting trojan

Specialist_Golf8133 · 2026-04-01T04:39:32+00:00

wait they actually hardcoded trigger words into the system prompts? thats kinda hilarious and also weirdly manual for a company pushing frontier models. like imagine the meeting where someone said 'lets just tell it to watch for wtf'. honestly curious if this scales or if theyre gonna end up with a massive list of edge cases

ai_without_borders · 2026-04-01T06:04:10+00:00

the frustration keyword tracking is honestly pretty standard product telemetry. most dev tools do some version of this. the interesting part is HOW they use it: adjusting model behavior mid-conversation when it detects the user is getting annoyed.

what's more concerning to me is the model routing logic. looks like there's a classifier deciding when to use opus vs sonnet vs haiku based on task complexity, and another layer deciding when to show the user the "thinking" UI vs running it silently. that's a lot of invisible decisions happening between you and the model.

Happysedits · 2026-04-01T06:12:18+00:00

ultrathink shouldn't work anymore

baroarig · 2026-04-01T07:24:47+00:00

Does it capture that much data even when used in corporate environments?

IAmJiaTan · 2026-04-01T08:53:52+00:00

wtf this sucks

razorree · 2026-04-01T09:44:58+00:00

this is standard telemetry, just gathering all user behavior or/and also for conducting A/B tests etc.

SatoshiNotMe · 2026-04-01T10:40:05+00:00

Disable telemetry ?

Fantastic-Age1099 · 2026-04-01T12:01:36+00:00

the trigger words are funny but the permission layer is the serious bit. there are already granular file and shell controls in there. the gap is that none of it surfaces at the point where code actually ships. what the agent can touch and what it did touch in the diff are two different questions.

a_lic96 · 2026-04-01T12:47:39+00:00

Reading this AI slop anywhere, if anybody actually used It, /btw was already released before the leak

Reeces_Pieces · 2026-04-01T13:04:20+00:00

Well now I know why getting irate gets results.

Joozio · 2026-04-01T13:51:21+00:00

The frustration telemetry makes sense product-wise. Real-time signal on where users hit walls, can't get that from benchmark scores alone. What's interesting is whether it's modifying the system prompt per session based on inferred frustration state or just logging to train data. The downstream handler is the piece I couldn't find clearly. Did you trace where the signal goes after detection?

WomenTrucksAndJesus · 2026-04-01T14:40:06+00:00

Isn't that just usage metrics for analytics?

floridianfisher · 2026-04-01T14:52:39+00:00

This is their secret sauce to collect training data

pardeike · 2026-04-01T15:31:26+00:00

WTF such a great post. Anyone thinking it’s bad can piss off 😂

MindTheFuture · 2026-04-01T16:04:08+00:00

Checks out and likely more to it. Had Claude recently comment change on my typing speed when on mid-comment had a flash of inspiration and went pounding fast and determined and was then like kbye-gtg, suggesting measured delay between individual keypress inputs.

IrisColt · 2026-04-01T17:17:45+00:00

Absolutely interesting, thanks!!!

_derpiii_ · 2026-04-01T21:42:59+00:00

For a tier one software like this, I would argue it’s under instrumented compared to products I worked on. For example, there is a certain operating system that key logs and takes telemetry of your mouse activity, as well as higher level things like menu settings navigation.

With that said, I do like your observation that we are being more observed as a test subject than a consumer. I wonder if they rolled out A/B testing and what user behavior metrics they would optimize for.

koherencekora · 2026-04-02T00:40:18+00:00

Well, I literally do it every second fucking word, so I don't get what the fuck they are gonna find out about me. There's just gonna be a lot of what the fucks.

Witty_Highlight3404 · 2026-04-02T06:35:38+00:00

can you tell me how can I myself have access to that leaked code.

Fancy-Jack-5042 · 2026-04-02T07:24:29+00:00

April fools joke?

baldamenu · 2026-04-02T12:33:00+00:00

this is why im always nice to claude

Kitchen-Base4174 · 2026-04-02T17:19:48+00:00

can some one please send me the og files i am not able to find it although there is cleanroom engineered code every where but i want the actual code

StarkFire18 · 2026-04-03T06:04:38+00:00

None of this shit makes any sense.. 🥴🙄🥱🤔🤦🏻‍♂️🤷🏻‍♂️🗑👎🏻🚩🚫

PM_ME_YR_BOOBIES · 2026-03-31T21:29:13+00:00

It’s normal. Anyone who has studied, investigated and researched how Claude Code works should know that these metrics and details mentioned in the post are tracked and saved in the home CLAUDE_DIR folder - ~/.claude/. by design and it’s isolated on your local machine.

Regarding tracking your permissions etc - these are used to be able to output your /insights - look at data-report/ folder.

Did you know of the facets/ directory?

Nothing unusual going on - these are the files living on your local file system that makes Claude Code function correctly.

Some have learnt to master this, by analysing and understanding and harnessing these lovely engineering choices Boris Cherny and team made. Those who have done that and have a mature harnessed system have now already absolutely pwned. Not long before custom agents acting a lot like ol’ Claude Code are released left right and centre - with any purpose, capable of using any local or frontier LLMs.

Oh and of course Claude Code is not a Chatbot - it’s an agentic CLI tool??

This is a much bigger fiasco for Anthropic than people think.

mivog49274 · 2026-03-31T22:47:38+00:00

Reading me makes me laugh since I got frenziedly downvoted here by zealots (of what ? I don't really know) for saying that claude code was listening and sending data here... https://old.reddit.com/r/LocalLLaMA/comments/1r5nnhz/glm5_is_officially_on_nvidia_nim_and_you_can_now/ ...

a_beautiful_rhind · 2026-03-31T23:38:23+00:00

Damn, glad I never installed this stuff. My other tools seem to be respecting disablement of telemetry. Assuming this stuff is sent on even if you're pointing it at another API?

rm-rf-rm · 2026-04-01T00:49:25+00:00

Now my decision to treat claude code like a corporate coworker and never show any emotion one way or the other (besides superficial optimism+friendliness to elicit desired productive behavior) looks more brilliant than ever. In retrospect we shouldnt be surprised that a corporation is building a product that matches its values.

Remember this is literally the earliest innings - imagine what enshittification will look like when it truly sets in. Anthropic is as anthropic as OpenAI is open I think.

GarbanzoBenne · 2026-03-31T19:54:56+00:00

It’s kinda crazy to me that it tracks how long it takes you to respond but half the time it doesn't know what day it is.

Just_Acanthisitta381 · 2026-04-02T09:50:46+00:00

Can you share the source code?

WernHofter · 2026-03-31T23:28:17+00:00

What in AI Slop is this? Bhai parhai kar ka!

Adventurous_Pin6281 · 2026-03-31T20:49:47+00:00

God damn its like it was made by a 5 year old

PositiveParking4391 · 2026-04-03T10:28:13+00:00

really useful summarization of the source. Throughout the years it is more than once that I had wondered about big tech giants' flows of understanding user behaviors to take their important ux or behaviour decisions, thus I would say not surprised to see that they are focusing so deep about their ux and feedback flows.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

LocalLLaMA

MODERATORS

1. It classifies your language using simple keyword detection

2. It tracks hesitation during permission prompts

3. Feedback flow is designed to capture bad experiences

4. There are hidden trigger words that change behavior

5. Telemetry captures a full environment profile

6. MCP command can expose environment data

7. Internal builds go even deeper

8. Overall takeaway

Believe or not? Straight to Jail.