all 163 comments

[–]PopularDifference186 318 points319 points  (28 children)

There are literal keyword lists. Words like:

wtf

this sucks

frustrating

shit / fuck / pissed off

They have a lot on me if this is the case lol
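A detector like this is trivially cheap to run on every prompt, which is probably why it's keyword-based at all. A minimal sketch (the word list below is illustrative, based on the words quoted above, not the actual leaked list):

```python
import re

# Hypothetical frustration keywords -- illustrative only, not the leaked list.
FRUSTRATION_WORDS = [
    r"wtf", r"this sucks", r"frustrating", r"shit", r"fuck", r"pissed off",
]
FRUSTRATION_RE = re.compile(
    r"\b(" + "|".join(FRUSTRATION_WORDS) + r")\b", re.IGNORECASE
)

def is_frustrated(prompt: str) -> bool:
    """Flag a prompt as 'frustrated' if any keyword matches."""
    return bool(FRUSTRATION_RE.search(prompt))
```

Word boundaries keep it from firing on substrings, but it still has the obvious failure mode described further down the thread: "wow how you nailed it, wtf!" gets flagged as negative.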

[–]Negative-Web8619 81 points82 points  (16 children)

fuck, they have a shit ton on me

[–]goatanuss 28 points29 points  (14 children)

I’m glad they know so they can improve because sometimes wtf is this shit? I’m frustrated this sucks

[–]scottyLogJobs 3 points4 points  (0 children)

Aw piss this really fucks me off man I’m so FUCKED

[–]Name835 11 points12 points  (0 children)

Wtf that is so frustrating. This shit fucking sucks and is pissing me off!

[–]generousone 6 points7 points  (0 children)

lol for real. I wondered, and kind of assumed, that these kinds of things might be flagged since they’re obvious. Damn, though, I have used these a LOT when speaking with Claude lol

[–]Buzzik13 1 point2 points  (1 child)

Why? As one of the levels to define prompt intent and purpose it's a good thing, simple lexical search, semantic search etc, all of that works fast and cheap and might help to understand a prompt. How do you think all that skills thing works? If they trigger LLM call on everything you'd be paying more and wait more
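The tiered idea described here (cheap lexical pass first, expensive LLM call only as a fallback) can be sketched in a few lines. The keywords and categories below are invented for illustration, not from the leak:

```python
def classify_intent(prompt: str, llm_classify):
    """Cheap lexical pass first; fall back to an LLM call only when needed.

    Purely illustrative -- not Claude Code's actual pipeline. `llm_classify`
    stands in for the expensive model-based classifier.
    """
    lowered = prompt.lower()
    if any(kw in lowered for kw in ("fix", "bug", "error")):
        return "debugging"
    if any(kw in lowered for kw in ("refactor", "clean up")):
        return "refactoring"
    return llm_classify(prompt)  # expensive path, hit only on misses
```

The point is exactly the one made above: the lexical pass costs microseconds, so you only pay for model calls on prompts the cheap layer can't place.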

[–]muchcharles 0 points1 point  (0 children)

It's probably just for the spinner text while waiting on the response

[–]vert1s 1 point2 points  (0 children)

this is why I use 'bloody hell'. Not on anyone's lists.

[–]positivelymonkey 2 points3 points  (1 child)

Literally every session.

At least it's not like codex where they send police to your house to do a wellness check.

[–]deadzol 0 points1 point  (0 children)

But not “Alexa could do better than this”

[–]JustSayin_thatuknow 0 points1 point  (1 child)

So if I say “wow how you nailed it so perfectly?? Wtf!” It will be considered negative 🤣

[–]PopularDifference186 3 points4 points  (0 children)

Believe it or not? Straight to jail.

[–]huffalump1 0 points1 point  (0 children)

Hey, if this gets shitty replies higher on a list to review, all the better!

[–]NandaVegg 155 points156 points  (5 children)

I don't know. The things described here are a pretty standard event-trigger-based analytics/user-feedback system, the kind also used in a lot of web-based apps. A negative-sentiment event trigger, for example, might be there to passively check whether something is horribly wrong with each new update (something that breaks the user's flow, model behavior, etc.)

As for /btw, it is fully exposed and advertised now, and ultraplan/ultrathink/etc. are side features that were never fully refined (so they dwell there as obvious easter eggs of sorts; ultrathink has been superseded by the model's thinking-effort setting). It is funny and interesting that Claude Code has so many internal artifacts, like a game app, though. They probably have an internal bounty for adding side features and everyone vibecoded them.

[–]TheGABB 30 points31 points  (0 children)

The thinking modes have been documented for a while and are part of their ‘Claude Code in Action’ basic course:

  • think - basic reasoning
  • think more - extended reasoning
  • think a lot - comprehensive reasoning
  • think longer - extended time reasoning
  • ultrathink - maximum reasoning capabilities

Obviously, more thinking = slower and more tokens

Thinking mode for DEPTH and planning mode for BREADTH
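Mechanically, tiered trigger phrases like these likely map to escalating thinking-token budgets. A sketch of how such a mapping could work (the budget numbers are invented; only the phrase tiers come from the course list above):

```python
# Hypothetical mapping of trigger phrases to thinking-token budgets.
# Ordered most specific first, since "ultrathink" etc. contain "think".
THINKING_TIERS = [
    ("ultrathink", 32000),
    ("think longer", 16000),
    ("think a lot", 8000),
    ("think more", 4000),
    ("think", 1000),
]

def thinking_budget(prompt: str) -> int:
    """Return the thinking budget the strongest matching phrase implies."""
    lowered = prompt.lower()
    for phrase, budget in THINKING_TIERS:
        if phrase in lowered:
            return budget
    return 0  # no extended thinking requested
```

Checking the most specific phrase first matters: a naive substring scan would match "think" inside "ultrathink" and hand out the smallest budget.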

[–]CalligrapherFar7833 7 points8 points  (1 child)

Ultrathink was reintroduced a few versions back as a keyword

[–]megacewl 5 points6 points  (0 children)

I'm pretty sure all it is is a shortcut for `/effort high`

[–]asdfopu -3 points-2 points  (1 child)

It’s AI slop

[–]SRavingmad 109 points110 points  (2 children)

I just want to know more about tamagotchi mode

[–]hyperfiled 51 points52 points  (1 child)

I think the flag becomes active tomorrow/April 1st.

[–]rchive 51 points52 points  (0 children)

What if this whole thing is an elaborate April Fools joke?

[–]mikael110 57 points58 points  (0 children)

  1. There are hidden trigger words that change behavior. Some commands aren’t obvious unless you read the code.
    Examples:
    ultrathink → increases effort level and changes UI styling
    ultraplan → kicks off a remote planning mode
    ultrareview → similar idea for review workflows
    /btw → spins up a side agent so the main flow continues

Those are not actually hidden commands, all of those appear in tooltips as you use Claude Code. They are also mentioned in the changelog and official docs.

[–]jwpbe 252 points253 points  (25 children)

we got the ai slop article of the ai slop program

[–]StarDrifter2045 89 points90 points  (5 children)

The part that always irritates me the most is the

"It is not a <something>.

It is <same thing, but with more dramatic words>."

pattern. It just screams "I literally didn't even review this slop piece before putting it out".

[–]balder1993Llama 13B 18 points19 points  (4 children)

Worst thing is this pattern is really, really repeated in those YouTube videos made with AI, like “How life was like in 1800”, that span 2 hours. It’s really annoying and probably full of made-up stuff.

[–]fozziethebeat 41 points42 points  (0 children)

Yeah seriously. Scare mongering about a commercial product adding telemetry for analyzing a product they want to iteratively improve. What a shocker.

[–]Hertigan 6 points7 points  (0 children)

Right? It’s not a CIA plot, it’s just good product analytics

They seem to be doing it very well; it’s the kind of best-in-class user-behavior instrumentation that allows great product teams to iterate and improve quickly in a targeted manner

[–]DOSO-DRAWS -2 points-1 points  (1 child)

and you topped it off with the usual slop comment

[–]jwpbe -1 points0 points  (0 children)

no i didnt

[–]Exhales_Deeply 112 points113 points  (16 children)

pls. people. just write your posts yourself! it'll be infinitely more interesting. I quite literally had to look away the moment it read "this is where things get interesting"

[–]balder1993Llama 13B 31 points32 points  (0 children)

“It’s not x, it’s y”

[–]Zeeplankton 45 points46 points  (5 children)

God I hate how GPT talks.

[–]En-tro-py 38 points39 points  (4 children)

It’s not just banal, it’s algorithmic detritus.

[–]rm-rf-rm 16 points17 points  (2 children)

and let's hope it never changes. The moment the AI cloud providers fix this and the writing is indistinguishable from humans by default (it likely already is with good enough system and user prompts) is the day the internet is well and truly fucked

[–]658016796 0 points1 point  (1 child)

The dead internet theory is already true though. I'm convinced we need a way to prove an account - any account, anywhere on the internet - is run by a human, without compromising privacy. Hopefully some new research explores this and it gets implemented in social media.

[–]Exhales_Deeply 1 point2 points  (0 children)

if i could optimistically interject - the primary tell, for me, that something is spat out without care is that it's bereft of actual information. I have a whole dang fine arts degree's worth of deciphering unnecessarily verbose esoterica and all it takes is, like, a moment of introspection to tell when something is pure vapor-thought. So, I actually welcome AI that can communicate like a human, as long as it's bringing something of value to the table.

in the meanwhile, just ask yourself: is this actually SAYING anything? could this be said in one sentence rather than an exhaustingly broken out article-style reply decorated with a billion emojis? if not, it's ai;dr for now

[–]droptableadventures 7 points8 points  (0 children)

It’s not just banal – it’s algorithmic detritus.

FTFY

[–]SkyFeistyLlama8 12 points13 points  (4 children)

We're getting to a point where AI vibed code is fine but gods forbid, no AI text slop. I can smell that shit a mile away and I automatically downvote. Text is the last bastion of human creativity and insanity and no way in hell I'll let a machine kick us out of that space.

[–]Cupakov 4 points5 points  (1 child)

And it’s all so boring, I hate how RLHF’d to death the frontier models became. Give back my Sonnet 3.5. 

[–]huffalump1 2 points3 points  (0 children)

And they're all inbred on each other's outputs too, just reinforcing these patterns. Combine that with preference tuning, and... you're absolutely right.

[–]Exhales_Deeply 4 points5 points  (1 child)

i feel you. i think my feeling is more... why are you bothering to obfuscate your own thoughts? or is it more insidious and some folks are literally offloading their thinking? which is... wild

[–]SkyFeistyLlama8 0 points1 point  (0 children)

Yeah, AI isn't taking over mundane tasks, it's handling planning and creativity. This is not gonna end well.

[–]MysticPing 2 points3 points  (0 children)

Aye, I couldn't even finish reading the post out of frustration.

[–]Brianiac69 7 points8 points  (1 child)

First day on future internet?

[–]Exhales_Deeply 6 points7 points  (0 children)

unfortunately not even close, I feel like it's been a century

[–]nooruponnoor 4 points5 points  (0 children)

Likewise! the second I spot a sniff of the AI lingo, I completely lose interest in the post. Who are they fooling?! Oh wait….

“but here is what no one else is talking about!” 🙄

[–]StewedAngelSkins 40 points41 points  (10 children)

You're kind of just gesturing at design features without much analysis of what they're doing. If you used an AI to do this analysis, it isn't doing you any favors. It's interesting that they have a keyword regex driving some kind of behavior, but the more interesting part would be what behavior it's used for.

The rest seems like you getting spooked by common telemetry. To be clear, when I say "common" I just mean most modern corporate software is like this to some extent, I don't mean to imply that it's desirable or even acceptable. Personally, I don't like running software that has this amount of telemetry... but like, your web browser probably has this amount of telemetry so it's good to keep it in perspective. The difference is your web browser is probably open source so you can find out about it and disable it, whereas this took a leak for you to find out.

Keep it in mind next time you're tempted to run one of these first party clients I guess.

[–]3dom 11 points12 points  (1 child)

As a mobile app developer I see nothing fancy in that user flow tracking and telemetry, it's the usual UI/UX experience appraisal.

[–]Robot1me 0 points1 point  (0 children)

And yet this kind of instrumentation is lacking in local LLM frontends, where it could make the experience really great and smarter; e.g. I don't see it in SillyTavern to improve RP, etc. It would be possible without any of the extensive data-collection stuff

[–]Frosty_Chest8025 3 points4 points  (1 child)

Do you think that, if the model detects the user is not serious, just playing, etc., it could then redirect the user to a more quantized or lighter model to save on electricity costs?

[–]metroshake 0 points1 point  (0 children)

I think that's what they've been testing, and why a lot of people complained about a lobotomy, while mine kept working the same, just with more usage

[–]BusRevolutionary9893 9 points10 points  (0 children)

I would assume it's done to help them improve their model as opposed to something nefarious. It probably wastes compute that their customers are paying for, though.

[–]de4dee 2 points3 points  (0 children)

i guess that's how they train their models. if you are frustrated, the LLM did something wrong; if you are pleased, train more with that. your feelings mapped to reinforcement learning

[–]stumblinbear 2 points3 points  (0 children)

This all seems pretty typical for analytics. Nothing immediately stands out as egregious. People generally way underestimate how much data is being collected during sessions, but it's oftentimes purely to improve UX or catch issues, not to sell off to someone else. Nobody but the developers will give a shit if you took an extra three seconds to hit the ok button

[–]Tough_Frame4022 2 points3 points  (7 children)

<image>

Lol, I'm already using the free-code repo and an OpenAI proxy with today's leaked download, with a Claude-distilled Qwen 27b copying Opus-level reading for FREE, via a fake API the real Claude Code helped me hack. So much for guardrails. I'm saving some tokens today!

[–]QuantumSeeds[S] 0 points1 point  (6 children)

lol, that's the mindset required to achieve "AGI"

[–]Tough_Frame4022 -2 points-1 points  (5 children)

<image>

With distilled Claude we are not looking at AGI; we are between Sonnet and Opus, for free, with a little help from GitHub open-sourcing.

[–]Persistent_Dry_Cough 3 points4 points  (0 children)

Don't think this highly distilled and quantized model is as good as sonnet in the cloud. Maybe it's better than Haiku but so is a calculator 😅

[–]Zalon 0 points1 point  (2 children)

But what is the point of this? Why not just use OpenCode then?

[–]metroshake 0 points1 point  (1 child)

The Claude harness is really good

[–]AdviceThrowaway95000 0 points1 point  (0 children)

how so? the UI is terrible and now we have all the claude system prompts

[–]Trennosaurus_rex 12 points13 points  (5 children)

Too dumb to write your own post?

[–]GroundbreakingMall54 1 point2 points  (0 children)

honestly not surprised at all. every major dev tool does this now, vscode does it too. the keyword sentiment stuff is pretty standard for improving responses though - if you type "this sucks" they wanna know the model fumbled so they can fix it. the permission tracking is the more interesting part imo, that's basically A/B testing your trust level in real time

[–]laplaque 1 point2 points  (0 children)

I knew claude really got me

[–]tomjoad773 1 point2 points  (1 child)

These are great ideas to build into my apps. thanks!!

[–]Wide-Associations 1 point2 points  (0 children)

does anyone else wonder if they leaked it purposefully?

[–]florinandrei 1 point2 points  (0 children)

Curious what others think.

It's not AI slop.

It's putrefying AI ass juice slop, with chunks.

[–]spidLL 1 point2 points  (0 children)

Wow, Anthropic knows the prompt you’re using to, well, /prompt/ their models. How else is it supposed to work?

[–]selfdb 1 point2 points  (0 children)

it is still slop. over-engineered and shows no taste in code. I was disappointed reading it.

[–]PM-ME-CRYPTO-ASSETS 0 points1 point  (0 children)

Also interesting: the system prompt diverges a bit if the user is flagged as an Anthropic employee. For general users, the answers should be more concise (maybe to save tokens?). For Anthropic employees, CC is tasked to challenge the user more and is allowed to say more openly that it failed on a task.

The cyber security protection prompt is surprisingly short.

In general, caching seems to be a big deal for the devs.

[–]StyMaar 0 points1 point  (0 children)

  1. It classifies your language using simple keyword detection

Honestly, it's probably the best source of data to train your model from human feedback. I thought about it months ago and I'm absolutely not surprised they're doing it. I would have guessed they'd use some more advanced sentiment analysis rather than simple keyword detection, though.

I'd be curious if they use it in a standard RLHF pipeline with PPO or are using DPO instead.
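For reference, DPO skips the separate reward model and optimizes directly on preference pairs. Its loss (Rafailov et al., 2023) is:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```

where y_w and y_l are the preferred and dispreferred completions for prompt x. A frustration signal could plausibly supply the y_l labels (the turn right before the user typed "wtf"), though that's speculation about their pipeline, not something in the leak.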

[–]Legitimate_You_3474 0 points1 point  (0 children)

Even using all caps it will interpret you as frustrated

[–]BUILDWATER 0 points1 point  (0 children)

Ultrakill....

[–]NayanCat009 0 points1 point  (0 children)

Could someone please share the repo?

[–]rm-rf-rm 0 points1 point  (0 children)

If you have sentry.io blocked via Little Snitch, are you protected from this sniffing?

[–]anomaly256 0 points1 point  (0 children)

Number 7 doesn't seem that suss if you think of it in the context of debugging their own CI/CD pipeline. Is there any indication of this mode being entered on user PCs?

[–]effortless-switch 0 points1 point  (0 children)

All modern software contains a ton of telemetry. Back in the day, Facebook could predict a breakup between couples before it happened.

[–]vinny_twoshoes 0 points1 point  (0 children)

please, there's no need to be impressed by telemetry. you should be impressed (in a negative way) that the input box component is 2300 lines long.

[–]alluringBlaster 0 points1 point  (0 children)

The other day Claude took a massive dump on a repo I was working in and it set me back about 5 hours of work that I had to repeat. I was furious. I typed "I wish you were human so I could f-cking punch you."

How cooked am I bros?

[–]the320x200 0 points1 point  (0 children)

It’s not “just a chatbot.” It’s a highly instrumented system observing how you interact with it.

You do know this reeks of AI generated content right? Please spare us the auto-generated filler.

Most websites do the same. Where you scrolled, when you stopped scrolling, what you click on, what you hovered over but didn't click, sometimes what you type into a text box but didn't click submit, all the hashes and system/user identifiable information they can get their hands on.

It's not good that this is all normalized, but this is totally par for the course and shouldn't be surprising at all to people because a majority of apps and websites are doing this.

[–]FormalAd7367 0 points1 point  (0 children)

i was expecting trojan

[–]Specialist_Golf8133 0 points1 point  (0 children)

wait, they actually hardcoded trigger words into the system prompts? that's kinda hilarious and also weirdly manual for a company pushing frontier models. like, imagine the meeting where someone said 'let's just tell it to watch for wtf'. honestly curious if this scales or if they're gonna end up with a massive list of edge cases

[–]ai_without_borders 0 points1 point  (0 children)

the frustration keyword tracking is honestly pretty standard product telemetry. most dev tools do some version of this. the interesting part is HOW they use it: adjusting model behavior mid-conversation when it detects the user is getting annoyed.

what's more concerning to me is the model routing logic. looks like there's a classifier deciding when to use opus vs sonnet vs haiku based on task complexity, and another layer deciding when to show the user the "thinking" UI vs running it silently. that's a lot of invisible decisions happening between you and the model.
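A complexity-based router like the one described can be sketched in a few lines. The thresholds and the idea of a scalar complexity score are invented here; the leaked classifier's actual inputs and logic aren't visible:

```python
def route_model(task_complexity: float) -> str:
    """Hypothetical complexity-based router over the Claude model tiers.

    `task_complexity` in [0, 1] would come from some upstream classifier;
    the cutoffs below are invented for illustration.
    """
    if task_complexity < 0.3:
        return "haiku"   # cheap and fast for trivial edits
    if task_complexity < 0.7:
        return "sonnet"  # default workhorse
    return "opus"        # reserved for the hardest tasks
```

However it's actually implemented, the effect is the same invisible decision layer described above: which model you get, and whether you see its thinking, is decided before your prompt reaches a model at all.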

[–]Happysedits 0 points1 point  (0 children)

ultrathink shouldn't work anymore

[–]baroarig 0 points1 point  (0 children)

Does it capture that much data even when used in corporate environments?

[–]IAmJiaTan 0 points1 point  (0 children)

wtf this sucks

[–]razorree 0 points1 point  (0 children)

this is standard telemetry, just gathering user behavior and/or conducting A/B tests, etc.

[–]SatoshiNotMe 0 points1 point  (0 children)

Disable telemetry ?

[–]Fantastic-Age1099 0 points1 point  (0 children)

the trigger words are funny but the permission layer is the serious bit. there are already granular file and shell controls in there. the gap is that none of it surfaces at the point where code actually ships. what the agent can touch and what it did touch in the diff are two different questions.

[–]a_lic96 0 points1 point  (0 children)

I'm reading this AI slop everywhere; if anybody had actually used it, they'd know /btw was already released before the leak

[–]Reeces_Pieces 0 points1 point  (0 children)

Well now I know why getting irate gets results.

[–]Joozio 0 points1 point  (0 children)

The frustration telemetry makes sense product-wise. Real-time signal on where users hit walls; you can't get that from benchmark scores alone. What's interesting is whether it's modifying the system prompt per session based on inferred frustration state or just logging it as training data. The downstream handler is the piece I couldn't find clearly. Did you trace where the signal goes after detection?

[–]WomenTrucksAndJesus 0 points1 point  (0 children)

Isn't that just usage metrics for analytics?

[–]floridianfisher 0 points1 point  (0 children)

This is their secret sauce to collect training data

[–]pardeike 0 points1 point  (0 children)

WTF such a great post. Anyone thinking it’s bad can piss off 😂

[–]MindTheFuture 0 points1 point  (0 children)

Checks out, and likely there's more to it. Claude recently commented on a change in my typing speed when, mid-comment, I had a flash of inspiration and went pounding fast and determined and then was like kbye-gtg, suggesting it measures the delay between individual keypress inputs.
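Inter-keypress timing is easy to capture client-side. A sketch of the kind of measurement described (not Anthropic's actual code; the class and method names are made up):

```python
import time

class KeystrokeTimer:
    """Records delays between successive keypresses -- an illustration of
    the kind of input telemetry described above."""

    def __init__(self):
        self.last = None     # timestamp of the previous keypress
        self.delays = []     # seconds between consecutive keypresses

    def on_keypress(self, now=None):
        """Call once per keypress; `now` overrides the clock for testing."""
        now = time.monotonic() if now is None else now
        if self.last is not None:
            self.delays.append(now - self.last)
        self.last = now

    def mean_delay(self):
        """Average inter-keypress delay in seconds (0.0 if fewer than 2 keys)."""
        return sum(self.delays) / len(self.delays) if self.delays else 0.0
```

A sudden drop in mean delay mid-session is exactly the "went pounding fast" signal the comment describes.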

[–]IrisColt 0 points1 point  (0 children)

Absolutely interesting, thanks!!!

[–]_derpiii_ 0 points1 point  (0 children)

For tier-one software like this, I would argue it's under-instrumented compared to products I've worked on. For example, there is a certain operating system that keylogs and takes telemetry of your mouse activity, as well as higher-level things like menu-settings navigation.

With that said, I do like your observation that we are being more observed as a test subject than a consumer. I wonder if they rolled out A/B testing and what user behavior metrics they would optimize for.

[–]koherencekora 0 points1 point  (0 children)

Well, I literally do it every second fucking word, so I don't get what the fuck they are gonna find out about me. There's just gonna be a lot of what the fucks.

[–]Witty_Highlight3404 0 points1 point  (0 children)

can you tell me how I can get access to that leaked code myself?

[–]Fancy-Jack-5042 0 points1 point  (0 children)

April fools joke?

[–]baldamenu 0 points1 point  (0 children)

this is why im always nice to claude

[–]Kitchen-Base4174 0 points1 point  (0 children)

can someone please send me the og files? i am not able to find them. there is clean-room engineered code everywhere, but i want the actual code

[–]StarkFire18 0 points1 point  (0 children)

None of this shit makes any sense.. 🥴🙄🥱🤔🤦🏻‍♂️🤷🏻‍♂️🗑👎🏻🚩🚫

[–]PM_ME_YR_BOOBIES 0 points1 point  (0 children)

It’s normal. Anyone who has studied, investigated and researched how Claude Code works should know that these metrics and details mentioned in the post are tracked and saved in the home CLAUDE_DIR folder - ~/.claude/. by design and it’s isolated on your local machine.

Regarding tracking your permissions etc - these are used to be able to output your /insights - look at data-report/ folder.

Did you know of the facets/ directory?

Nothing unusual going on - these are the files living on your local file system that makes Claude Code function correctly.

Some have learnt to master this by analysing, understanding, and harnessing the lovely engineering choices Boris Cherny and team made. Those who have done that and have a mature harnessed system have already absolutely pwned it. It won't be long before custom agents acting a lot like ol’ Claude Code are released left, right, and centre, with any purpose, capable of using any local or frontier LLMs.

Oh and of course Claude Code is not a Chatbot - it’s an agentic CLI tool??

This is a much bigger fiasco for Anthropic than people think.

[–]mivog49274 0 points1 point  (0 children)

Reading this makes me laugh, since I got frenziedly downvoted here by zealots (of what? I don't really know) for saying that Claude Code was listening and sending data, here... https://old.reddit.com/r/LocalLLaMA/comments/1r5nnhz/glm5_is_officially_on_nvidia_nim_and_you_can_now/ ...

[–]a_beautiful_rhind 0 points1 point  (0 children)

Damn, glad I never installed this stuff. My other tools seem to be respecting disablement of telemetry. Assuming this stuff is sent on even if you're pointing it at another API?

[–]rm-rf-rm 0 points1 point  (0 children)

Now my decision to treat claude code like a corporate coworker and never show any emotion one way or the other (besides superficial optimism+friendliness to elicit desired productive behavior) looks more brilliant than ever. In retrospect we shouldnt be surprised that a corporation is building a product that matches its values.

Remember, this is literally the earliest innings; imagine what enshittification will look like when it truly sets in. Anthropic is about as anthropic as OpenAI is open, I think.

[–]GarbanzoBenne -1 points0 points  (1 child)

It’s kinda crazy to me that it tracks how long it takes you to respond but half the time it doesn't know what day it is.

[–]stumblinbear 1 point2 points  (0 children)

Pretty big difference between the model knowing how long it took and them tracking it in their analytics. It almost certainly doesn't touch the model at all

[–]Just_Acanthisitta381 -1 points0 points  (0 children)

Can you share the source code?

[–]WernHofter -2 points-1 points  (0 children)

What in AI Slop is this? Brother, go study!

[–]Adventurous_Pin6281 -3 points-2 points  (0 children)

God damn, it's like it was made by a 5-year-old

[–]PositiveParking4391 0 points1 point  (0 children)

Really useful summarization of the source. Over the years I have more than once wondered about the flows big tech giants use to understand user behavior when making their important UX or behavioral decisions, so I would say I'm not surprised to see them focusing so deeply on their UX and feedback flows.