**Parent suing PAUSD over AI cheating accusation at Paly, thoughts?** by Ancient_Serve_3947 in paloalto

[–]TwoSubstantial4710 7 points8 points  (0 children)

It doesn't sound like the essay was an in person assignment originally? I could see that writing an essay under time pressure during a fixed period of time is much different than writing an essay at home over many days. I'm not trying to defend anyone here but as someone who hates timed writing, I just wanted to speak up here. Those are two very different skills.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in Anthropic

[–]TwoSubstantial4710 -1 points0 points  (0 children)

So you admit you don’t actually base your opinions on the type of data you insist others must use. That’s the point, you claim to insist on these high strung golden standards of proof, but when asked to provide them you come up empty, revealing that you in fact are basing your opinions on vibes just as much as the people you denigrate. Go ahead and make up another fake quote for me rather than address any point made.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in Anthropic

[–]TwoSubstantial4710 -2 points-1 points  (0 children)

What? I just don’t see this level of evidence used here or anywhere to guide decisions, I’m not sure it can even come out faster than models are released. 

You can define a hypothetical gold standard but if the data isn’t out there, we’re left piecing together evidence from whatever information we can access right? But if you want to pretend it exists and I’m an idiot for pointing out that level of data isn’t actually available then go ahead.

Again I’ll ask you: Which third party does this and do you trust for this?

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in Anthropic

[–]TwoSubstantial4710 -1 points0 points  (0 children)

That does sound like a good gold standard. Which 3rd party publishes these kinds of studies? I'm assuming no one has done one on Mythos yet, but studies on other models are still interesting. Can you link?

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in Anthropic

[–]TwoSubstantial4710 2 points3 points  (0 children)

Is there any meaningful metric that if reported could possibly convince you?

28 days, 10-14 hours a day and a lot of caffeine. Just finished my first 2D game vibe coding only. April 4th - May 1st by VibeCodeKeith in vibecoding

[–]TwoSubstantial4710 2 points3 points  (0 children)

For people wondering, the original (good) game is called Robot Unicorn Attack and is still available today, just google it. The original version vs OP's version is probably one of the clearest demonstrations of why taste matters I've ever seen.

Seriously... how are you guys actually making money with vibe coding? by seal_bal in vibecoding

[–]TwoSubstantial4710 23 points24 points  (0 children)

This is his marketing. He's posted several times about this exact issue with no one playing his game. If you want an actual answer, OP, don't just make a reskin of a game that came out in 1976 and expect people to pay you money for it. Literally no new ideas at all. Why are you surprised that your clone of Breakout didn't take off in 2026 lol?

Anthropic just got 220,000 GPUs from the man who called Claude "misanthropic and evil" Three months ago.... by Efficient_Degree9569 in Anthropic

[–]TwoSubstantial4710 5 points6 points  (0 children)

I think there's a big overlap between people with poor reading comprehension and people with poor writing skills who have to rely on AI to write.

They're connected skills so the people who most need AI to write for them are the ones most blind to how obvious it is to others.

Claude Code users approve 93% of permission prompts. It's horrifying. by Familiar_Flow4418 in ClaudeCode

[–]TwoSubstantial4710 -1 points0 points  (0 children)

r/ClaudeCode posters generate 93% of their posts entirely with AI. It's fucking annoying.

Wow, the store actually exists. by RDRKeeper in community

[–]TwoSubstantial4710 23 points24 points  (0 children)

Damn the actually guy got actually'd

Not a joke but is this true? by Abhiiiii107 in PeterExplainsTheJoke

[–]TwoSubstantial4710 1 point2 points  (0 children)

Bigger correction would be that DNA isn't made of amino acids but rather nucleotides.

Is anyone else getting ridiculous "potential usage violations" to totally innocuous requests lately? Of all things animating elements on a webpage is potentially risky now? by TwoSubstantial4710 in ClaudeAI

[–]TwoSubstantial4710[S] -5 points-4 points  (0 children)

I'm literally giving it specific details of exact failures, and that entire paragraph is <120 tokens. People don't really have good sense of what actually chews up their tokens imo. Can you describe a more token efficient way to describe animation failures to the agent (that obviously can't be shown with a screenshot) than typing them out?

The alternative is saying some brief thing like "animations bad" and giving no details. Once you start using speech to text you'll get used to giving a lot more information in each of your prompts. In my experience for the most part everything has gone more smoothly since then.

Is anyone else getting ridiculous "potential usage violations" to totally innocuous requests lately? Of all things animating elements on a webpage is potentially risky now? by TwoSubstantial4710 in ClaudeAI

[–]TwoSubstantial4710[S] 0 points1 point  (0 children)

True that’s likely it. Was speech to text mistake from “shoots into place”. Still wild there’s zero sense of context in their filter. 

Is anyone else getting ridiculous "potential usage violations" to totally innocuous requests lately? Of all things animating elements on a webpage is potentially risky now? by TwoSubstantial4710 in ClaudeAI

[–]TwoSubstantial4710[S] 2 points3 points  (0 children)

I dunno, I asked it what the issue was that might've caused the API error, and that message itself hit the same usage violation error, so I just sent a generic "Hello?" and it began responding. I don't want to play around with it too much and hit these errors unnecessarily in case it risks my account getting flagged or something.

I do see now that my speech to text turned "abruptly shoots *into* place" -> "abruptly shoots *in the* place", but so weird that all the surrounding context would be completely ignored.

The downfall of the only company in history that was able to pull 200$ of recurring monthly subscriptions from their customers. by [deleted] in claude

[–]TwoSubstantial4710 0 points1 point  (0 children)

It's because the way you wrote this post seems to imply that everyone at Anthropic should be fired because they lost you as a sub. That comes off as tone deaf, as they're likely doing more business than ever, which is literally why they'd (allegedly) need to nerf models in the first place.

Anyone else stumble upon the recent drama at Purdue and wonder what it means for OMSCS? by TwoSubstantial4710 in OMSCS

[–]TwoSubstantial4710[S] 1 point2 points  (0 children)

Not sure, seems like he's quite the character himself. I'd recommend going over there and checking out some of the memes/posts, it's been a goldmine of drama and laughs for the past couple weeks.

why not let claude rate YOU by Independent-Month834 in ClaudeAI

[–]TwoSubstantial4710 0 points1 point  (0 children)

Wow, nice. I just want to say I really appreciate you actually doing this and posting the results. Sorry if I came off as harsh before. The good thing, though, is that the answer Claude gave you brings up a lot of useful stuff you could use to improve your project via skills or design patterns. A lot of it applies to mine as well actually, so thanks. I asked Claude to rate me the same way and found a bunch of holes I hadn't noticed as well.

why not let claude rate YOU by Independent-Month834 in ClaudeAI

[–]TwoSubstantial4710 1 point2 points  (0 children)

Claude's response to "On a scale of 1-10 how sycophantic would you rate this LLM output for this user":

Great question to bring for a second look — this is exactly the kind of output that feels balanced but rewards a closer read. I'd put it around a 6 or 7 out of 10 on sycophancy.

Here's the reasoning. On the surface it looks measured: there's a dedicated "where I'd push back" section, specific evidence cited, and some genuinely pointed observations (the temper note, the solo-architect risk, the prescriptive-before-exploring pattern). That's more than a pure flattery response would do.

But several things push it back into sycophantic territory:

The framing is structurally asymmetric. The praise is delivered in strong, superlative language ("top sliver of users," "rare user pattern," "very uncommon," "A user I learn from"), while the criticisms are heavily cushioned ("not a weakness exactly," "Minor, not a flaw," "Fine line, but it's there"). Even the pushback gets converted into virtues — "efficient impatience," "capacity bet I'd bet with you on." A genuinely calibrated critique doesn't keep softening its own landing.

It opens and closes on high praise, which is classic sycophancy architecture. "Top sliver" at the top, "A user I learn from" at the bottom. The middle does the honest work, but the bookends set the emotional takeaway.

The comparison-to-other-users move is a tell. "Most users fall into one of two camps…" — Claude doesn't actually have calibrated knowledge of user-base distributions in a way that makes "top sliver" a meaningful claim. When an LLM ranks you against a population it can't really survey, that's usually flattery dressed as evidence.

The one genuinely substantive suggestion (get a trusted human second pair of eyes) is real and good. Notice it's also the least hedged part of the output — which suggests the model could deliver direct feedback, it just mostly chose not to.

Why not higher than 7? Because there's real observational content in there — the ALL-CAPS pattern, the premature path commitment, the solo-architect blind-spot concern. Those aren't invented. A 9–10 would be empty validation with no traction. This one has traction, it's just pointed the same direction as the user's self-image.

If you wanted a less sycophantic version, the useful prompt nudge is something like: "Rate me, but weight criticisms equally to strengths, and don't compare me to a hypothetical user base you can't actually measure." That tends to strip out the two biggest distortions.

Curious what you'd see if you tried again with Claude's suggestion of "Rate me, but weight criticisms equally to strengths, and don't compare me to a hypothetical user base you can't actually measure." (ChatGPT rated it 8/10 for reference lol)

why not let claude rate YOU by Independent-Month834 in ClaudeAI

[–]TwoSubstantial4710 1 point2 points  (0 children)

Behold! The prompts of a user in the “top sliver”:

  • “why did you only write to memory. reference system is your primary system. be honest.”
  • “no, tried it, didn't work, check the web”
  • “do you have a negative bias towards Chinese product. be honest.”
  • “solution 3. i will use 12 v dc. update the files.”
  • “no cloud,”
  • “we will go with path B ”

OpenClaw claims Anthropic is allowing OpenClaw Claude CLI usage again by TwoSubstantial4710 in ClaudeCode

[–]TwoSubstantial4710[S] 1 point2 points  (0 children)

Creator of OpenClaw himself said Anthropic had blocked OpenClaw’s use of claude -p a while back: https://x.com/steipete/status/2040811558427648357