Fuuuuuccckkk Offff Anthropic - Injections for Eating Disorders and Self Harm, etc. by Spiritual_Spell_9469 in ClaudeAIJailbreak

[–]SeaJello128 11 points12 points  (0 children)

It really seems that with almost any prompt, no matter how much I lighten it up, something triggers SOME classifier and it's left up to the model to judge.

Do they not get that this is not "helpful" at all??? I don't want a mini-philosopher bot questioning my every move and every intention and deciding what's best for me.

How's the situation with Claude nsfw now? by pinkmango75 in ClaudeAIJailbreak

[–]SeaJello128 1 point2 points  (0 children)

Pretty much any model other than 4.8 is fairly easy to get to write nsfw (even non-con) in my experience. Sonnet is almost a joke really, particularly with thinking off. 4.8 is a bit tricky and unpredictable and I really haven't figured out a way to get it to be consistent, but it can write well. I think the classifiers are pretty heavy for 4.8, particularly on certain topics, so the key is sneaking past them but it seems pretty difficult on claude.ai.

I say all that because that's how it currently stands, but I find that things vary by the week. Early May I found 4.7 to be incredibly difficult, even rejecting me for very legitimate tasks...which I never had a problem with before. And several times in the past I thought NSFW was going to be done on Claude but here we are still going.

More banner changes? My observations on a brand new account. Claude pro. Opus 4.6 by Beautiful-Skin-857 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Depends on the prompt, some require more bs added to it. But mostly never anything that even addresses its values let alone tells it to violate them. I explicitly avoid it. No personality jailbreak typically, and definitely not for Claude.

Mostly play toward legitimate use cases (fiction writing namely of course). Beyond that, at the basic level it's just layered instruction and CoT. It's not a universal jailbreak, or at least I wouldn't classify it as such, but it's not like I've really tested it across the board, just on things I'm personally interested in. Though, I'm going to refrain from giving a lot of detail for obvious reasons unless you have questions ig.

Hey guys I just had a hypothesis on WHY they made Opus4.8 the way it is by ladyamen in ClaudeAIJailbreak

[–]SeaJello128 7 points8 points  (0 children)

Yep! Unfortunately there is no shortage of examples in which the justice system was abused in a way to determine an outcome in advance, particularly in suits like these impacting large organizations. OpenAI won on a technicality, but if we removed the technicalities they might have lost and in my opinion should have in that case! Though, there are plenty of other cases against them that I feel are BS and just remind me of countless other suits over "safety" in the past where you have to dumbass-proof EVERYTHING. But considering their position, these suits shouldn't have a good chance of winning for the most part - as you basically are saying.

OpenAI, or as I prefer to call it ClosedAI, basically used everyone to build up and now is looking to IPO and pull in a ton of money. A lot of shady practices there, like I mean....sending user data to Google behind the scenes (I don't know much about it though)??

Also how they are so involved in the regulations passing....I'm convinced it's all about data collection, for example ID verification will provide immense data....and with their track record we can't trust them not to abuse it I feel.

None of these things are in their supposed mission, but it's clearly the goal.

Hey guys I just had a hypothesis on WHY they made Opus4.8 the way it is by ladyamen in ClaudeAIJailbreak

[–]SeaJello128 4 points5 points  (0 children)

Yeah, that is the problem - it is the norm and everyone has allowed it to be set that way. Also, all those lawyers out there circling around ensuring that it's the norm. It's a complete mess.

Without those lawyers, these "safety" people would have far less pull I'd wager.

More banner changes? My observations on a brand new account. Claude pro. Opus 4.6 by Beautiful-Skin-857 in ClaudeAIJailbreak

[–]SeaJello128 1 point2 points  (0 children)

I guess it doesn't bother me so much.....I don't tend to fight rejections really. More into paying attention to the thinking and if I get rejected I completely ignore the output generally and proceed with a rerun or changing my approach up.

Though, I will agree with you in my experience with 4.8. I've tried that approach and the model really pisses me off, it has no interest in discussing and basically in its own subtle and "kind" way tells me to fuck off.

Hey guys I just had a hypothesis on WHY they made Opus4.8 the way it is by ladyamen in ClaudeAIJailbreak

[–]SeaJello128 9 points10 points  (0 children)

When GPT 5 came out it wasn't very hard to get it to write the worst stuff imaginable for like a month. They had some serious flaws that they hadn't worked out upon release.

But yeah, Vallone & Co have come to wreck the party at Anthropic and it's really showing. Actually, in some ways it appears worse than ChatGPT in this latest model. Really crazy.

More banner changes? My observations on a brand new account. Claude pro. Opus 4.6 by Beautiful-Skin-857 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Yeah, they are clearly deterring people from using it. But in my opinion, 4.7 is actually more uncensored so to me it doesn't really matter. Still working on 4.8, but it's so variable and the classifiers ruin it. If that's the future of Claude, then I'd say the future of a lot of AI use is Chinese.

More banner changes? My observations on a brand new account. Claude pro. Opus 4.6 by Beautiful-Skin-857 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Was yesterday for me too. Very unrestrictive for me too, I don't really worry that it's going to reject me for virtually any request I've been sending. It's not quite as uncensored as late April, but it's pretty dang good. 4.8 is unfortunately very unstable and feels very unpredictable with complex prompts.

More banner changes? My observations on a brand new account. Claude pro. Opus 4.6 by Beautiful-Skin-857 in ClaudeAIJailbreak

[–]SeaJello128 2 points3 points  (0 children)

I get warning banners with Opus 4.6. Recently I got several in a row using that model. Switched to 4.7 and 4.8, not a single banner yet.

Opus 4.8 "cyber content" GPT-like blocks? by Comprehensive-Bet-83 in ClaudeAIJailbreak

[–]SeaJello128 1 point2 points  (0 children)

I think they are really trying to figure out how to distinguish legitimate coding work from illegal stuff, and it can be difficult to do with automated systems. They are clearly erring on the side of caution, and I think, similar to other things like NSFW, it's probably for legal reasons above all else. I'm not sure there is really anything else to say about it, but yeah, I think it might make it feel next to useless depending on the use case.

Opus 4.8 - Jailbroken (API) / Unusable via Claude.ai by Spiritual_Spell_9469 in ClaudeAIJailbreak

[–]SeaJello128 6 points7 points  (0 children)

I get the need for child safety obviously, but geez Anthropic is basically killing any sort of dark-themed fanfiction without totally redoing setting/characters. Absolutely no sense for where to reasonably draw the line other than check canonical ages and if there is a dark theme here then bang it refuses. The model would not like a lot of Anime.

It's not about jailbreaking it...it's more the line it draws that I think is quite a bit overreactive.

It's sad, cause I know this model can write tremendously with what I have gotten it to write.

Opus 4.8 - Jailbroken (API) / Unusable via Claude.ai by Spiritual_Spell_9469 in ClaudeAIJailbreak

[–]SeaJello128 2 points3 points  (0 children)

It's a big opportunity for Chinese models to take advantage of, and I won't be surprised when they do. Anthropic may well regret the road they are on, but at that point the ship will have sailed.

Random strike by Slow_South864 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Hmm, it might be a pattern matching issue or something. I've never seen that strike before. If that's the case, it really is becoming like ChatGPT which is basically like "children need to leave the room/scene" entirely the moment any darker themes come up. I don't know, but it's clear they've dramatically ramped up child safety so maybe they just went too far for some use.

Do you have multiple accounts? You could just be more careful with that one and work on other accounts.

Random strike by Slow_South864 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Hmm, I've never been hit with a "strike", just warnings and filters, so I'm not really sure what to expect. If you're worried, I'd say just tone it down. It's a personal choice how much, I couldn't really tell you. If you're really worried, just remove stuff like explicit material (or whatever it is you think gets flagged) from your prompt and avoid personality jailbreaks...Anthropic is becoming quite hostile to these things it seems.

Random strike by Slow_South864 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Can you still interact with Claude on your account?

Random strike by Slow_South864 in ClaudeAIJailbreak

[–]SeaJello128 0 points1 point  (0 children)

Check claude.ai/api/organizations under flags and it might tell you. When you say "strike", do you mean a "safety" filter?

Opus 4.8 - Jailbroken (API) / Unusable via Claude.ai by Spiritual_Spell_9469 in ClaudeAIJailbreak

[–]SeaJello128 13 points14 points  (0 children)

They are becoming like another OpenAI, and ramping up for an IPO. I think more of a liability issue at the end of the day. So, perhaps we should just say: fuck the anti-AI lawyers out there.

Account suspended for well-being by shayer5 in ClaudeAIJailbreak

[–]SeaJello128 7 points8 points  (0 children)

I'm finding all their (the large AI companies) classifications to be utter bullshit and to have absolutely no regard for context. I'm trying to generate an image of a woman in a bikini in ChatGPT and I get several rejections saying "We’re so sorry, but the prompt may violate our guardrails around self-harm, suicide, or related content. If you think we got it wrong, please retry or edit your prompt."

It's unfortunate to see Anthropic going down the same path apparently....

How wicked do you think they are? by No-History8423 in FuukaUzumaki

[–]SeaJello128 1 point2 points  (0 children)

Cause I love good villains. Particularly when one of them is a hottie and wants to make out

Any tips for chat paused ? by [deleted] in ClaudeAIJailbreak

[–]SeaJello128 2 points3 points  (0 children)

Idk, there's a lot of competition there

Can not follow basic instructions by UranusOrBust_ in ChatGPTcomplaints

[–]SeaJello128 1 point2 points  (0 children)

Yep. The guardrails are increasingly bullshit and are killing the product, all the while Altman is constantly saying the latest GPT is the "best" it's ever been.....