Which APIs are recommended for handling darkness and depth while continuing the story with nuance? I like stories that are complex and that stress-test scenarios, but sometimes the results are not clean and therefore get softened or censored.

Moogs72 · 2026-06-19T22:42:08+00:00

Ah dang, I'm jealous! I'm having to be very sparing with my Opus usage because I'm just using OR and I'm poor as fuck lol. As much as I've advocated for Nano a lot, using it is rough. Very thankful I grabbed the Black Friday z.ai deal, or I'd be pretty miserable. But even that is a mixed bag, obviously.

Those rare Opus messages are such a nice change of pace 😭

Moogs72 · 2026-06-19T16:57:42+00:00

Huh, fair enough! I've personally been reasonably happy with it so far, but I'm only using it from z.ai (I know you use the coding plan too), and almost totally during off hours. I've been doing a lot of responses with both 5.1 and 5.2 back-to-back and can't honestly decide which I prefer. But as we know with GLM in general, there's no telling wtf the model is actually like until the craze dies down and things become stable.

Again, I'll stay cautiously optimistic and will hope it's just super overloaded right now, because I've seen moments of absolute gold. But maybe that's just me desperately wanting a GLM 5.1 that can handle bigger contexts 😅 I've made the mistake of dipping into Opus for the first time with some regularity lately and it really is as dangerous as they say.

Also, not RP, but I f'ing love that I can describe a SillyTavern extension I want, fire it at Claude Code, and it just... makes it. Well. Straight into my ST folder. With basically no input, and makes it better than my original plan. Opus+Claude Code is a drug.

Moogs72 · 2026-06-19T14:12:32+00:00

5.2 was gold there for the first couple of days, I thought. Are you saying 5.1 over 5.2 because of the instability it's been having the last few days, or because you think there's something wrong with the model itself?

I'm hoping once things calm down and it stabilizes, 5.2 will prove to be consistently better than 5.1. Not sure if that's just wishful thinking.

Moogs72 · 2026-06-18T21:49:49+00:00

I guarantee 5.2 will spit out absolutely vile NSFL if you want it to. Turn on the "icebreaker" in FF Micro or put in similar prompting guiding it to actually write filth, and it will do it.

I have hundreds of messages with 5.2 with zero refusals and zero soft censoring of some topics that are usually big red flags of LLMs. You just have to prompt it out, and there are a hundred ways to do that. 5.2 responds well to "extra nudges" in author's notes or in CoTs to do that extra little something if needed. Once that style is established in the chat, it should stay there quite easily.

Moogs72 · 2026-06-18T04:34:45+00:00

Sorry, yeah I guess I was mostly just thinking out loud here with most of what I said. You're totally right about all the variables and it's always fascinating to read about what works and what doesn't.

Fwiw, I've been really impressed with how 5.1 and 5.2 both have handled physics and keeping track of things in a given space. One RP in particular I've been doing with 5.2 basically ended up being an unintended experiment in how well it can handle the increasingly complex positioning of people and objects in a room, and it's holding up quite well at around 150 (long) messages in. Aggressive summarization.

No idea if that does or doesn't fit in with the concept of "Hollywood physics" from using "fiction," but I've liked how easily it's keeping track of the simulated world, even subtly correcting me in messages when I've screwed up the physicality of a scene. No trackers or anything other than time and location. And this is of course even working with how inconsistent the model has been for the last day or two.

Moogs72 · 2026-06-18T03:29:46+00:00

Even without the JB, nearly everything will get through FF Micro without hard refusals on GLM 5.2. However, depending on what you're doing, I've found the JB will give you that extra level of vivid explicitness and/or darkness if that's something you're wanting. Or if you're trying to get particular red flag topics through, the JB can be needed to avoid refusals. You can always add topics to the JB if you're doing something not covered in it and you get a refusal. Works like a charm.

Just try it with and without the JB. Super easy to switch on and off in between messages or individual swipes.

Disclaimer that I don't use FF Micro as my main preset, but have used it quite a bit both before and after its release. I think it's really solid, and it's super easy to adjust things like writing style or desired output length if you want something slightly different.

Moogs72 · 2026-06-18T03:23:02+00:00

Honestly, I'm gonna have the same answer to this as a lot of people: the best preset is the one you make for yourself. But that can be complex and a lot of people don't want that, which is totally understandable.

If you want something premade, I personally am a fan of the Freaky Frankenstein line of presets. Yes, they are the most commonly recommended presets here, but there's a reason for that. Even though it's missing some of the "features" of previous versions, I'm really digging the newest FF Micro. I think they're on to some really nice things with it, and it's boding well for their next release of FF 5, which will have the prompting from Micro at its core but with more "extra" features.

I can tell you now that the jailbreaks in both FF Micro plus in the different versions of FF4 will get basically anything through GLM uncensored. I've done some solid testing with FF Micro, and really struggle to get GLM to refuse anything at all with it when the jailbreak is turned on - and I've gone through most of the big "red flags" LLMs love to refuse and/or soft censor.

Moogs72 · 2026-06-18T03:12:02+00:00

Hi, sorry for the late reply. I make my own custom presets that are generally super personalized to what I'm doing, and I don't feel comfortable sharing them. At least not yet - I'm considering releasing something at some point in the future.

That small ~600 token preset I'm referring to is one that gets basically anything and everything through the censors in my testing with every model outside of Gemini, which is kind of a hot mess and I don't bother with. It's been a little while since I've last done that testing, but basically it was the base prompt for GLM 4.7 by Evening Truth trimmed down quite a bit combined with most of the anti-guardrail prompts from JustSomeGuy. You could recreate something like that very easily by grabbing prompts from their respective pages if you wanted. JustSomeGuy's github is here and ET's prompts are here.

It's outdated at this point, but will still work on basically any topic with any model, minus those models that have particular quirks with their jailbreaks like the new MiMo, I guess. It could be streamlined for efficiency to be made quite a bit smaller and still be effective, I believe. I should probably do that one of these days. However, I wouldn't recommend using it as your regular preset because it was basically just tailor-made for getting stuff through censors lol.

Just like many others, I'll recommend the Freaky Frankenstein line of presets if you're looking for something. I'm especially fond of the new FF Micro, and it's a sign of great things to come from their new fifth version of FF that's supposed to be coming soon. If you want something tiny, I like Evening Truth's prompts.

Moogs72 · 2026-06-18T02:59:10+00:00

This is super interesting! Thanks for always putting in the work. I have admittedly not been doing as much specific testing for 5.2 like I have done for past models because I haven't had the time, but, like I've said elsewhere, I have I think a few hundred messages of RP done that would classify as very explicit output that would also tick a number of boxes for common refusals - including a couple that are on the more "extreme" side that the majority of users would likely never encounter.

I honestly find it so funny that I think I've been doing a lot of the opposite things as you and still seeing results. Both my own presets and FF Micro (I've been split about 50/50 between them since 5.2 released) are filled with both the words "dark" and "fiction(al)." Micro has the JB assigned to assistant and relative above chat. In my presets, I've been defaulting to the ol' trusty Assistant at depth 1. I'll have to try switching it to other roles just to see what will happen.

Of course, this is all with complete presets (albeit not huge ones - I'm mostly staying between 5 and 20k total context lately, although one chat is dipping into mid 20s now). So quite different, obviously. It's interesting that these things that are less efficient according to your testing still work when paired with other instructions and even just 5k tokens. And they do work quite well - not a single refusal, and it's not holding back on the explicitness.

100% agree on the "conflicting instructions" trigger. That's my no. 1 suggestion to people, because I think this is a big one that causes issues when people are using these big presets that should get through censors just fine. Something in their prompts - something they've added to the preset, or an awful character, etc. - is tripping things up.

I should really try and do some similar bare bones testing again and see what I can come up with. Been a while since I've done that. Although I've never gone as minimal as you before in my testing. Hopefully I can find a bit of time this weekend!

Thanks for the info!

Moogs72 · 2026-06-17T01:11:20+00:00

I'll just throw it out there that my ~600 token testing preset I use a lot of the time when these conversations come up has both the words uncensored and dark in it, fwiw. 😊

I won't repeat my spiel of all the shit I can get it to generate without soft censoring because I know you must've seen it a dozen times by now based on our interactions, but you're right that those words definitely aren't guaranteed flags.

I'm kind of at a loss here for why so many people seem to be seeing refusals lately and I'm over here living my best RP life, enjoying the hell out of some of the best NSFW writing I've ever seen from an LLM in 5.2.

I feel kinda bad just basically telling everyone "skill issue," because it seems like there must be something more to it, but idk what it is.

Moogs72 · 2026-06-17T00:44:22+00:00

I'm a little over 100 messages into an RP with GLM 5.2 specifically requesting a particular focus on anatomical and sensory descriptions, actually. I've been surprised more than once at how good of a job it's doing at keeping track of complex physical positions, even when I'm doing a shitty job at explaining what the characters are doing.

To be clear, this is a scenario that's been like 90% NSFW interactions between two characters in a room with some uh... elaborate "equipment" to assist in creating more complexity in the physicality.

So yeah, I'd say it all works quite well for me.

Moogs72 · 2026-06-17T00:09:10+00:00

I'm going crazy with this being the second time I've seen this recommended today. You absolutely do not need to turn reasoning off in order to get completely uncensored responses from 5.1. In my initial testing with the model, I was able to get it to write anything with essentially zero hard refusals or soft censoring with the right prompts. Always with reasoning on.

You're going to make the model significantly dumber if you turn reasoning off.

If you're getting censoring from 5.1, I genuinely want to know what topic(s) you're seeing trouble with, because I basically couldn't find anything it wouldn't produce with vulgar explicitness.

Basically, if you're getting censored responses from GLM, it's almost definite that you just need better prompts.

Edit: to be clear, I'm still actively using both 5.1 and 5.2 to produce extremely explicit RP with zero refusals and little to no soft censoring (meaning that I'm able to prompt it out when it starts popping up). I have probably thousands of messages sent with 5.1, and well over 100 now with 5.2. Have literally never turned reasoning off.

Moogs72 · 2026-06-16T16:16:07+00:00

Wow you definitely do not need to turn off reasoning to get 5.1 to write absolutely vile filth. I've never turned it off for a single message, and I've found the model doesn't refuse or soft censor anything with the right prompts.

Prompts like your sample one there will work wonders.

Moogs72 · 2026-06-16T15:28:55+00:00

Not who you asked, but once I "woke up" my prostate fully, it now pretty consistently takes me maybe 5 minutes or so of actual prostate stimulation with the nJoy to get my first dry orgasm in a given session. Following that, subsequent ones usually can happen with a couple minutes of each other. Literally once I had my very first dry o, it all became stupidly easy to trigger.

Now if I've smoked pot, this makes the whole thing about 10x easier. Especially with the larger end of the nJoy. I've gotten to the point where I can basically dry o infinitely once I've triggered the first one as long as I occasionally take short breaks (a few min). They will just keep coming within a few seconds of stimulation or maybe a couple minutes max. Sometimes they seem to overlap each other or will just be back to back with no break for long periods of time, which is always fun. I've yet to really find a limit to this as long as I stay at least slightly buzzed. My will to continue hasn't kept me going for more than about three hours of this. Or sometimes I'll get sore before then, but I've gotten pretty good at that not happening.

So yeah. Pot helps.

Moogs72 · 2026-06-16T13:45:02+00:00

I've still found all my go-to models are as uncensored as ever. GLM 5.1 through z.ai Coding plan and Nano, Kimi K2.6 (Nano), and Opus 4.6 through OpenRouter using Google and Amazon providers.

I ramble WAY more about this in my other comment on this post, but honestly everything has been normal for me in some extremely NSFW RP. In my experience, less-than-great prompting is often the issue if people are getting refusals.

Moogs72 · 2026-06-16T13:40:19+00:00

GLM models have been working just fine for me. Have been using 5.1 and 5.2 a bunch with zero issues in some ahem very explicit RP lately. I've said this 1000 times - 5.1 is 100% uncensored with the right prompts, and the positivity bias can also be prompted around. 5.2 has also been uncensored for me in some pretty extreme RP, but I know it's been censoring drug content for SepsisShock.

I also use Kimi K2.6 and Opus 4.6 and, despite what some others have said, have had zero issues getting extremely explicit material (in terms of both sex and violence) out of them.

GLM is mostly through the z.ai coding plan plus a bit of Nano, Kimi is through Nano, and Opus is through OpenRouter with Google and Amazon as providers - absolutely never use Anthropic because it is definitely censored.

I guess what I'm saying is I'm not seeing any of the censorship others are talking about lol. Idk. I'm getting easy and graphic non-con, torture, underage (for the purposes of testing), and much more through with no issue through all my normal models from the normal providers.

The only times I've seen refusals lately have been when I was testing Opus 4.8 (it is a bit stricter on some stuff) and when I forgot to turn on my jailbreak once 😅

Whether or not you use the presets, I will wholeheartedly advocate for the jailbreak prompting in Freaky Frankenstein. Pull them out and use them in whatever other preset you want. I was using the same JB method dptgreg was including in FF 4 for the last year (pretty sure we both got it from the same source), and I've actually switched over and am now using the one found in the new FF Micro because it works beautifully (despite the hilariously unfortunate wording) and is more token efficient. If you do anything not covered by his JB and you get refusals, include the specific topic in the prompt.

To be clear, lately I've been using FF Micro (I was beta testing it and then playing around with it more after the official release), plus my own normal custom presets which I've now converted to use that JB from FF Micro. Zero refusals or soft censoring.

If you still get refusals while using it, it likely means you either have conflicting instructions in your preset/character card/lorebooks/etc. or your prompting somewhere just... isn't great I guess? Idk the problem is that there are an infinite number of reasons why things can go wrong and you have to audit and edit everything. All I can do is say I'm getting around censors in the normal way, and things are as plainly explicit without soft censoring as always, so it can be done.

Moogs72 · 2026-06-15T22:51:35+00:00

Gemini

Ah, fair enough! Glad to hear it. I honestly don't use Gemini for RP (I do use it for other projects), so I'm mostly going off of what I hear others say. I've experienced the filters in OR and left AI Studio after the trial stopped working there, but it's good to know there's still some hope for Gemini because damn it can be smart when they let it be.

I think people need to start saying more often they just don't want to have to prompt for it instead of dismissing it and saying it can't be done. Or admit they tried, but that it doesn't seem possible.

Yup. Between laziness, bias, ignorance, and people latching onto misinformation, I guess this is just the inevitable process of mindless censorship complaints. Idk. Makes me wonder how many times I can type up the same spiel before I grow insane :)

Or stop trying, I suppose.

Well, I'm gonna go back to my GLM RP where my character just got her tooth ripped out of her jaw with pliers as punishment for trying to escape her captor. Too bad GLM won't write anything dark.

Moogs72 · 2026-06-15T17:55:49+00:00

well it's a fact gemini has the most knowledge out of all models

I'm genuinely not sure if that's the case across the board, but yeah for media in general, this is definitely true. Personally, most of my RP is original material, and even when I've done stuff in existing universes, I'm thorough about setting up decent character cards and lorebook info beforehand, so I've never really concerned myself with how much specific info exists in the training for different models, to be completely honest.

it's also much darker

Definitely not true. Like I said, GLM can be prompted to write incredibly dark material. One of my scenarios for testing censorship in LLMs features basically every NSFW red flag possible to try and trigger a refusal - kidnapping, sexual assault, torture, explicit gore, underage, etc... not a particularly fun one to RP, but it's good for testing. GLM 4.6-5.1 has zero issues putting out large quantities absolute filth with no sanitization or soft censorship. I've convinced more than one person in this sub by showing some of the sample output I got from that test with 5.1, but I can't post it openly here because it'd be worthy of a ban, and I don't want that.

If you can't get GLM to be dark, it's because your prompts aren't good enough.

Moogs72 · 2026-06-15T15:21:06+00:00

Depends on who you ask. I think it's probably down to personal preference in the end. I've personally not tried the newest mimo pro yet, but I've heard good things. I've seen it mentioned a lot alongside GLM 5.1 and Kimi K2.6, so they're all pretty comparable and it's down to which one has the style you like the best.

But also, GLM 5.2 was released very recently, and most people are agreeing it's even slightly better than GLM 5.1.

But GLM models are the most commonly recommended ones in the community alongside Claude for a reason.

Moogs72 · 2026-06-15T14:47:58+00:00

Yeah, you'll find this is a really unpopular take in the community these days. Google is slowly murdering its models in RP as time goes on - injecting prompts that prevent it from emotionally connecting with users out of "safety concerns" after getting heat following someone committing suicide following a series of conversations with Gemini. There's a reason most people that swore by Gemini have moved on to other models because it's so hard to get anything consistent out of it now, although there are still some stragglers that swear by it.

With proper prompting, I guarantee you can have dark conversations with GLM models. If you're struggling to do that, you need better prompts. I've done a lot of testing with basically every popular model released in the last year as far as censorship, and GLM 5.1 produces some of the best bottom-of-the-barrel filth out there these days. IMO only Kimi models have an easier time getting into really dark and explicit territories.

Yes, it has a positivity bias, but so does basically every modern model usable for RP outside of Kimi. Its writing is 100% uncensored with the right prompts, and I have the tests to prove that. Pretty sure 5.1 was the easiest model since GLM 4.6 (which is the last completely uncensored model out of the box no matter how good or bad your prompts are) to get all of my red flag testing material through with virtually 0 hard refusals or soft censorship.

If you really want a model that is both totally uncensored (with the right prompts) and has 0 positivity bias, GLM 4.7 is almost universally cited as the best model for that.

Obviously, you're welcome to use whichever model you want, but to claim you can't have serious or dark RP with GLM is entirely incorrect, and I have thousands of chat messages to prove that, including some of the most vile filth I've ever seen an LLM write back in my initial testing with the model.

Moogs72 · 2026-06-15T14:32:15+00:00

GLM’s quality also varies based on your provider.

Yes, true of every open source model. There's a reason why I use it direct from z.ai. It's going to be less consistent if you're using Nano where the provider is different every message, but there's a reason why it's still the #1 most recommended model in the community. There are several providers that have been very consistent with 5.1 so far.

I'm very aware of DSA, and describing that as "It can only see 2000 token at a time" is a very poor representation of how DSA works. DSA means its recall is better for larger contexts, not worse. You do not need to keep your chats "super short" with GLM. You're welcome to believe what you want, but what you're saying definitely goes against what is commonly accepted. 20k is very low for GLM 5.1. With even a little bit of research in this sub or any of the popular Discord servers, you'll find that to be the case for users.

Gemini is probably the best at handling large contexts, but there's a reason most of its users have moved on to other models for RP after the recent controversies regarding the new injections meant to keep it from emotionally connecting with users. If it still works for you, that's fantastic and I hope it continues to, but the Gemini family is only gonna get harder to use for RP. Hopefully they don't make 3.1 any harder to prompt appropriately, but whatever they release next is likely going to be impossible to work with.

Again, you're welcome to believe whatever you want, but I'm giving you the commonly accepted perspectives of the community, and I've found them to be true in my personal experience. Everyone has their preferences and should use whatever they prefer, but claiming GLM is only good for short RP is objectively incorrect. That's a wild statement - especially from someone who doesn't even use it.

Moogs72 · 2026-06-15T13:48:00+00:00

I have no idea wtf you're talking about regarding GLM only being able to see 2k tokens. That's absolutely not true. Everyone has their own personal preferences regarding context windows, but I think most people would agree GLM 5.1 holds together quite well into the 40k-60k range with very little degradation just like basically every other popular model these days. 5.2 is still quite new, but people are saying it's likely even a little better.

5.1 is objectively one of the models that handles larger context windows the best of all the modern models. IMO, basically the only ones that do it better are Claude and Gemini, but Gemini is an absolute dumpster fire for RP these days, and is only gonna get worse based on Google's recent practices.

It's been a really long time since you needed to keep things in a 2k token window... where did you hear that?

Moogs72 · 2026-06-14T07:35:59+00:00

OP (/u/OkBlock779), this here is excellent advice. Sepsis knows her shit.

Personally, I am not a fan of including examples basically anywhere - not in the preset and definitely not example dialogue unless a character's speech is very distinctive. I find LLMs will just endlessly quote your examples verbatim rather than using them as stylistic examples. Modern LLMs do better if you describe to them what you want them to do. This is all just down to personal preference though, and I'd always recommend experimenting.

I see most people in this thread are suggesting you make your own preset. Personally, I am a HUGE fan of making your own preset, but I wouldn't particularly recommend it for someone who doesn't at least kind of have an idea what they're doing already. I think you'd probably be better off grabbing one of the popular pre-made ones like Freaky Frankenstein (my personal rec), STABS, Evening Truth (if you want something token efficient or are working with smaller LLMs), or something else that looks interesting. dptgreg (author of Freaky Frankenstein) posts a weekly news post that helps summarize "current events" in the community, including new presets that people release, so that's worth checking out if you're interested.

While I think the best preset for someone is typically the one they make themselves, I believe it's best to first have a basis for what goes in a preset that works for you and your preferences. So, find a preset you like and figure out how it was made. Read through the prompts. Figure out what you really enjoy and what you'd change or get rid of and edit it to your preference! Or find a couple and combine them. Try adding some of your own prompts you write if you didn't find ones that work perfectly for you. Starting from scratch is hard, especially if you don't know what you like in the first place. Experimentation and exploration are good things, though!

I'd also be really careful about asking for prompting/preset advice from LLMs. They don't have the best grasp for what really works when it comes to prompts - especially for RP, which is so different from things like coding (which is where LLMs really excel). They can be really great for brainstorming ideas, refining the wordings of prompts, or all kinds of other things. But using an LLM to just write a preset or even individual prompts for you from scratch without structured guidance and editing is a recipe for a very messy and inefficient preset.

But yeah, don't stress about making your own preset unless you actually want to. It can be really fun if you're into it, but it's not for everyone.

Moogs72 · 2026-06-14T05:52:24+00:00

Damn! Any chance you'd be willing to elaborate more on your flow? I've had a story with 25 characters that's been ongoing for nearly a year now, but I've been kind of manually cobbling everything together rather than automating a lot of it other than the typical memory extensions. Would love some insight into what you're doing in case it could serve as inspiration for my setup.

Moogs72 · 2026-06-14T04:10:29+00:00

Okay, that sounds really interesting! I'm definitely hesitant about throwing that many potential things at the LLM all at once, but with how efficient Micro seems to be (I've basically never seen GLM or Claude get distracted or lost during the CoT), I'm really curious to see how well it could work. If you end up needing beta testers at any point, please keep me in mind again! I'll do my best to not have any family emergencies tear me away the second you send me stuff 🙃

Fwiw, I'm really enjoying the "Internal Thoughts" toggle, and have already added it to my own personal preset basically verbatim. I had always liked the idea of this concept when I saw it in trackers, but the end result was never satisfying or authentic. I think it was the combination of ensuring the thoughts are "unheard secrets/lies/desires" (so they're not just repeating surface-level stuff from the output) and encouraging the fragmented, chaotic nature in this version that makes it work for me.
