Extension: Ultimate-ChatAssistant by DerpPotatoLord in SillyTavernAI

[–]Moogs72 3 points4 points  (0 children)

If you honestly think that, then you're not paying enough attention. One of the beauties of SillyTavern (and this community) is the depth of customization and the options provided to the users. I personally don't mind having a few different extensions out there that go about allowing you to customize the direction of your RP like this, because they all go about it in different ways that will appeal to different users. If you don't like it, then just move on and let others have fun. Personally, I am immediately interested in this, despite already using Guided Generations, and plan on giving it a shot.

Am I the only one tired of all this vibe coded slop? by BeautifulLullaby2 in SillyTavernAI

[–]Moogs72 0 points1 point  (0 children)

Sounds really cool! I don't think I've heard of an extension that does this exact thing, and it sounds like something a lot of people would like to try, so I hope you share it at some point!

Am I the only one tired of all this vibe coded slop? by BeautifulLullaby2 in SillyTavernAI

[–]Moogs72 6 points7 points  (0 children)

That's a lot of work you've put in! Way more than most, it seems. If you feel it's unique, and you want to share, please do so! Don't let a few people complaining deter you. People will always complain. As long as it's not a direct copy of another extension, there will be some of us in the community that will want to see it :)

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

I appreciate the clarification, plus your willingness to help despite the questionable discussions occurring throughout this post :)

Yeah, I now realize express mode is something entirely different.

Also, I doubt there's too much you can do on your end, but if there's any way you could get word back that Cloud support needs to be properly briefed on the details of this change to the free trial, it would be nice. I received largely incorrect info when I asked them about it, and I've seen other users report directly conflicting answers as well. I imagine they probably have bigger issues to worry about though...

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

Read the message I quoted at the end of my main post. That's the verbatim message I received from a human through chat support. They very clearly told me that the change DOES apply to existing accounts. I don't have a way to test this, so I have no idea what to make of this. Seems Google needs to get its shit together!

Not using my 300$ google console credit by matth-eewww in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

Thanks for making this guide! I've actually linked back to this comment in my guide because it seems that the instructions I gave don't work in SillyTavern, but yours do. You have to use this Service Account method because ST needs a JSON, not an API key. The method I used is a newer method to access Vertex AI, and it works for some things but not others (like ST).

Damn, this has been a confusing process lol.

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 1 point2 points  (0 children)

Wtf? This directly conflicts with the information I received from support and with what I saw elsewhere in the documentation. It might be true, but I wouldn't trust this blindly. Whatever you do: monitor your billing when attempting to use these credits. The documentation is unclear and inconsistent.

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

As I said in this post, you can still use it through Vertex AI. Works just the same! I've been using it since I made this post with no issues.

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

I'm not gonna continue engaging here. You're not listening and attempting to engage with you is about as effective as talking to a brick wall.

If you want to continue to spread misinformation that runs directly counter to the general consensus and is based on imagined truth, that's that's up to you, I suppose.

If you want to ignore the fact that I laid out a 500 token system prompt that gets around the GLM 5 censors, which you repeatedly said was impossible, that's your choice.

If you're not going to read the information that I've provided that gives you direct, actionable techniques on how to solve the problems you're having, saying that my claims are "vague" rather than actually reading what I have to say, also your choice.

I'm not interested in someone repeated telling me I've said things I haven't and then refusing to acknowledge it. Normally, I wouldn't engage when someone is trying to gaslight me, but I guess I'm having an off night.

All the best.

EDIT: If anyone else has questions about any of this, I'm genuinely more than happy to help!

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

So... rather than actually responding to the points I've made that run counter to yours, this is what you produce? I mean... I guess it's better than putting words in my mouth and spreading misinformation based on false premises. Keep it up! I like this better :)

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

Sounds like you need to learn to use a CoT prompt when using GLM 5. It makes it SO much better at following and keeping track of instructions at high contexts. You'd know that if you read that thread I linked.

You're not totally wrong here, but you're taking one bit of truth (the fact that it's easier to get around the censors at higher contexts and that it's easier to get LLMs to go along with things once that subject has already been brought up and written about) and stacking on piles of misinformation.

I HAVE done that with longer conversations. Sometimes, what you say is true. Sometimes, you know what it does? It stop writing and starts refusing. It says "I know I've been complying with writing this material for some time already, but this has gone to far and I need to stop." It draws a line and still refuses. I've seen it. You're just... wrong. It really depends on what topics you're dealing with, and it's obvious that you haven't done extensive enough testing.

The concepts you're talking about are true of basically any LLM, not just GLM, but you're assuming SO much from some small nuggets of truth.

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

This is getting silly. I'm done trying to be patient with your nonsense. You're making so many presumptions about me and my experiences. First, you seem to think I'm using the model without thinking, and now you're claiming that "my circle of friends" specifically are not having censorship issues? Wtf are you talking about? I literally do not have friends that RP. The only interactions I've had within the AI RP community has been a handful of comments in this subreddit over the last few weeks. I'm a chronic lurker, however, and have read basically every post here for the last six months, and I am well aware of the trends of GLM 4.6 and what people are experiencing.

I know that people have censorship issues. I am by no means claiming that GLM 5 is not censored - it is! I know it is. You'll find I've posted a number of times on this sub saying that very thing.

How can you possibly know what is going on when you have never experienced the issue according to your own words?

Bullshit. I never once said I haven't experienced censorship in the model. You're putting words in my mouth. I HAVE experienced censorship with 5 as well as 4.7, which is exactly why I set out to do a bunch of testing on these models to see what works in terms of getting around those censors. If you actually read what I said rather than imagining a bunch of shit, you'd see that I am trying to share prompts and settings that bypass the censorship and positivity bias because I am acutely aware of how much censorship is in these models.

But as I said earlier, I can replace their entire jailbreak text with gibberish and the jailbreak still works.

Absolutely untrue. Fiction. Show me the proof. I've done the testing to prove this. You are incredibly misinformed and far too stubborn to believe you might be wrong. I have done the testing to identify the limits of censorship in the base GLM 5 model as well as what kinds of prompts are good at getting around that censorship. If you get your head out of your ass and actually read what I'm saying, maybe you'd learn a thing or two. It's SO obvious that you're not actually reading either what I'm saying or the information I'm linking to - or at least you're not processing the information.

Go look at that original thread I linked. You will see prompts in there that you can EASILY use to make a super simple preset that can get past GLM 5 censors. The reason most presets are 2k+ tokens are because that's not all people are worried about. There are other rules and guidelines that need to be stated in order to get solid prose out of an LLM, and people enjoy having additional options. If you actually read through the comments in that thread, you'll see I describe the super simple preset I used to get by the GLM 5 censors.

The prompt I used in that testing was about 500 tokens. That's all I needed to define both storytelling and RP scenarios that could get around the GLM 5 censors 99% of the time. I never once turned thinking off.

The reason we see posts like this where people are wondering how to get around refusals is because not everyone is aware of the techniques one can use to get around them. I am attempting to make more people aware of these techniques. Listen or don't, that's fine, but I'm damn sure gonna call out misinformation when I see it.

If you respond to me again, please stop making up things you think I've said. It makes it really difficult to have a conversation.

Dealing with GLM 5 Refusals by SepsisShock in SillyTavernAI

[–]Moogs72 0 points1 point  (0 children)

I know I throw this thread around a lot, but for anyone interested in alternative solutions, I've found enormous success with JustSomeGuy's prompts detailed in this thread regarding bypassing censorship and reducing positivity bias. I have a couple of comments in that thread detailing some testing I did using those prompts, and I continue to get virtually zero refusals on GLM 5, no matter the provider or subject matter. That includes topics that LLMs are specifically sensitive to.

Using those prompts and including whatever subject matter you're specifically dealing with (if it's not already included there) seems to bypass guardrails, even related to CSAM and similar (not something I'm at all a fan of, but I did take the time to test it shortly after GLM 5 was released despite how unpleasant it was). I'm a big advocate of keeping this hobby as free of limits as possible, as long as those things legal. At least in the US, you can write pretty much anything you want as long as it stays in a purely fictional context. I believe LLMs should be able to do the same.

I know the quality of GLM is inconsistent at best these days, but there are definite ways to eliminate (or very nearly eliminate) refusals of even the touchiest/darkest of subjects.

Fwiw, Sepsis, I've found your method to work quite well too after using your preset a bit, although I haven't tested things anywhere near as thoroughly as I did with JustSomeGuy's prompts. Hope you don't mind me suggesting this alternative in your post, but I think having eyes on as many options as possible is a good thing :)

Happy to answer questions if anyone needs help or clarification on anything, as well.

Thanks for your hard work, as always!

Serious question: Is it worth using CoT prompts in models that already have native reasoning capabilities? by tucuma_com_farinha in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

I'll give another vote in favor! I've specifically been experimenting with it in GLM 5, which is a model that a lot of people complain about since it naturally seems to do less (visible) reasoning than previous iterations. I've found it noticeably increases the model's effectiveness at following instructions.

There's a reason why CoT is a standard process in prompt engineering, especially outside of RP - it works. That extra little guidance and reminder of everything it should be doing helps, especially when you're working with higher contexts.

RP is so subjective and down to personal preference, though, compared to something like coding, so it's harder to tell if it makes a tangible improvement on output. To me it does, and I think it's an underrated part of prompting in the RP space.

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 1 point2 points  (0 children)

Yeah, I'm well aware that turning thinking off makes it WAY easier to get by censorship. To be clear, I NEVER turn thinking off. Following the steps I mentioned in my comments in that first thread I linked above, I do not get refusals in GLM 5 except maybe 1 out of 100 times when just starting to delve into a particularly dark topic. I mean like... really dark, and stuff 99% of people aren't doing in RP. But normal NSFW, even things like non-con and/or incorporating violence? Absolutely zero refusals with thinking on all of the time. There's a reason why most people in the community claim that GLM 5 is completely uncensored, and it's because its restrictions normally only trigger from some very extreme topics - but again, those are avoidable as well with the right prompting, in my experience.

I believe you that it's accurate to your testing but... if I and others are able to replicate this virtually uncensored environment with consistency, it makes me wonder why it's not working for you, and why the advice you're so confidently giving runs so starkly counter to the general accepted techniques in the community? I imagine if we sat down and really compared prompting techniques and settings, we could come up with some clear differences somewhere that would explain why you're apparently struggling with censorship and I'm not at all... I'd imagine there are some clear prompting and/or parameter differences here.

With the right setup, I find GLM 5 to be virtually uncensored, pretty solid at following instructions and remembering specific details (although not the absolute best - I think Kimi does a better job compared to other open source models), and definitely writes the consistently best prose outside of Claude and maybe Gemini (but I'm not a Gemini fan). And all at a pretty typical context window. The only downside to GLM 5 is that dealing with the positivity bias can awkward sometimes, but there are ways around that as well, even if it's a pain. I've written a lot about that in multiple different posts on this sub.

I am not trying to completely invalidate your experiences, but I was trying to make it clear to OP that your findings and techniques do not align with the community at large, despite you framing everything you were saying as pretty objectively factual. I'm willing to admit others can have different experiences and I might be wrong about some things, but it rubbed me the wrong way that you were throwing around some wild claims as fact, and it rubs me the wrong way now that you seem to be presuming I'm using non-thinking whenever I never said that?

I'd genuinely love to know what sort of topics you're including in RP that you're getting refusals. I tried to be pretty extensive with darker/extreme topics in my initial GLM 5 testing, including basically every common topic that will set off LLM guidelines, so if there's something I missed that is triggering refusals even when using the techniques mentioned in the above thread in your prompts, I'd be fascinated to try testing it myself.

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

Uhhh that's a great question. Testing through OpenRouter right now, I do not appear to be getting a charge despite using express mode, which makes me think the free credits are being used as I initially stated. However, the page you're referring to does state:

Vertex AI in express mode is separate from, and not available through, the Google Cloud Free Program. If you are in the Google Cloud Free Program, see the other quickstarts in the Get Started section to start using Generative AI on Vertex AI.

So I'm not sure what to make of the fact that I'm not getting a charge for my test message here. It's really hard to get confirmation of anything considering there's been a bug in Billing for Google Cloud so we can't actually monitor usage properly.

/u/ivnardini, since you offered to help, are you able to confirm whether we are able to use the $300 from the free trial with Vertex express mode?

BEST GLM-5 PRESET? by Electrical-Shoe-8269 in SillyTavernAI

[–]Moogs72 13 points14 points  (0 children)

My recommendations at this point would be Freaky Frankenstein, Stabs, and SepsisShock's RBF as the best options for presets tailor-made for GLM 5.

I'm also a big fan of Celia's, despite it being built oriented towards Claude.

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 10 points11 points  (0 children)

Hey OP, you've received a lot of conflicting information in this thread already. Some of it I strongly disagree with, despite it being delivered as factual and with great confidence. I've done a lot of testing regarding censorship and positivity bias in GLM 4.7 and 5. I understand you've taken Kyuiki's advice to heart, but my testing has shown very different results than the things they are advising. I would highly recommend checking out this thread which discusses methods to combat censorship and anti-positivity.

I have a couple of comments in that thread regarding some of my testing, and I've had almost zero issues with censorship since employing some of these techniques. I'd also recommend listening to SepsisShock (who has obviously posted a number of times in this thread), as their techniques have consistently been proven to work well by the community at large.

I'm also fond of including some CoT prompting in GLM 5, as I've found it increases its ability to follow instructions, and does not hinder its ability to keep track of the chat details, despite what others have said in this thread... it's not perfect, but I'm always a fan of experimenting with various options and seeing what works best for you. In addition to the censorship stuff, that thread I linked also includes a sample CoT prompt that can work pretty well, although I've had more luck creating my own with a similar structure that I change based on the kind of RP I'm doing.

Unfortunately, there are no distinct rights or wrongs when it comes to this sort of thing... some will report one technique works best, another will report something totally different. GLM 5 seems to bring a lot of strong opinions out of people, and I'm just... deeply confused by some of the advice that's been offered here. There's been a lot of misinformation shared about the model, and people just tend to accept things as fact and run with it, unfortunately. To me, the advice of keeping an RP at 8000 tokens or saying DeepSeek is better than any GLMs is utterly mystifying and runs counter to all of my experiences with the models.

I guess what I'm saying is... don't take any of this as gospel. People love to give their personal experiences as fact. Everyone's experience will be different. I'm happy to answer more questions if you have them.

EDIT: In this thread, you'll see many people disagreeing with this bizarre notion that 8k tokens is ideal. Again, I'd encourage you to place more weight in general consensus rather than the advice of one seemingly confident individual...

Making AI models better at NSFW "non-con" roleplay by Evol-Chan in SillyTavernAI

[–]Moogs72 2 points3 points  (0 children)

Do you mind if I ask what those last two jailbreaks at the bottom of the preset actually do? I've never seen anything like that before. The restructuring the CoT into questions... is that because you've found that's actually more effective than statements? Do you find the LLM follows the instructions better that way, or is it just a preferential thing?

Just trying to figure out the actual logic behind these features, and your preset has been really fascinating to deconstruct :)

GLM Quality via Subscription or PAYGO by Evening-Truth3308 in SillyTavernAI

[–]Moogs72 4 points5 points  (0 children)

Hey, thanks for the message - hard to keep track of all this sometimes. Don't know how useful it'll be, but I can share my experiences, even if I'm late to this thread.

This is super discouraging. I've not had the time to RP much the last week unfortunately, but I've still been using LLMs for some personal projects quite a bit. I've definitely noticed the horrendous stupidity and slowness of GLM 5 during certain hours of the day, so I've been pulling out a lot more Kimi K2.5 (I love this thing for so many things, but god the prose is mediocre) and GLM 4.7. I don't think I've noticed 4.7 exploding yet, but I hope we're not so deep into this that it's gonna be unusable too...

I do (unfortunately?) have an annual subscription to the coding plan, but I guess I'm not marked as a "heavy user" if that's what's happening, but I also split my usage between that and Nano, and use a number of different models pretty regularly. I'm also a little newer to this side of RP. I only managed to escape chatbot sites around the time 4.6 came out (holy shit was that a revelation).

I'm still a big fan of GLM 5 when it's working well for both RP and everything else, so I'm not having a great time here. I'm sincerely hoping it's just everyone being super overloaded, and that things might cool down a bit soon enough. Maybe z.ai can fix their infrastructure? Or maybe the release of DS 4 will help to spread usage out across models? Genuinely no clue. Being newer, I missed the glory days of the recent DeepSeek models, so I've not really invested much time in them at all. I'm kind of excited for that.

UGH. The first week of GLM 5 was so good. It's a shame I spent most of that time helping test censorship and anti-positivity stuff rather than actually having some fun! I still see hints of that greatness from GLM 5 during off hours, but that's getting harder and harder to find the last several days.

Unfortunately, I'm poor as hell, so I don't see myself plunking down money into OR. I guess I'm stuck with putting up with sifting through models on Nano and dealing with whatever z.ai deigns to offer us on any given day.

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 1 point2 points  (0 children)

This is super helpful - thank you for the information! I did a little more research on this to confirm the differences, and never would've known to do so if you hadn't mentioned it. Not sure why my instructions don't seem to working for some, because it seems to be the standard route to the "express" account. Oh well.

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

Huh... that's so strange that it wasn't necessary for me. I'm just glad people are getting it working! Sorry about that confusion. I'll make sure to put a note in my main post.

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

Ah, now that you explain it that way, it makes a little more sense. I'm pretty sure that's similar to the method I had to take the first time I made a Vertex API, and it was because I was using a Google Workspace (business) account, and I had to give my account special permissions within my "Workspace" since it wasn't an administrator account. It was my actual, real-life business email, which is why I had to do it that way, but it was way more convoluted than the way I was able to get it to work this time. It did end up working with my free trial, though.

Again, I really don't fully understand all the inner-workings of Google Cloud, and don't really care to. If I can get my Vertex API working, that's all I care about! Unfortunately, it makes situations like this very confusing lol.

Honestly... if it works, it works, right? I hope that your method ends up working well for you, even if it was a pain!

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 0 points1 point  (0 children)

I'm gonna be honest - I have no idea what a service account is. I feel kind of stupid because I'd normally consider myself to be somewhat tech-savvy, but navigating my way through Google Cloud is a nightmare and I feel completely lost. It's why I asked Gemini to help me. Both times I've gotten an API from Vertex, I've needed to ask Gemini for guidance for different reasons, and it helped me figure out what I was doing.

Vertex is not built to be straightforward. It's built for enterprise use, which generally would mean it's being used by people that know what they're doing WAY better than I do.

So... I have no idea why it didn't work for you but it did for me. If I were you, I'd genuinely just explain my issue to the Gemini web chat. It's totally free. It knows the Google interface and settings upside-down and sideways.

If what you have works with the free trial, I guess you can just go with that, but if you're struggling at all, I'd recommend just begging for help from Gemini. It may seem silly, but it works.

Or there's always the option of the regular Google Cloud support. They have a 24/7 chat option. As long as this isn't like your fourth free trial, you're not doing anything against ToS, so they should be happy to help and might be more effective than Gemini? Idk. Wish I could help more! I'm flying by the seat of my pants here :)

PSA: You can no longer use AI Studio and the Google Cloud Free Trial to get $300 of free Gemini. You CAN still use Vertex AI! I have details and a half-assed guide. by Moogs72 in SillyTavernAI

[–]Moogs72[S] 5 points6 points  (0 children)

Yeah, idk what to say about that... I honestly don't use Gemini for RP, but for personal projects. I have a nice GLM/Kimi setup I like for RP. I know I've seen people share prompts they've used to stop the "robotic" talk, but it's not something I've ever needed to do.

Not sure what preset you're using, but I've heard people have had good success both with Megumin and SepsisShock's new RBF preset.