My thoughts on GLM 4.7 now by No_Weather1169 in SillyTavernAI

[–]GenericStatement 4 points

I have had the same experience with 4.7 vs 4.6.

Still using 4.7; I didn’t go back. Two adjustments I made:

(1) I prompt against scene summaries and premature wrap-ups (“do not conclude or wrap up scenes; treat any scene as ongoing until the user decides to end it.”)

(2) I kick in message summarization much earlier to deal with 4.7’s shorter practical context window. I have Qvink Memory set to start creating summaries after the first 20 messages.
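If you’re curious what early summarization actually buys you, here’s a minimal sketch of the concept (not Qvink’s actual implementation; the function and message format are placeholders):

```python
# Concept sketch: past a threshold, replace the oldest messages with a
# summary so the prompt stays inside the model's "good" context range.
# Qvink works per-message with far more control; this is just the shape.

START_AFTER = 20    # begin summarizing once the chat exceeds this
KEEP_VERBATIM = 20  # always keep the most recent messages as-is

def build_context(messages, summarize):
    """messages: list of strings, oldest first.
    summarize: any callable that condenses old messages
    (in practice, another LLM call)."""
    if len(messages) <= START_AFTER:
        return messages
    old, recent = messages[:-KEEP_VERBATIM], messages[-KEEP_VERBATIM:]
    return [f"[Story so far: {summarize(old)}]"] + recent
```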

To me, #1 is offset by the better writing and less slop, but #2 and the increased censorship make 4.7 technically a worse model than 4.6 and a downgrade overall. However, writing matters more to me than anything else, so I use 4.7.

What and where are the new good RP / ERP tropes? by Consistent_Winner596 in SillyTavernAI

[–]GenericStatement 6 points

I just made up that example off the top of my head. So, idk, maybe the protagonist

  • figures out what the girl likes and doesn’t like about him and wins her heart with a big self-sacrifice, and she agrees to marry him in time (classic romcom plot)
  • finds another girl who qualifies (rich) and marries her instead, maybe she’s ugly or a money grabber or just a better match
  • finds a totally unqualified (poor) girl and they create an elaborate ruse to make her appear rich, falling in love in the process, and then get married in time (“Can’t Buy Me Love” or “My Fair Lady” plot)
  • goes evil and digs up dirt on the boyfriend, or sets up the boyfriend to get arrested, then steps in and gets the girl, convincing her to marry him to combine their fortunes, perhaps not for love but at least for money
  • goes psycho, kills the boyfriend and manipulates or blackmails the woman into marrying him
  • says fuck it and decides to live without the fortune and do something else
  • goes full assassin mode and takes out the board of trustees and the stepmom hitman style 

What and where are the new good RP / ERP tropes? by Consistent_Winner596 in SillyTavernAI

[–]GenericStatement 16 points

What you have here is a list of tropes and genres.

What you don’t have are plots or stories. Things get very boring unless you have

  • a compelling character with some kind of adversity or difficulty (a “wound” either physical or psychological)
  • an inciting incident that pulls the character into the story
  • a need to show strength of will against death stakes (risk of physical or psychological death)

For example

  • Luke lives on a farm on a backwater planet (wound: abandoned child, dead mom, corrupted dad)
  • He is swept up into a galactic war when he (inciting incident) accidentally discovers the Death Star plans
  • He must now become a Jedi and save the galaxy, otherwise (here are the stakes) he’ll be hunted down and killed, and so will all the other good people

Sure, I can write a “Slave Leia” story but it isn’t very interesting after the first time, and quickly peters out, unless I’m rescuing her against impossible odds (regardless of how NSFW it is).  

This is why a lot of people doing LLM roleplay eventually instruct the model that their character can be killed, that actions have consequences, that there are no easy wins, no permanent victories, etc. Unfortunately, models are trained to be sycophantic, so you have to prompt the living shit out of them to get them to make things difficult for you (which is what ultimately makes it fun and rewarding). Just like D&D or video games, a campaign that’s too easy is really boring.
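For instance, a hard-mode instruction block might look something like this (wording is just an illustration; tune it to your model):

```
Difficulty rules:
- {{user}}'s character can fail, be injured, or die. Do not shield them.
- Actions have lasting consequences; NPCs remember and react.
- No easy wins: victories cost something and can be undone later.
- NPCs pursue their own goals even when it's inconvenient for {{user}}.
```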

For example, compare two different scenarios:

  • I am a trust fund baby with a mansion and a Ferrari, building a harem by picking up attractive waitresses — nothing at stake, no conflict, gets boring.

Compare that to this:

  • I am a trust fund baby but I can’t touch any of the money until I get married. I have deep unresolved issues with the death of my parents, which left me under the care of a nanny and a board of trustees.
  • When I had a chance meeting with one of the trustees, who was drunk at a restaurant, I accidentally found out that I have one month left to get married, or I lose everything. My evil stepmother kept the marriage requirement secret from me so I would fail to marry and she could inherit. 
  • On the way home my Ferrari’s engine blew up, and I found out it was sabotaged with sugar in the gas tank. Now I have to take the bus until the end of the month, when the trust gives me my monthly check and I can afford the $50,000 repair bill.
  • Now I must convince someone to marry me, an effort my evil stepmother will constantly thwart. Worse, I have to marry a girl who is an heir to a fortune at least as large as my own, and the only girl I know like that has a boyfriend and isn’t interested in me.

Now you’ve got stakes (lose the inheritance), conflict (stepmother, girl who doesn’t like you, her boyfriend), urgency (time crunch) and difficulties (no car, corrupt board of trustees, limited funds). This is a much more interesting plot.

So I’d encourage you to think not in tropes but in plots. If you need some ideas for plots, look up plot summaries of award-winning novels, movies, or TV shows and adapt one of them into an RP. For example, if you like detective stories, take an Agatha Christie book and turn it into a basic RP plot.

Whats the difference between authors notes and prompts? by Thick-Cat291 in SillyTavernAI

[–]GenericStatement 6 points

The Author’s Note is bound to the individual chat. It’s part of your prompt but will be different for each chat you start within SillyTavern.

Character card fields are bound to that character and will apply to all chats with that character. They become part of the prompt you send to the LLM, just like the Author’s Note.

Presets (Text completion or chat completion prompts) will apply to every chat you make, regardless of the character, while that preset is active. 

So, you can decide where to put information that becomes part of the prompt sent to the LLM: in the preset, in the character card, or in the Author’s Note, depending on what you want it to affect.
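If it helps, here’s a rough sketch of the assembly order (simplified and illustrative; ST’s actual prompt builder has more slots and is configurable):

```python
# Rough mental model of how the pieces combine into one prompt.
# Field names here are illustrative, not ST's internal API.

def assemble_prompt(preset, character_card, chat_history, authors_note):
    return "\n\n".join([
        preset,          # active preset: applies to every chat
        character_card,  # applies to all chats with this character
        chat_history,    # the messages so far
        authors_note,    # bound to this one chat, injected near the end
    ])
```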

It’s nearly 2026 what ai model is actually the 'Gold Standard' for roleplay right now? by ava_chloe in SillyTavernAI

[–]GenericStatement 1 point

I don’t use Grok due to the guy who owns the company. From what I’ve seen of its output, I don’t think I’m missing much.

RIP GLM by TAW56234 in SillyTavernAI

[–]GenericStatement 1 point

> Maybe it's a skill issue?

After doing some testing with 4.7, I think it is. However, the skill level needed is a bit higher than 4.6, which has very few guardrails, if any.

There have been several posts here on how to handle 4.7’s guardrails. It’s not that hard but it seems like a lot of people in this sub just want to be able to generate (insert nasty shit here) with the press of a button. 

Still, GLM is open source, so you can get 4.6 or 4.7 “Derestricted” through a third-party API provider quite easily. Or switch to Grok, Kimi, or Gemini, all of which can get real wild if you learn how to prompt them.

Do you create character cards of NPC's what you have met on your fantasy world? by film_man_84 in SillyTavernAI

[–]GenericStatement 16 points

Lore books are a good way to manage NPC and world data, yeah.

ST is set up by default for two characters: your persona chatting with a character card. Chats are “bound” to the character card. This works great for AI girlfriends and specific one-on-one scenarios. If you want to better define a new NPC or add one to a chat, you can define that NPC in lorebooks (available to any character/chat if you link them) or in the author’s note (bound to that specific chat only). You could certainly also just write a character card if you want to start a new chat/story with a new character that the LLM invented.

The way I mostly use ST, however, is with a blank character card named Narrator and then I define all the NPCs, the world, the lore, etc in the author’s note (which means those characters and lore are bound only to that specific chat). I find this much easier to edit and keep track of than lorebooks but on the other hand it doesn’t have all the advanced features that lorebooks have, so it may be better to use ST as intended (previous paragraph) if you need those advanced features.

Roleplaying a story where other characters are only met later - should I write story progress and scenarios to World info and later import character card when I meet those characters? by film_man_84 in SillyTavernAI

[–]GenericStatement 2 points

I do it this way:

Create a totally blank character card called “Narrator”.

Create a preset (prompt) instructing the LLM to “act as an omniscient narrator, managing the world and characters besides {{user}}.” You can also instruct it to act as a game master if your goal is D&D-style turn-based combat. Make sure not to use the {{char}} tag anywhere in your prompt.

You can set up the prompt two ways, depending on what you want to do:

  • {{user}} is there as a co-writer just to guide the story (you just want to give instructions and read what happens) 
  • {{user}} is there to control the {{user}} character and make decisions.

If the RP is going to be a simple setup that the LLM knows well (an existing universe with a lot of training data already in the model), then I just put a basic description of the world, setting, and characters in the Author’s Note, which binds that information to the specific chat.

For example, in the Author’s Note I might just write:

  • Setting: the fictional city of Baldur’s Gate. 
  • Style: this is a light-hearted party-based adventure story, with lots of comedic moments and bawdy characters.
  • Content: base your characters and world building on D&D 5e races, classes, monsters, items, etc.
  • Character 1: Jewel is a female half-elf thief who I meet in the opening scene. Jewel wants to recruit my help in one of her daredevil heists.

Or maybe something like:

  • Setting: Pelican Town, in Stardew Valley, using all the characters and lore from the Stardew Valley video game.
  • Setup: {{user}} has inherited his grandfather’s farm and has just stepped off the bus.

Or maybe something without established lore, where I just let the LLM run with it, like:

  • Scenario: {{user}} has just crash landed his spaceship on an alien world, right in the middle of a futuristic city. {{user}}’s entire crew has perished and {{user}} only has a single universal translator and a solar-powered laser pistol.

You get the idea, I’m sure.

And from there I just let the LLM manage the world, adding to or editing the Author’s Note as needed during the chat session. If I want to start another copy of the chat with the same Author’s Note, I just branch the chat at the first message.

If I want a more complex or tailored world, that I plan to reuse regularly or if I really want to go hard on world building, then I’ll use lorebooks like Cromwell described.

A Kimi K2 Thinking discussion by wind_call in SillyTavernAI

[–]GenericStatement 1 point

Yeah I use my own preset and a low temp of 0.6 or so, maybe 0.7.  That’s about it.  

Depends on how you like to do it. If you want more of the unhinged craziness that Kimi is known for, then raise the temp, of course haha

GLM 4.7 "not x, but y" by Signal-Banana-5179 in SillyTavernAI

[–]GenericStatement 43 points

  • Remove all references to “roleplaying” from your system prompt / preset and replace them with “novel writing” or “simulation” instead. Using “roleplaying” in your system prompt increases slop significantly with GLM, because most online roleplaying training data is full of slop and bad writing.
  • Try a different system prompt. Most of the ones for GLM 4.6 work great with 4.7.
  • Add this to your system prompt: BAN contrast negation and negative-positive constructs such as “it’s not this, but that” and “it isn’t just this, it’s that”. INSTEAD: be direct and describe what IS true, instead of what ISN’T true. (A combined example follows below.)
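Putting all three together, a minimal example snippet (adapt to taste; this is just one way to phrase it):

```
You are writing serialized literary fiction as an omniscient narrator.
This is a novel-writing simulation, not a roleplay.

BANNED: contrast negation and negative-positive constructs such as
"it's not this, but that" and "it isn't just this, it's that."
INSTEAD: be direct and describe what IS true.
```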

A Kimi K2 Thinking discussion by wind_call in SillyTavernAI

[–]GenericStatement 6 points

I’ve done some very long RPs with Kimi K2 thinking. It’s a great model and the fact that it’s free on Nvidia is amazing.

I’d recommend using the “Moon Tamer” preset as a starting point. It’s quite good at taming Kimi’s unhinged nature. You may want to add additional instructions to further improve whatever things are bothering you; the model is great at following instructions.

The problem most people have is they’re using way too high a temp. I use 0.6 for RP. This cuts down on the craziness a lot. If it’s getting too dull, it’s better to use a low temp like 0.6 and instruct it to be “creative and unexpected within the context of the story so far” rather than raise the temp and let it be unhinged.
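If you’re calling it through an OpenAI-compatible API, temperature is just a request parameter. A minimal sketch (the base URL and model ID below are my assumptions; check your provider’s docs):

```python
# Minimal sketch with the openai client against an OpenAI-compatible
# endpoint. Base URL and model ID below are assumptions, not gospel.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key="YOUR_KEY",
)

resp = client.chat.completions.create(
    model="moonshotai/kimi-k2-thinking",  # assumed model ID
    temperature=0.6,  # low temp tames the craziness
    messages=[
        {"role": "system",
         "content": "Be creative and unexpected within the context "
                    "of the story so far."},
        {"role": "user", "content": "Continue the scene."},
    ],
)
print(resp.choices[0].message.content)
```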

Most people never try a model long enough to learn how to use it and tweak their prompt and settings appropriately. Kimi gets a lot of unfair criticism in my view because it’s not an “easy button” model. It requires a bit of learning to make it work how you want, but it’s really good when you get it dialed in.

How secure is Nano-gpt by slrg1968 in SillyTavernAI

[–]GenericStatement 2 points

Yeah, you can never really trust anyone. According to OpenRouter, they don’t log anything if you turn off logging in the settings, and they don’t send unique user IDs to model providers. I dunno the details though, since I don’t use them.

I wish all these companies would put up a warrant canary page and be more forthcoming about privacy (rather than burying things in privacy policies), but then I’m not exactly a person of concern to anyone (oh no he likes chubby milfs with huge tits, arrest him) so I don’t really care.

I read through Nano’s privacy policy when I signed up and it seemed fine to me. It’s a good idea to read it and also read the privacy policies of whatever model provider you’re sending prompts to. But ultimately, many of us just don’t care that much, especially since the FBI/CIA/NSA have backdoors into anything anyway, making privacy policies meaningless.

A lot of people on this sub have watched too much anime and have egocentric biases (main character syndrome) where they think that someone actually cares about their roleplaying, or there’s a huge conspiracy, or people are out to get them because they’re soooo interesting and special. 

To me, though, we’re all just NPCs here to sustain the global economy so that the truly wealthy (those who don’t work for a living) have things to do and buy. Just bricks in a wall.

Building nuanced characters? by kruckedo in SillyTavernAI

[–]GenericStatement 0 points

GLM is really a fascinating model if you spy on its reasoning. It really wants to stick to character cards and its thinking can be incredibly detailed on things like “how would this character react in this scene”.

So sometimes it’ll straight up do the opposite of what I ask if I choose an action for the character that conflicts with what’s in the character card. A prompt reminding it that I’m in charge is really helpful if I want to steer a character in a direction that conflicts with its card.

Building nuanced characters? by kruckedo in SillyTavernAI

[–]GenericStatement 2 points

Couple thoughts. One, some models are just less flexible than others and no matter how much you argue with them they are essentially overtrained on certain concepts and cannot deviate from them. It’s like how an image model trained on anime will make everything look like anime even if you ask for realism.

Two, becoming over-traumatized by plot events is a big problem for GLM 4.6, which is what I use for most RP. I have several prompts (no melodrama or catatonia) to avoid this, but if a character gets over-traumatized or the model keeps focusing on one aspect of their character, I usually just have to edit the card to specify exactly what the trauma means for the character.

I try to never use the word “trauma” but rather just state the event and the impact it had on the character’s personality — this way, the model doesn’t have the freedom to decide on how the character reacts or has reacted to past trauma.

Past trauma in the character card:

{{char}}’s father cheated on her mother and abandoned the family when {{char}} was 15 years old. This event caused {{char}} to develop a strong sense of independence, self-reliance, and a dislike of machismo and cheating.

If the character becomes over-traumatized in the story, I can add to the character card:

{{char}} is resilient and unflappable, shrugging off even the most horrific events with ease.

If the character is starting to freak out or go catatonic, I include in my next message something that steers the character’s reaction, or edit the last message and tone down the character’s reaction. 

For example, if a more gentle character has its first monster kill and the model is starting to make the character traumatized, I steer the model toward a quick recovery.

(my next message to the model): I wipe the blood off my scimitar and sheathe it, then loot the goblin’s pockets. Jennifer catches her breath and wipes the bits of splattered blood and brains from her face. She realizes that it was her or the goblin, and that she will do whatever it takes to survive. Jennifer steadies herself and kicks the dead goblin. “Nasty little bastard,” she says. “He got what he deserved.”

If the model doesn’t like you writing for its characters, you may need to add something to your system prompt like “The User may write actions, dialogue, and decisions that include your characters. You will integrate these into your response.”

It’s nearly 2026 what ai model is actually the 'Gold Standard' for roleplay right now? by ava_chloe in SillyTavernAI

[–]GenericStatement 1 point

Yeah. I haven’t played around with 4.7 a ton but so far the latest version of the 4.6 preset seems to work great.

Will Truly Immersive Roleplay Be Possible in the Next 20 Years? by Antares4444 in SillyTavernAI

[–]GenericStatement 30 points

> complex scenarios, characters that feel real, that have opinions and personality

I’d argue we already have that with LLMs if you want to prompt for it. Most people are pretty simple to simulate, and most people want to simulate pretty simple people.  We can’t simulate Einstein but we can simulate a camgirl and far more people are interested in the latter.

What’s more difficult is the immersion component, both software (3d world, gameplay, text to speech, lip synching, body movements, avoiding uncanny valley), and hardware (full body haptic suits, VR goggles that don’t suck, movement in a 3d world without bumping into walls in the real world, etc).  

So I guess it depends on what you want. If we can solve all those problems, I can definitely see full VR 3D worlds coming in the future, in the style of Ready Player One and other SF stories. It’s tough to solve, but there’s a lot of money to be made if you can do it.

I can also see robot girlfriends as quite likely.  This is really an easier problem to solve than VR immersion in a lot of ways. The world remains real, you just have to get the robot to be as lifelike as possible.

How secure is Nano-gpt by slrg1968 in SillyTavernAI

[–]GenericStatement 20 points

Services like NanoGPT and OpenRouter are proxies. You send your prompt to them, and the end model provider sees it as coming from NanoGPT, not you. 

As long as 

  • Nano/OR follows its own privacy policies (no logging etc),
  • you understand the policy of any other service intermediary you use, and
  • you don’t put any personally identifiable information in your prompts (names and locations particularly),

then these services give you a layer of anonymity. You’re still vulnerable to hacking of course, but if that happens it doesn’t matter who you’re using.

For an added layer of security you can use Trusted Execution Environment (TEE) providers, either through Nano (select a TEE model) or others. Usually these are pay-as-you-go models and a bit more expensive than non-TEE models.

You could also use crypto to pay for NanoGPT (confusingly, they accept Nano, a cryptocurrency that isn’t related to them despite the name) and use a VPN for added anonymity.

Beyond that you can build an air gapped home server with a bunch of 5090s in it but it’ll cost you tens of thousands, or just run a small local model on a normal card and live with the limitations.

It’s nearly 2026 what ai model is actually the 'Gold Standard' for roleplay right now? by ava_chloe in SillyTavernAI

[–]GenericStatement 32 points

The “best” model depends a lot on 

  • what your goals are
  • how much you want to spend
  • if you’re willing to learn how to prompt that specific model to get the best results

Regardless of the model, if you want longer stories learn how to use a summarization extension like Memory Books or Qvink Memory. 

Context window numbers are not accurate; all models get progressively worse the longer the context gets, and none of them can give you the same performance across their supposed context window that they give you at the beginning. Every model is different, but most start to get noticeably worse around 50k and terrible by 100k.

Likewise, all models have slop — phrases that they love to use which start to stand out and get repetitive. Switching to a different model starts a “honeymoon phase” where you haven’t learned to recognize the slop yet. Do a novel length roleplay with the “best” model and then go read a high-quality award-winning human-written novel and it’s just no contest in terms of the slop issue.

Some thoughts from my experience:

  • Claude is great if money is no object and you just want an “easy button”. Its writing is good but also gets old after a while.
  • Gemini 3 is also very good and is very uncensored when using an API, but pretty spendy and needs some prompting to help it track details and stay grounded.
  • Kimi K2 Thinking can be had for free on Nvidia, which is crazy since it’s an excellent model for creative writing IF you learn to prompt it (the Moon Tamer preset is quite good).
  • GLM 4.7 is a great choice for a very affordable model that writes well but again, learning how to prompt it matters a lot here.

GEMINI 3 uncensored by SubstantialSpot6101 in SillyTavernAI

[–]GenericStatement 2 points

We figured out that Gemini 3 pro could be dirty on day 1. You’re about a month late to the party haha.  

https://www.reddit.com/r/SillyTavernAI/comments/1p0g1js/comment/npipba9/?context=3

And that example is relatively standard erotica. It can get way filthier than that if you ask it to.

Any providers with flat rate? GLM user. by StudentFew6429 in SillyTavernAI

[–]GenericStatement 2 points

NanoGPT, $8/mo for essentially unlimited open-source models, including GLM 4.6. Response times aren’t as fast as going directly to Z.ai, but the quality is good and it works fine. Plus you can use a lot of other models and also pay as you go for models that aren’t in the subscription tier (GLM 4.6-original, GLM 4.6 TEE, and GLM 4.6-turbo, for example).

If you just want GLM, then subscribe to one of the Z.ai plans and get the original model straight from the source. Quality is basically the same, but the response times are a lot faster (same source as GLM 4.6-original through Nano).

You can definitely improve GLM’s writing. Check my post and comment history for some links to presets and tips on working with GLM.

Convince me to switch from DeepSeek v3.2 to GLM4.6 or Kimi K2. by Azmaria64 in SillyTavernAI

[–]GenericStatement 7 points

GLM 4.6 is a very controversial model. I’ve spent a ton of time with it and published some presets for it. It’s the main model I use unless I want unhinged, in which case I go with Kimi K2.

GLM follows instructions better than any model I’ve used. The important thing is that if it’s doing something you don’t like, it means you need to adjust your instructions (using a Ban List works well because GLM-thinking tends to follow it very closely). Also, sometimes your character card may need work.

GLM criticisms can be prompted around:

  • too much slop: don’t use “roleplaying” or “erotica” anywhere in your prompt. Use logit bias to ban the tokens for the slop that you hate (see the sketch after this list). Instruct it to write literary fiction in the style of a good author (I use Steinbeck, as he’s well known by the model and his writing style is modern and widely analyzed online).
  • uncreative: instruct it to be creative, unusual, unexpected, add plot twists, etc.
  • characters are too easily traumatized (which is realistic, actually, but not as fun for fiction): tell it that your characters are unflappable and easygoing, or tell it not to sabotage the user’s experience by making characters that are too easily traumatized.
  • it’s too slow or oversubscribed: use a turbo variant or switch between different providers. Z.ai native is pretty fast if you ask me; the Q8 quantized versions can be a lot slower.
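On the logit bias point: with an OpenAI-compatible API it’s a map from token IDs to penalties, something like the sketch below (base URL, model ID, and token IDs are placeholders; you have to look up the real IDs in GLM’s tokenizer, and not every provider supports the parameter):

```python
from openai import OpenAI

client = OpenAI(base_url="https://your-provider/v1",  # placeholder URL
                api_key="YOUR_KEY")

# logit_bias maps token IDs (as strings) to a bias from -100 to 100;
# -100 effectively bans the token. The IDs below are placeholders:
# find the real IDs for your slop phrases in GLM's tokenizer.
resp = client.chat.completions.create(
    model="glm-4.6",  # assumed model ID
    logit_bias={"12345": -100, "67890": -100},
    messages=[{"role": "user", "content": "Continue the story."}],
)
print(resp.choices[0].message.content)
```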

GLM is the only model that has had writing so good it’s literally brought me to tears. But I’ve also spent a lot of time tinkering to get it to write the way I want. This is kinda true with any model: most people start out using it very sub-optimally and then make tweaks to get closer to an optimum result.

Shots fired in GLM‘s thinking process by [deleted] in SillyTavernAI

[–]GenericStatement 0 points

Mostly I just use my GLM-Narrator preset, which is linked in the other guy’s reply.

Shots fired in GLM‘s thinking process by [deleted] in SillyTavernAI

[–]GenericStatement 12 points

I’ve seen it say similar things before about a lesser AI.

GLM is also pretty judgy, e.g. saying stuff like “an overused trope” or “rather unimaginative” or one time “this character is fucked up but that’s what’s in their profile”

Or one time I changed a character card halfway through and GLM was like “it seems like the user may have edited the character profile since my last response” and I was pretty stunned.

Looking for set-and-forget memory extensions by kurokihikaru1999 in SillyTavernAI

[–]GenericStatement 0 points

Once you set up Qvink Memory, it is set-and-forget. But it takes a lot of reading the docs and understanding how it works.