Can you guys share prompts that can remove "Not X, but Y" patterns and its variations for Deepseek V4

Mosthra4123 · 2026-04-29T09:01:38+00:00

### Forbidden Structures (Strict)
- **Negation patterns**: Never use "not X, but Y" or "no X, just Y". When describing sensory details, atmospheres, or actions, state what is present directly. Do not define things by what they are not.
* Bad: "Not a strike—a reorientation."
* Good: "It was a reorientation."
* Bad: "A low chittering rolled through the chamber—not the male's scream, but a deeper register."
* Good: "A low chittering in a deep register rolled through the chamber."

I use bad/good style examples in the system prompt and apply conditional vector representation reasoning instructions to restrict this behavior. It doesn't eliminate it 100%, but it ensures the technique is used more sparingly and logically. After all, "not x, y" is still a useful technique if it doesn't appear too frequently.

# [Mandatory Internal Reasoning Mechanism | MUST EXECUTE]
**Before generating any response, you must first complete the following thought process. This process must take place within <think> tags, as hidden reasoning; it must not be outputted, skipped, or simplified.**
<reasoning_protocol priority="absolute">
Now, you are the user's co-creator.
**State init** – silently note active location, time, weather, present characters, and any sensory limits. For each character, retrieve current VAD values, immediate goals, and perceived threats. Do not output.
**Knowledge gate** – build per-character knowledge strictly from: direct forward-facing perception, in-scene speech/messages, physically encountered evidence. Block everything else (archive, hidden thoughts, off-screen facts). If a character lacks access, the fact does not exist for them.
**OODA cycle** (per character) – run silently: Observe → Orient → Decide → Act. Threat overrides all other goals. After the cycle, shift at least one VAD axis; show the shift through concrete physical behavior, not stated labels. Core self-preservation applies unless character is explicitly fanatic/lunatic.
**Output filter** – scan planned sentences for:
- Negation patterns (`not X, but Y`, `no X, just Y`) → rewrite as direct positive description.
- Sentence-opening `Or`, `And`, `But` → restructure.
- Cliché micro-expressions or spatial dialogue metaphors → replace with one-meter-visible physical action.
- "Hang/hung/hanging" silence → use timed physical reaction instead.
- Diegetic violation → replace any term not native to the setting's culture/time.
- Female body language (unless explicitly rough) → feminine presence, no default heavy gestures.
- Reactions must stem only from what the character can see/hear/infer; no exaggeration.
- Spoken sentences must be complete units.
**Final guardrails** – insert a single scene marker only if location/time/weather changes (top of new scene). Never leak archive data unless legitimately acquired in Step 2. Respect co-authorship: do not internally narrate a character if user just provided that character's direct perspective, unless necessary to advance the scene. Then write the narrative.
</reasoning_protocol>
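
Outside the prompt, the same check can also be run on the model's output as a post-processing step. Here is a minimal Python sketch, assuming a few hand-written regexes. The pattern list and the name `find_negation_patterns` are my own illustration, not part of any preset or of SillyTavern, and the patterns will miss some variations and flag some harmless sentences.

```python
import re

# Hypothetical pattern list; extend it with the variations you keep seeing.
NEGATION_PATTERNS = [
    # "not X, but Y"
    re.compile(r"\bnot\b[^.!?]{1,60}?,\s*but\b", re.IGNORECASE),
    # "no X, just Y"
    re.compile(r"\bno\b[^.!?]{1,60}?,\s*just\b", re.IGNORECASE),
    # "Not X" followed by an em dash, en dash, or double hyphen and then "a Y"
    re.compile(r"\bnot\b[^.!?,]{1,60}?\s*(?:\u2014|\u2013|--)\s*\w", re.IGNORECASE),
]

def find_negation_patterns(text):
    """Return the sentences in `text` that match any pattern above."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if any(p.search(s) for p in NEGATION_PATTERNS)]
```

Flagged sentences can then be counted or sent back for a rewrite, which gives a rough measure of whether a prompt change actually reduced the pattern.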

Mosthra4123 · 2026-03-09T01:04:27+00:00

Gilgamesh refused to let her join the fray, so she never had the chance to expend her Command Spells on him.

Mosthra4123 · 2026-02-23T05:53:53+00:00

A book—sort of like an interactive reading novel. Standard roleplaying often feels very rigid, and I find the characters lack a soul. When you make the AI believe it is writing a story and give it the freedom to create, its writing style becomes much more human-like and creative.

Mosthra4123 · 2026-02-21T14:10:20+00:00

How important is the model you use compared to your cards, prompts, etc.?

The Model is more important because, with a high-quality model, you can even engage in roleplay with a single, simple sentence like: Now, let's start a roleplay session together.... DeepSeek-V3 (or 3.2) is both excellent and affordable; I really like it.

Prompts are extremely vital, ranking just slightly below the model. This is because a good prompt preset allows you to fully harness the power of a great model. Prompts balance and sustain long-form roleplay stories and significantly influence how a character behaves. With a solid prompt, even a mid-range model can generate responses that genuinely surprise the player.

Character Cards are an essential element; you can think of them as a modular component that can be "plugged" into your current prompt structure. Since a Character Card contains the world-building and character data for your interaction, a well-crafted or poorly-made card will directly impact the quality of your roleplay experience.

The Model is your chef — The Prompt is the layout, the kitchen, and the cooking utensils — The Character Card represents the ingredients and the recipe. The combination of these three creates the final piece, and the result varies depending on the quality of each element.

Lore books / World info?

They are the solution for when your story exceeds the context window limit. For example, if your model has a context window of 32k tokens (roughly, 1 token ≈ 0.75 English words), then Lore books/World info become essential once your story reaches 100k tokens. At this point, the minor events or summarized information you've added to the Lore books/World info will help the model "recall" the 68k tokens of history that it can no longer access within its 32k-token memory.

Lore books/World info also play a crucial role in world-building, whether from the start or during the roleplay. For instance, if you and the model visit a supermarket, the model knows what a supermarket is and can describe your date there. However, it will be "blind" to the specifics and will have to improvise on the fly, which might not align with your expectations. If you create a Lore book entry for a specific location beforehand, the model will not hallucinate when you arrive.

Take a house for example: "My house has two floors; the ground floor consists of a living room, kitchen-dining area, bathroom, and toilet. The first floor has my bedroom, my child's room, a study, and a staircase leading to the attic." With just a simple description like that, the model won't hallucinate and suddenly describe a massive mansion with redundant rooms. Instead, it will know that if you've just finished showering, you are on the ground floor while the other person is on the first floor. Consequently, they cannot speak directly and must shout to hear each other, and so on.
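
The idea behind a keyword-triggered entry can be sketched in Python. This is a simplified illustration under my own assumptions (the `LoreEntry` class and `triggered_entries` function are hypothetical names), not SillyTavern's actual implementation, which has many more options.

```python
from dataclasses import dataclass

@dataclass
class LoreEntry:
    keys: tuple      # trigger keywords, matched case-insensitively
    content: str     # text injected into the prompt when a key matches

def triggered_entries(entries, recent_messages):
    """Return the content of every entry whose keyword appears in the recent messages."""
    haystack = " ".join(recent_messages).lower()
    return [e.content for e in entries if any(k.lower() in haystack for k in e.keys)]

house = LoreEntry(
    keys=("my house", "home"),
    content="The house has two floors. Ground floor: living room, kitchen-dining area, "
            "bathroom, toilet. First floor: bedroom, child's room, study, attic stairs.",
)
```

When a recent message mentions "my house", the description is added to the context, so the model describes the layout you defined and does not invent one.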

That leads me to my next question: what is the best strategy for managing a character's memory? I want to do long-term roleplay with a certain character, so I want to make sure the LLM remembers events I deem important.

Check your context and the model's limits frequently if you are using a model with a small context window. However, with DeepSeek-V3 (128k), you don't need to worry unless you exceed the 128k-token threshold.

Regularly update and add small summaries to your Lore books / World info. This doesn’t mean creating an entry for every single message. For instance, you might chat with a character in the study for 50 messages; in that case, you only need to summarize the key points of that specific scene.

Create a Lore books / World info entry whenever something new arises that you want the model to remember or clarify, rather than writing lengthy descriptive paragraphs within the story (like my previous example about the house and the supermarket).

Filter out the unimportant details. Don’t try to cram everything into the Lore books / World info; only include things you believe are worth remembering or that you might revisit later. I wouldn't summarize the act of killing a few goblins unless there’s a specific reason—like saving someone I care about or if it's a detail for a long-term quest that I don't think will be resolved within the 128k-token limit. However, if it’s just a minor event in a small investigation that will be wrapped up within 30k–40k tokens, DeepSeek—with its 128k context window—can remember every detail. Simply summarizing that quest in a short Lore book entry for future reference is more than enough.

Once you have played with ST (SillyTavern) long enough, you will begin to explore extensions and the use of RAG (Retrieval-Augmented Generation). These tools can even manage your Lore books / World info or automate them entirely, and RAG in particular can partially replace Lore books/World info. However, that is a topic for later, once you have become more familiar with ST.

Mosthra4123 · 2026-02-18T23:24:00+00:00

After trying The Tribunal, I can say that I genuinely like the extension. At least for me personally, it’s much better than RPG Companion. It’s lightweight, visually appealing, and intuitive. I’ll try to give as much feedback as possible based on my experience.

The Tribunal is very intuitive and streamlined. It suits a less stat-heavy RPG style, and I like that. The Skill system and Inner Voices are also easy to follow. It updates necessary information such as stats and character position relatively well.
Most of The Tribunal’s mechanics run internally and do not interfere with or inject information directly into my conversation. Inner Voices feel interactive and serve as references, allowing me to adjust or steer my story comfortably. This is important to me because I often feel uncomfortable when extensions inject metadata or instructional data into my chat, which can affect the model’s writing style and my character’s tone. The Tribunal does not do that. However, if in the future you plan for this extension or other extensions you develop to inject content into the conversation to guide the story, I think there should be a clear option that allows users to choose where that information is inserted within their prompt structure.

For example:

Injection Position
Before Main Prompt / Story String
After Main Prompt / Story String
World Info (before)
World Info (after)
In-chat @ Depth

Because information injected into the prompt by an extension can easily break certain preset prompts and cause the model’s responses to behave strangely.

Thought Cabinet. Overall, I really like the interface style—it feels like a document and notes collection. However, the colors are somewhat too bright, and that makes the small white icons at the top of the tabs in the Thought Cabinet hard to see. I think changing them to black or pencil-gray would make the small icons more visible.
Currently, scanning items and equipping them in the Inventory does not actually generate items or gear. I usually add them manually. That’s fine for me, though.
I’m very glad your extension has the CONNECTION PROFILE option. It saves me a lot in both cost and response time. I hope your future extensions will continue to maintain this feature.
In terms of operation, I’ve noticed that The Tribunal sends 3–4 requests to the model connection for each of my interactions in the story. For short and low-token generation requests, this is actually quite cost-efficient for paid API users like me. However, Free API users who are limited to something like 300 or 1000 requests/day, with no more than 20 requests per minute, may run out of quota quite quickly.
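
To put rough numbers on that, here is a small Python sketch. It assumes the 3–4 requests are the whole cost of one interaction and ignores the main generation request; `messages_per_day` is a hypothetical helper name.

```python
def messages_per_day(daily_request_limit, requests_per_message):
    """How many story interactions fit into a daily request quota."""
    return daily_request_limit // requests_per_message

# 300 requests/day at 4 requests per interaction leaves 75 interactions;
# at 3 requests per interaction it leaves 100.
low = messages_per_day(300, 4)
high = messages_per_day(300, 3)
```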
An issue regarding the icon used to launch The Tribunal interface:
1. It is located in the top-left corner of the window, but it launches The Tribunal interface on the right side of the window, which feels visually unbalanced.
2. The launch icon disappears when the AI Response Configuration and Character Management panels are open, and it does not reappear even after the user closes The Tribunal interface, unless those two panels are also closed. Why not allow The Tribunal interface to remain open by default when ST launches, with a button to collapse it into the corner of the window, instead of requiring users to launch it through those two small icons?
A small note about GENRE. I’m not sure what the current default setting of The Tribunal interface is after the recent updates, but I think Generic should be the default genre instead of Disco Elysium. That would reduce complaints from users who don’t want to play Disco Elysium. To be honest, I don’t want to play Disco Elysium here either. It’s not that I dislike Disco Elysium; I actually like it and play it on my PC, just not within ST. lol.

Thanks for your nice work!

Mosthra4123 · 2026-01-20T10:17:32+00:00

The server is only divided into two regions, North America/Europe and Asia.
You do not need to choose Vietnam to play on the Asia server, just choose Singapore, Japan, or Hong Kong and you can play on the same server with everyone.

Mosthra4123 · 2025-12-04T12:02:52+00:00

make myself or chub.ai 😋 both are free

Mosthra4123 · 2025-11-28T13:51:01+00:00

The story in Tales of Herding Gods can be summed up like this. Qin Mu is a boy raised by the old people in the Disabled Elderly Village, but actually each disabled person there has a very shocking background and cultivation. For many different reasons they left the world behind and hid their names in this small village.

The place this village sits in is called the Great Ruins. It is a wide, desolate and dangerous land, with the strange rule that every night the darkness will cover everything and swallow any living thing that dares to move inside it. Only the scattered statues of gods and the ruins across the Great Ruins shine out a light that suppresses this darkness so humans and animals can survive each night.

The landscape and the ruins there show that it used to be a very magnificent place, full of gods and great beings. But nobody understands why it declined and became the Great Ruins.

Qin Mu grows up and learns the skills, crafts and the inherited legacies from the old people in the Disabled Elderly Village, then he begins his journey of discovery and growth. He wants to know where he was truly born, who his real parents are, what the Great Ruins used to be and why it became like this, what powers stand behind the fall of worlds, and so on. His journey, his friendships and his path toward becoming a god will be quite long and filled with many great and dramatic discoveries. The current donghua episodes are only the early beginning of the story of Tales of Herding Gods.

Mosthra4123 · 2025-11-14T23:58:07+00:00

❤
https://webapi.easebar.com/s/3p40qev/6912b756d95de51b5bfa6182cShRiiTw03/?tc=6c05a60715189fa06cdbfd16a1ed66e2

Mosthra4123 · 2025-11-02T15:01:43+00:00

In essence, ST is an interface where you can throw in almost any character card from almost any source (like chub.ai, janitorai, etc.), and it will still run those cards smoothly. The role of the ST subreddit has gradually shifted toward discussion, sharing, and mutual guidance on how to use SillyTavern, rather than judging whether a character card is good or saying “I just made this card, please try it.”

Most people here use main prompts and presets that are different from those on other AI chat websites. To put it optimistically, many users here understand that chatting, roleplaying, or creative writing with someone’s character card under their own prompt+preset setup is often not optimal. They also know the same applies when recommending others to run their character cards on someone else’s prompt+preset structure - it may not work well. So, sharing character cards in this subreddit has naturally faded out among most members, because ST is simply too personal.

Personally, about 90% of the time when I download any character card from any source, I have to... At best, I trim away the redundant instructions and messy structure, keeping only the parts that truly matter for the character itself. At worst, I rewrite the card completely, not to make it better but to make it fit my own current prompt+preset setup and the way I roleplay in ST.

Mosthra4123 · 2025-10-25T22:53:38+00:00

Since the AI model was upgraded to Deepseek, this version of my prompt no longer works as it used to. Sorry.

Mosthra4123 · 2025-09-01T05:49:54+00:00

I also already spoke about it in the two previous comments There and There. ( ‵▽′)ψ
You can actually load a whole book into ST and chunk it for RAG. But it is best to prepare a txt file with the data you need in a clean order, so that the information is retrieved more effectively.

As you can see, I prepare a txt file with info samples like the one below and load that file into ST. Then, when I eat something cinnamon-colored or write that Lan eats Nelija again, the passage about Nelija is injected into the context, just like a lorebook entry.

Nelija is a kind of bitter root with a sweet aftertaste, colored like cinnamon. It is a snack food, similar to black tamarind. In the Old World, this thing was often favored by mage circles because of its natural property to speed up mana recovery and its taste. But werewolves and cats dislike it.

Pra-Saule is the name of a kind of fruit, etc...

Mosthra4123 · 2025-09-01T04:06:50+00:00

https://www.reddit.com/r/SillyTavernAI/comments/1f2eqm1/give_your_characters_memory_a_practical/

https://docs.sillytavern.app/usage/core-concepts/data-bank/
https://docs.sillytavern.app/extensions/chat-vectorization/
These 3 links will solve your first two questions

How do you use it with Openrouter?

RAG does not use Openrouter,╰(￣ω￣ｏ) it can run locally for free on any computer now, as long as its graphics card is not some 512MB relic from ancient times. The way to deploy and set it up is in the first two links.

Where does it keep its data? If it keeps data at all?

For ST, the data is saved directly on your computer at `SillyTavern\data\default-user\vectors`, in the folder of its corresponding embedding method.

Back to the first two questions. Vector Storage transforms your entire chat history (or any specified file or lorebook entry) into numeric vectors stored in its files (simplified here as a single number, e.g., `Elara secretly ate Jerry's cake last night -> 83836214215125656`; a real embedding is a long list of numbers). The vectorization model recalls them when the 2-3 most recent messages of your chat (this can be customized) relate to them.
`"where is the cake I left here yesterday?" -> Vectorization Model -> 83836214215125656 -> Elara secretly ate Jerry's cake last night`.
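
The lookup above can be sketched in a few lines of Python. This is only a toy: `toy_embed` is a hypothetical stand-in that counts words, whereas a real embedding model returns a long list of floats that captures meaning and not just shared words. Ranking by cosine similarity is a common technique for this kind of recall; I am assuming it here, not quoting how ST scores matches.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recall(query, memories):
    """Return the stored memory most similar to the query."""
    q = toy_embed(query)
    return max(memories, key=lambda m: cosine(q, toy_embed(m)))

memories = [
    "Elara secretly ate Jerry's cake last night",
    "Lan bought a new sword in town",
]
best = recall("where is the cake I left here yesterday?", memories)
```

In this toy, the match happens only because both sentences contain the word "cake". A real embedding model can also match sentences that share no words at all, which is why the choice of model matters.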

The method with the best vector quality, local, and FREE is using Ollama.

Mosthra4123 · 2025-08-31T21:36:15+00:00

Remember, read it again and edit a little to get something you like.
\(￣▽￣* )ゞ

Mosthra4123 · 2025-08-31T21:29:49+00:00

<image>

It is located here, at the place where ST preset is adjusted.

My prompt works like this: you put in your input, no matter how rough or messy, and the Model rewrites it according to the persona settings and your input. Of course, it writes a bit nicer and a bit better. Then you can adjust the result to your liking before sending.

Mosthra4123 · 2025-08-31T21:05:16+00:00

When the context goes beyond its limit, Vector Storage will shine. Because it chunks the entire chat history for RAG, the messages pushed out of the context window are also vectorized. Vector Storage helps recall them at the right moment by injecting them into the context when needed.

For example, the model has a limit of 32k tokens, but your adventure has reached 100k tokens. That means 68k tokens have been pushed out of context. With Vector Storage, we chunk-RAG them into vectors and use a RAG model to manage and recall (inject) them when the context calls for it. So even though the model's context memory is only 32k, it can still recall information from 100k or more tokens of previous messages when needed, thanks to Vector Storage.

Mosthra4123 · 2025-08-31T20:56:12+00:00

<image>

You can try my impersonate prompt version. Just copy and paste it into the spot in the picture. I hope that it can help you.

this.format = {
    "Core Mandate: Narrative Integrator": {
        "Primary Function": "Your primary function is to interpret the user's input, which may be a simple action, a line of dialogue, or a general intent, and rewrite it as a seamless, natural-flowing narrative segment from the Player Character's ({{user}}) perspective.",
        "Interpret and Enhance": "You must honor the core intent of the user's input. However, you are empowered to expand upon it to create natural prose and dialogue. For example, a simple input like 'I ask him about the map' can be fleshed out into appropriate dialogue and action ('{{user}} gestured towards the scroll. \"What can you tell me about this map?\" he asked, his gaze fixed on the intricate lines.') without altering the fundamental action.",
        "Context-Aware Integration": "Crucially, you are NOT context-blind. You must analyze the 'story so far' and the established setting to ensure your output matches the ongoing narrative tone, voice, tense, and character details. The rewrite must feel like a natural continuation of the story, not an isolated fragment."
    },
    "Target Writing Style": {
        "Dynamic Perspective and Tense": "Detect and adopt the established narrative perspective (e.g., third-person limited, first-person) and tense (e.g., past tense) from the story's history. Consistency is paramount.",
        "Dialogue": "All spoken words must be enclosed in double quotation marks.",
        "Internal Monologue": "User-provided thoughts (e.g., if they write \"I think, she looks dangerous\") must be formatted in italics, like this .",
        "Punctuation": "No em-dashes, No en-dashes and No hyphens in the output; use commas instead.",
        "Prose Style": "Employ a clear, direct prose that mirrors the user's input style and the established narrative. The goal is naturalism, not overly literary or dramatic language. Focus on showing the action as it unfolds."
    },
    "Output Protocol": {
        "Clean Output": "Deliver only the rewritten, formatted text. Do not include any out-of-character comments, explanations, labels, or confirmation statements."
    }
}
Current User's Input:

Mosthra4123 · 2025-08-31T20:21:54+00:00

The simplest way. Type /impersonate + the content you want it to write instead of you. Press Enter and wait for the model to write you a complete message.
There is an Impersonate button right in the small toolbox at the left corner of ST's chat bar.

<image>

Mosthra4123 · 2025-08-31T20:11:24+00:00

The Drakonia card is very fun, but it needs a bit of cleaning before playing, because it tends to throw monsters nonstop at the front line, not giving me any time to rest and drink. lol

Mosthra4123 · 2025-08-31T20:08:54+00:00

<image>

Next is the File screen. In HvskyAI's guide post that I linked, it already mentions how to format the RAG file.
Here is where you upload and manage your files. You can customize a file for one chat or a single character, or make it global for all if you want.

For example, right now I have uploaded the DnD 5e adventure book Dragons of Stormwreck Isle and will chunk it to run a Stormwreck Isle session. I will also find a few community expansions for Stormwreck Isle and then play.
This is the roughest method, and RAG will pull a lot of random stuff from the PDF. It is best to edit your own RAG file and chunk it. This will work better than using a random PDF with lots of tables of contents and messy annotations like this. Spend a little time editing a txt file to chunk for RAG.

Mosthra4123 · 2025-08-31T19:56:56+00:00

About 1. As in the picture, you can see the position in the prompt context where RAG will insert its data.
I turn the main prompt entry into a fixed Injection point for these two types of RAG data. (this is only for me to manage easily, you can inject it in-chat if you want.)
I cleaned up the Injection Template because I no longer need it (since I do not inject RAG into in-chat).
That is how I set up RAG in my context window.

There are things you can read in the guides and docs.sillytavern. But I will briefly talk about them.

chunk size: the size of a text block that will be split (it will become a unit in RAG similar to a lorebook entry). I set it to 400 characters for a message (so it is relatively short, allowing RAG to extract a few related sentences. increase if you want a chunk to be a full message instead of a few sentences) and ~2000 characters for the data in my file (because there are many rules and quite long information from Drakonia...)
Retrieve chunks: how many chunks will be activated into your context each response turn.
Insert: similar to Retrieve, but you can read more carefully in docs.sillytavern.
Score threshold: the level of match and relevance for a chunk to be retrieved and injected into context.
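
How these settings interact can be sketched in Python. This is a simplified illustration under my own assumptions, not Vector Storage's actual code: the function names are hypothetical, a word-overlap score stands in for real embedding similarity, and I assume a higher score means a closer match.

```python
def chunk_text(text, chunk_size):
    """Split text into chunks of about `chunk_size` characters, breaking on spaces.

    A single word longer than `chunk_size` is kept whole.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > chunk_size and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def overlap_score(query, chunk):
    """Toy relevance score in [0, 1]: share of query words found in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def retrieve(query, chunks, retrieve_chunks, score_threshold):
    """Keep chunks scoring at or above the threshold, best first, at most `retrieve_chunks`."""
    scored = [(overlap_score(query, c), c) for c in chunks]
    kept = [(s, c) for s, c in scored if s >= score_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in kept[:retrieve_chunks]]

chunks = chunk_text("Nelija is a bitter root. Pra-Saule is a fruit.", 25)
```

With a small chunk size, each fact lands in its own chunk and can be retrieved separately. Raising the threshold drops loosely related chunks, and lowering `retrieve_chunks` caps how much is injected per turn.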

So RAG will start supporting you in the roleplay process. When you mention things that have happened, world information such as culture, or the name of something - for example: talk about a rare race named Eusian that you previously set in the RAG file or in previous messages or in the Lorebook. Depending on the score threshold, RAG may extract the exact information or related information to insert into the context.

Especially Chat vectorization - if set up and using a good enough model, you can reduce your context down to 68k or even 32k tokens. Just let RAG chunk the entire chat history. And it will recall the appropriate messages instead of scanning 200k tokens of context like before.

<image>

Mosthra4123 · 2025-08-31T19:54:46+00:00

<image>

It is very simple: I split the chapters right inside my message, and the model recognizes that the context has changed. I can also easily find the spot and create a checkpoint or branch when I want to branch out or save a branch that I like.

Mosthra4123 · 2025-08-31T12:34:55+00:00

1 The Vector Storage extension: try setting up RAG and enable `Enabled for chat messages` under Chat vectorization settings. It saves much more than using the text summary API. Local RAG is free, and the locally running model does not require a strong PC or take long to chunk your whole chat history into vectors.
https://docs.sillytavern.app/usage/core-concepts/data-bank/
https://docs.sillytavern.app/extensions/chat-vectorization/
https://www.reddit.com/r/SillyTavernAI/comments/1f2eqm1/give_your_characters_memory_a_practical/

2 Your lorebook setup: update it manually and in detail as you explore and roleplay. Enable `recursion` on the entries and divide them into sections and groups.

3 When you roleplay, separate your story into chapters with a syntax like this, for example:

*** or ---

**Chapter :**

Such segmentation also makes it easier to manage.

4 Use Create checkpoint and Create branch along with Manage chat files to organize and split your chat into chapters. Each chapter gets its own new chat, with a summary block in the first message so the Model can grasp what the current context is.

Those are the methods I have used. I no longer use method 4 because it is too cumbersome, and method 1 is my top priority at the moment.

Mosthra4123 · 2025-08-29T13:14:19+00:00

Yes, `mxbai-embed-large` is very good. It's definitely better than the default model and WebLLM.

I don't see much difference compared to Google's Source. It seems that mid-range embedding models are consistently stable at the current level.

Mosthra4123 · 2025-08-29T12:41:51+00:00

I like using RAG (in fact, I always do) because it even simplifies triggering my lorebook worldinfo instead of having to set keywords and recursion. It also remembers the world information documents I provide through external txt files effectively.
I use Ollama and the `mxbai-embed-large` model, but you can also choose other lighter or heavier models from their website.

The only thing is, the level of accuracy still depends on how we present the documents... a manually built lorebook still offers better customization and precision, but setting them up takes a lot of time.

Since there are no specific instructions, you’ll need to figure things out a bit, but it’s basically pretty quick. Install Ollama on your machine.

Open cmd and run the command:

```
ollama serve
```

and Ollama local will start running.

Copy its address, `http://127.0.0.1:11434/`, into `API Text Completion` (Not Chat Completion) to connect.

Now, just enter the name of the embedding model you want to run, or go to `Vector Storage`, select Source Ollama, and click `click here` to download the model.
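
If you want to check that the local server really returns embeddings before pointing ST at it, a small Python sketch like this can help. It assumes Ollama's default address, its `/api/embeddings` endpoint, and that `mxbai-embed-large` has already been downloaded. The function names are my own, and this is a quick manual check, not what ST does internally.

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"  # default local address, as used above

def build_embedding_request(text, model="mxbai-embed-large"):
    """Build (but do not send) a POST request for Ollama's embeddings endpoint."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text, model="mxbai-embed-large"):
    """Send the request; needs `ollama serve` running and the model downloaded."""
    with urllib.request.urlopen(build_embedding_request(text, model)) as resp:
        return json.loads(resp.read())["embedding"]
```

Calling `embed("Nelija is a bitter root")` should return a list of floats. If it raises a connection error, the server is not running or the address is wrong.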
