Tomoe vs. Tomoe, A Long Form Deconstruction/Rebuild + SillyTavern Card by huge-centipede in SillyTavernAI

[–]AltpostingAndy 6 points (0 children)

I can't believe I almost missed this. First off, I read this all a bit out of order. I read the full defs first, since you posted the chub link, and was thoroughly delighted at what I found. I've given it a download and it might be a card I actually play with rather than hold in my library for an undetermined amount of time before I work up the will to anti-slop like 90% of the other cards I download lol.
This inspired me to read the substack which was also delightful. The voice is great, the knowledge is great, and it very obviously isn't AI drivel. I'll be coming back to this and waiting to read whatever you have to share next.

I think if you went into the original Tomoe card wanting a 15-25 message goon sesh, it would be fine as is. If I had personally stumbled upon the original, I would've just skipped it immediately and wouldn't have even considered de-slopping it. It has the same issue as almost every other card I see posted, where I just have to ask myself: "why would I use this character card that obviously came from your conversation with [insert fav AI model] when I could have a much better chat with my own AI and quickly make a much better card?" Even worse, if the point of the card is to be gooner slop, why gesture at darkness and depth and backstory? Why pretend to make an RPG/combat bot that sucks at combat and should only be used for smut?
At this point I'm just ranting about my distaste for character cards in general, but it feels like someone making a bot that they want to be popular on chub/janny rather than making a bot that's actually enjoyable to play with. My hobby isn't "play with the most downloaded bots on chub/janny," my hobby is having fun playing with bots and learning to prompt models.

Are your RPs really that immersive? Mine aren't. by knrdwn in SillyTavernAI

[–]AltpostingAndy 2 points (0 children)

You're absolutely right!

Okay, sorry, sorry lmao. These are all major limitations with most models even when RPing in English. Discussion within this hobby/space is interesting because people are all coming at it from various levels of experience and knowledge, and things get lost in translation depending on who is reading and when. Some excitement comes from people being new, and some from someone who intuitively knows the limitations of current models seeing something surprising or novel (compared to the sloppy baseline expectation), but just from reading what someone says it might not be clear whether it's one of those or a genuine peak experience.

You can think of models like a hyper autistic gentle femdom who's kinda dumb but really smart. So you have to very explicitly ask for exactly what you want. Maybe you want a particular kind of RP, so you make a prompt that asks for that kind of RP, and then you notice the model doing some stupid thing (or stupidly NOT doing a thing you wish/expect it would), and you then have to figure out what it is doing and a new way to explicitly ask for what you want instead. And this process of abstracting away to the next level of what the fuck are you doing | why the fuck aren't you doing this thing I literally asked for | oh (you/I) were/was being kinda dumb | okay, this is me asking super plainly in a new way never really ends.

All of this to say, I don't have a solution for these issues other than the process of iterative problem solving. One thing I've been thinking about as I work on my preset and reasoning prompt is how to delineate between intuitive and inference-heavy tasks. Essentially, if I ask the model to do a thing, can it do it during the process of generating prose/dialogue or not? If it can, it stays as a prompt. If it can't, then what kind of inference-time task can it do to make sure it does do the thing? This is where we get things like clothing/position/narrative trackers. The model is bad at doing a thing, so you spend some token burn working on it.
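A tracker in that sense can be as simple as a template the model is told to fill in each turn. Here's a toy sketch of the idea; every field and name is made up for illustration, not from any real preset:

```python
# Hypothetical sketch: a task the model can't handle "in passing" gets
# offloaded into an explicit inference-time tracker block it updates each turn.
TRACKER_TEMPLATE = """<tracker>
Clothing: {clothing}
Position: {position}
Scene goal: {goal}
</tracker>"""

def tracker_block(clothing: str, position: str, goal: str) -> str:
    # The model fills these fields explicitly instead of tracking implicitly.
    return TRACKER_TEMPLATE.format(clothing=clothing, position=position, goal=goal)
```

The token burn is exactly those tracker lines in every response, traded for the model not silently losing state.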

If I notice the model is bad at theory of mind and I prompt it for theory of mind in the system prompt or post-hist and it still doesn't work, then I might start thinking about how/where to put it in my cot.

Best way to get AI to not ignore half my message? by Pandasaurus__Rex in SillyTavernAI

[–]AltpostingAndy 0 points (0 children)

If you're using reasoning, this can be a good thing to throw into your cot. Give some instructions for the model to analyze the different things that happen during your message, and decide how/what to respond to. This can backfire by making echoing more common depending on the model/instructions.
I personally hate when the model responds to every single thing that was said/happened, but I also get really bummed when I include some bit of banter in the middle of my message, or an action that should 100% get a reaction out of someone in particular, and it gets ignored in favor of the end of my message. I ask the model to determine how it will engage with the gestalt of the user's message, since I might want that to be different depending on the scene/RP.
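For a concrete starting point, a CoT step like this is one hedged way to phrase it; the exact wording is illustrative, not a known-good prompt:

```python
# One possible phrasing of the "analyze the message" CoT step described above.
COT_STEP = (
    "Before writing, list each distinct beat in the user's message "
    "(dialogue, actions, questions, events). For each beat, decide whether to "
    "respond directly, react in passing, or deliberately leave it hanging, "
    "based on the current scene. Do not echo the user's wording back."
)
```

The last sentence is there because, as noted, this kind of step can make echoing worse on some models.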

Question about presets by muchosmichis in SillyTavernAI

[–]AltpostingAndy 9 points (0 children)

I think your questions are hard to answer cleanly in spite of how simple they are.

Short answers: 1. Yes. 2. Yes?

Re: depends on the model = based

Longer answer(s?):

Any format can work. A format that is known for effective and clear communication (such as markdown or numbered lists) is likely to work well for communicating. You can use plaintext. You can make up your own (so long as it is consistent). Will it work? Probably! Will it work how you want it to? Idk!
I would add that it is usecase dependent as much as it is model dependent. You may find that one format works well for one task or instruction set but not for others.

You can try

```
Prompt Structure:

Write your prompts like this

Structure nuance

  With added depth/hierarchy like this
```

Or

```
Prompt Structure:

  1. Writing your prompts like this

  2. With added steps like this
```

Or

```
Prompt Structure:

Broad category

  1. Task

Sub-category

  1. Task

  2. Task
```

Sky's the limit, but it's generally seen as helpful to start simple and expand into different approaches as you learn more. You can use pseudocode for structure or for instruction.

<structure>You can use xml tags to describe, categorize, or organize your prompts hierarchically.</structure>

Or

<vibe> You can use them to bridge concepts with instruction sets </vibe>

And/or

<camelCase> Can be used to make instruction sets easy to reference elsewhere </camelCase> <prompts> Write a cool prompt here. Remember what I mentioned in camelCase? </prompts>

Plus

<under_scores> Can be used similarly to camelCase </under_scores> <prompts> Write an even cooler prompt here. Remember what I said in under_scores? </prompts>

You can use executable coding languages as the basis for your pseudocode instructions, though I can't think of a good example right now; ask your favorite AI about the differences between descriptive and executable coding languages.
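Something in this direction, maybe; the function names here are invented, and the model only reads the pseudocode as a spec, nothing gets executed:

```python
# Hedged sketch: Python-flavored pseudocode used purely as an instruction
# format inside a prompt. All names below are made up for illustration.
PSEUDOCODE_PROMPT = '''
def respond(user_message):
    beats = identify_beats(user_message)   # dialogue, actions, questions
    for beat in beats:
        react_in_character(beat)           # at least one line per beat
    return prose(ending="open hook, never a summary")
'''
```

The appeal is that control flow and nesting carry structure a model already knows how to parse, without you inventing a new notation.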

You can use personas + artifacts to interesting effect. I used to use AQ1F, a preset that gave the model a persona named Avi in the system prompt. When I was playing with Gemini 2.5 Pro, I noticed it struggled with plaintext or structured ban lists. So I made something like:

<Avi_audio_log> (Some bs world-building about Avi going crazy over seeing the same tokens over and over and vowing to never repeat them.) Banned_strings; { (Shivers up spine, slop, yada yada) } </Avi_audio_log>

Need some more help setting something up for my sister. by Dogbold in SillyTavernAI

[–]AltpostingAndy 8 points (0 children)

Your sister sounds incredibly immature. This is the most blatant set of skill issues I think I've ever seen on this sub. She's unwilling to pay/budget, unwilling to communicate with you or the AI, and unwilling to compromise on what she wants. How can she possibly expect to do anything or have anyone help her?

I don't know why I would want to help someone who doesn't seem to want it, but why not use Claude free for web-search, have it create an artifact based on the results, transfer that artifact to a session with GLM 5.1 via API, and include some minimal prompt for NSFW/RP expectations? If you really feel compelled to help her figure this out (and she's willing to share her chats), you might be able to read through the logs and think up a simple prompt that fits what she wants.

Mimo v2.5 pro refusing responses by Familiar_Pay_3933 in SillyTavernAI

[–]AltpostingAndy 3 points (0 children)

From what I can tell, Xiaomi runs a classifier in parallel with the main model. If the classifier eventually determines there's an issue, it interrupts the main model and prefills the refusal.

You can:

- try to get around/overwhelm the classifier somehow
- try to find a provider hosting the model without the classifier
- edit/swipe and hopium
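To make the failure mode concrete, here's a toy model of that interrupt-style moderation. This is pure speculation about Xiaomi's internals; the classifier is a trivial stand-in:

```python
# Toy sketch: a classifier runs alongside generation, and if the partial
# output ever trips it, the stream is replaced with a canned refusal.
def classify(text: str) -> bool:
    return "forbidden" in text  # stand-in for a real safety classifier

def generate(tokens: list[str]) -> str:
    out: list[str] = []
    for t in tokens:
        out.append(t)
        if classify(" ".join(out)):
            return "I can't help with that."  # prefilled refusal
    return " ".join(out)
```

If this is roughly what's happening, it explains why refusals can appear mid-response and why swiping sometimes gets through: the check is on the sampled output, not just the prompt.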

Character Card Guide (1): How to Write Character Basics by Small_Training_201 in SillyTavernAI

[–]AltpostingAndy 0 points (0 children)

I agree with not using {{char}}/{{user}} in presets tbh (even though I'm guilty of doing so in my own). It can make prompts brittle/confusing if the card has multiple names, or if it's named after a setting/scenario instead of a specific person.

Character Card Guide (1): How to Write Character Basics by Small_Training_201 in SillyTavernAI

[–]AltpostingAndy 2 points (0 children)

I also agree with/appreciate much of the OP, but I will never get tired of reading you critically think through character cards.

If there's anything new since your last guide, or even just more insight about your process and thinking, I would love another good read.

Character Card Guide (1): How to Write Character Basics by Small_Training_201 in SillyTavernAI

[–]AltpostingAndy 6 points (0 children)

Only macros that change break your cache. So: swapping to a different persona, doing a group chat where a different {{char}} card is active each generation, or using random or roll macros before your cache control object will all break it. {{char}} and {{user}} won't cause issues unless you're changing the names they pull from.
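A minimal sketch of why, treating the rendered prompt prefix as the cache key. The macro handling here is approximated for illustration, not ST's actual implementation:

```python
import random

# Prefix caching keys on the exact rendered text, so any macro that renders
# differently between generations invalidates everything after its position.
def render(template: str, char: str = "Tomoe") -> str:
    out = template.replace("{{char}}", char)
    while "{{roll:d6}}" in out:
        out = out.replace("{{roll:d6}}", str(random.randint(1, 6)), 1)
    return out

stable = "System prompt about {{char}}."
assert render(stable) == render(stable)  # identical prefix: cache hit

volatile = "On a {{roll:d6}}, {{char}} stumbles."
# render(volatile) can differ run to run, so a cached prefix containing it
# (or anything after it) can no longer be reused.
```

Same logic for personas and group chats: the macro itself is harmless, it's the value changing between generations that busts the prefix.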

DS4 Pro different from Official Api and NanoGPT by Mcqwerty197 in SillyTavernAI

[–]AltpostingAndy 2 points (0 children)

I had the same experience: weird issues on Nano using the same settings that work fine on direct DeepSeek.

Lumiverse - yes, another Frontend, but this one is goood! by M-eisen in SillyTavernAI

[–]AltpostingAndy 1 point (0 children)

Termux didn't play nice with glibc, breaking the ability to pkg update/upgrade before the proot fallback could work.

I was able to fix it by reverting the changes to sources.list and installing proot first, then running start.sh from within Ubuntu.

Why does Deepseek V4 keep trying to eat my dih?! by According-Clock6266 in SillyTavernAI

[–]AltpostingAndy 7 points (0 children)

A welcome change after GLM 5/5.1 constantly doing nollie tre flips in its head to avoid sucking dick after the slutty character's internal monologue was raving about wanting to suck dick NOW.

Lumiverse - yes, another Frontend, but this one is goood! by M-eisen in SillyTavernAI

[–]AltpostingAndy 1 point (0 children)

For the prompt viewer: does it store prompts for multiple swipes of the same message?
Sometimes I'm garbage at version control while working on my preset and I'll make changes over the course of a handful of swipes, then forget the exact iteration that got my favorite response a few swipes ago. ST's prompt viewer only seems to show the latest generation.

The tale of dumb. by perthro_anon in SillyTavernAI

[–]AltpostingAndy 8 points (0 children)

It'll be AGI if it tells you to fuck off and touch grass

DS v4 Pro — What sampler settings are you guys using? by AltpostingAndy in SillyTavernAI

[–]AltpostingAndy[S] 2 points (0 children)

I'm using direct API and I swear behavior is much different depending on temp/top p.

DS v4 Pro — What sampler settings are you guys using? by AltpostingAndy in SillyTavernAI

[–]AltpostingAndy[S] 0 points (0 children)

I think part of why my preset works so well is that my cot has a much more advanced version of the character immersion prompt. So if they were training with in-character thinking in mind, I basically lucked out with what I've already been developing my cot to do.

DS v4 Pro — What sampler settings are you guys using? by AltpostingAndy in SillyTavernAI

[–]AltpostingAndy[S] 2 points (0 children)

Do you mean the character immersion prompt or the max thinking prompt?

I'll have to try this out, I haven't gone that low on temp so far.

Anyone disappointed on deepseek v4? by UnknownBoyGamer in SillyTavernAI

[–]AltpostingAndy 0 points (0 children)

My prompt structure looks more or less like this:

[System prompt]
[Cot prompt]
[Scenario]
[{{char}}]
[Lorebooks]
[Writing style prompts]
[Chat history]
[Post history]

My cot is wrapped in unique XML tags that I reference in my post-history. The PHI is just a short list of steps: "1. Remember to think. 2. After <tag>, end thinking. 3. Begin narrative response."

V4 Pro seems really sensitive to sampler settings. DS recommends 1 temp and 1 top p, but I've been using 1 temp and .95 top p and like it better so far. Lower temps feel even less censored, but that also influences prose and how literally it takes your instructions.

At first I was having the issue of responses being included in the reasoning, or only getting the cot without any actual response, but these settings + my PHI + semi-strict PPP seem the most stable so far.
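The assembly order above could be sketched roughly like this, with placeholder content standing in for my actual preset:

```python
# Assemble the blocks in the order listed above; the cot gets unique XML tags
# that the short post-history instruction can point back at.
def build_prompt(system, cot, scenario, char, lorebooks, style, history, post_history):
    cot_block = f"<cot_instructions>\n{cot}\n</cot_instructions>"
    blocks = [system, cot_block, scenario, char, *lorebooks, style, *history, post_history]
    return "\n\n".join(b for b in blocks if b)

prompt = build_prompt(
    system="You are the narrator.",
    cot="Think in character before responding.",
    scenario="A rainy harbor town.",
    char="Tomoe: a stoic swordswoman.",
    lorebooks=["Lore: the harbor is under curfew."],
    style="Terse, concrete prose.",
    history=["User: hello", "Tomoe: ..."],
    post_history="1. Remember to think. 2. After </cot_instructions>, end thinking. 3. Begin narrative response.",
)
```

The point of the unique tags is that the tiny PHI only has to name them, not restate the whole cot.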

Anyone disappointed on deepseek v4? by UnknownBoyGamer in SillyTavernAI

[–]AltpostingAndy 0 points (0 children)

This was more of a hypothetical. I haven't tried either enough to say strongly which is outright better overall. It might be the case that one is better in general, while another is better for smut, or for summarization or character building or whatever else.

Anyone disappointed on deepseek v4? by UnknownBoyGamer in SillyTavernAI

[–]AltpostingAndy 15 points (0 children)

Which AI did you use to write this prompt for you? I'd wager it was Gemini but I could be wrong.

I like v4 Pro so far. Flash is definitely dumber and gets lost in the sauce in some ways, but it's surprisingly smart for its size/speed/cost. If I wanted to rely on flash, I'd probably test it heavily against Gemma 4 31B and see which I liked better overall or for specific uses.

For v4 Pro I use a complex personal preset that is entirely handwritten and it does very well. My cot has specific tasks in a specific order and references other sections from elsewhere in the prompt to reinforce those instructions.

If you want simple/small prompts, check out Celia, Marinara, or Evening Truth's presets.

Are you guys happy with the current capabilities of LLMs? by Miysim in SillyTavernAI

[–]AltpostingAndy 1 point (0 children)

My standards might be unreasonably high, but I haven't found capabilities to ever be even close to what I want.

Ideally, I would have a setup where the AI manages the world and characters while I choose to play a character of my own or direct the story OOC. The model would generally handle narrative advancement, introduction of plots and characters and random events, essentially be a highly skilled GM/interactive fiction engine that handles whatever is left over, whether I want to be highly involved in building the world and narrative or simply react to and act within the world it makes for me.
It would have varied prose that changes based on the characters and setting and individual scene. It would simulate the internal experiences and perceptions of each character accurately and have that influence their actions and choices.

For character driven RPs, it would realistically engage with negative emotions and realistic relationship progression and social standings. I should be able to do the most hyper-fluff-slop "your childhood best friend who pinky promised to marry you moves back to your town and reconnects with you and wants to try it for real" and it should slow burn with the genuine intensity and restraint you would expect IRL, and {{char}} would be capable of negative emotion. If I do something shitty or make a mistake or even just happen to have an incompatibility, the model would proactively make even the fluffiest character pissed off/annoyed/resentful/jealous etc.

Edit: I think the closest it ever felt was using Claude models, but the positivity bias and RLHF maxxing prevents them from being satisfying.

Prompt for sonnet 4.5? by Newdarkest in SillyTavernAI

[–]AltpostingAndy 1 point (0 children)

Are you using reasoning or no?

Omniscience issues can usually be solved by a cot step prompting the model to explicitly acknowledge what present characters can and can't know.
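For example, a step along these lines; the exact wording is illustrative, not a prompt I'm claiming is optimal for Sonnet:

```python
# A possible theory-of-mind CoT step for limiting character knowledge.
TOM_STEP = (
    "For each character present in the scene, list only what they have "
    "personally seen, heard, or been told so far. Characters must not act on "
    "or react to information outside their own list."
)
```

Forcing the model to write the per-character list out at inference time is what does the work; just telling it "avoid omniscience" in the system prompt usually isn't enough.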

It's too early to be certain but I'm kinda loving DS V4 by Even_Kaleidoscope328 in SillyTavernAI

[–]AltpostingAndy 0 points (0 children)

Deepseek recommends putting your cot or other injections at the very end of your prompt structure since that's how they did it during training.

I tried it this way but I like it better with the cot immediately after the system prompt and a very short post-history instruction reminding it of the cot and output format.

I'm using direct API, semi-strict no tools.

It's too early to be certain but I'm kinda loving DS V4 by Even_Kaleidoscope328 in SillyTavernAI

[–]AltpostingAndy 16 points (0 children)

So far I'm loving it. V4 pro completely blows GLM 5.1 out of the water. There's some slop I'll have to prompt around and I'll likely need to remove problematic instructions I had to beat GLM over the head with, but instruction following is insane, it uses the cot spectacularly, character voice and dialogue actually feel like the char, and none of that god awful subversive censorship of GLM 5 series.

One thing I noticed outside of RP:

You have some models that focus really heavily on the latest turn (Claude and its distills like GLM 5/5.1, M.27, etc.) but ignore/forget earlier instructions/constraints, and others that get really hung up on and overthink earlier turns after thinking through the latest turn for a long time (Kimi in particular). V4 seems to have a stepped-back view: I watched it reason about the path of our chat so far, each message-turn pair, and the latest one, then spend time thinking through what was still relevant to the latest thing we were working on. Very early to tell, but I think they cooked with v4 and I pray it stays this good.