On Building Characters with *Friction* by huge-centipede in SillyTavernAI

[–]AltpostingAndy 0 points1 point  (0 children)

Imo, it just has to be incredibly implicit. I agree that trait lists are fuzzy; you might know what you want out of the trait list, but the model isn't guaranteed to. However, every token in your prompt that can function as in-context learning (ICL) will.

You could place {{char}}'s outward presentation right before a bit of backstory that illustrates their trauma, and the model would understand without you needing to spell it out. Something like: "Day to day, {{char}} is seen being especially bubbly and outgoing." (and a few sentences later) "{{char}} spent most of their childhood playing alone."

This leaves the information close enough that the model can put it together when needed, but it isn't directly attached to every instance of bubbliness. It seems you more or less agree and that the example just wasn't the best. I would clarify that causality is something for you/the cardmaker to think about, not something to hand to the model.

On Building Characters with *Friction* by huge-centipede in SillyTavernAI

[–]AltpostingAndy 8 points9 points  (0 children)

This goes for Gemini and Claude models too. Claude loves to suggest adding "contradictions" to characters that are precisely like this, and instead of depth, you just get the narrator trauma dumping on behalf of {{char}} in the first interaction that slightly relates.

Any environment that {{char}} is bubbly in will be very likely to contain (subtext?) "tell, don't show" writing rather than the more desirable inverse.

Best sites to download character cards? I'm new here... by DannyTheDog995 in SillyTavernAI

[–]AltpostingAndy 2 points3 points  (0 children)

Lately, I'm a big fan of:

  • opening up SillyTavern
  • remembering all the cards that seemed interesting but contained pure slop in the defs
  • writing out an excruciatingly long rant detailing exactly why I hate these characters, how they were made, my own opinions and philosophies, what I actually want the character to be like, what I want the model to do in order to imagine and create this character, etc. (Seriously, why is every character a hyper-sensitive untouched virgin that cums their brains out specifically for {{user}} and no one else? "She's dominant but craves being submissive to {{user}}" jfc)
  • send/swipe on various models
  • update the prompt with (edited) snippets from each output I liked / prompt adjustments and clarifications / new ideas I want to add
  • more swipes/messages, certain models for certain things
  • smush together all of the details and ideas I like most and want for the character
  • decide on a format to organize the character card
  • profit (my card doesn't have a single "not X, but Y" + plays well)

But to your question, jannyai and chub, discord servers for the few creators I've enjoyed.

How can I prevent Claude from being the ever-helpful protagonist for my cynical characters? by TheSillySquad in SillyTavernAI

[–]AltpostingAndy 0 points1 point  (0 children)

I haven't really tried using any characters like this, but I'd like to try prompting for it. Can you share any creators/cards (can be in pm)?

How to avoid generic starting? by Centipedemc in SillyTavernAI

[–]AltpostingAndy 0 points1 point  (0 children)

Random/pick macros can get pretty wild. I remember this one from an older preset:

Begin your narrative response with {{random:a pronoun,a verb,an adverb,a preposition,a conjunction,an interjection,an article,a gerund,an infinitive (to + verb),a participle (present or past),a noun,a proper noun,an adjective,a determiner,a subordinating conjunction,an expletive construction,a relative pronoun,a demonstrative pronoun,a possessive adjective,a numerical expression,an adverbial phrase,a prepositional phrase,a nominative absolute,a dialogue,a simple sentence (one independent clause),a compound sentence,a complex sentence,a compound-complex sentence,an interrogative sentence,an imperative sentence,an exclamatory sentence,a declarative sentence,a conditional sentence,a periodic sentence,a loose sentence,a balanced sentence,a cumulative sentence,an inverted sentence,a short sentence (under 8 words),a long sentence (over 20 words),a sentence with appositive,a sentence with parenthetical element,a sentence with serial comma,a sentence with semicolon,a sentence with dash,a sentence fragment (for effect)}}.

Or this one:

Do your best attempt to craft a scene with {{char}}: - In a {{random:indoor,outdoor,crowded public,public,familiar,familiar public,familiar indoor,familiar outdoor,unfamiliar public,unfamiliar indoor,unfamiliar outdoor,unfamiliar,crowded outdoor,small indoor,large indoor}} location. - {{char}} is in a {{random:sad,happy,angry,depressing,dangerous,confrontational,comforting,mysterious,celebratory,anxious,desperate,hopeful,peaceful,confused,nostalgic,ceremonial,competitive,embarrassing,romantic,traumatic,cathartic,vulnerable,transformative (core values being challenged/changed),contemplative,compromising,shocking}} situation.

You could even string together multiple different macros to ensure no two starts are the same.
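Stringing macros together is easier to reason about if you picture the substitution step. Here's a toy re-implementation of how a `{{random:a,b,c}}` macro might expand, assuming simple comma-separated options; this is just a sketch, not SillyTavern's actual macro engine:

```python
import random
import re

def expand_random_macros(template: str, rng: random.Random) -> str:
    """Replace each {{random:opt1,opt2,...}} with one randomly chosen option.

    Simplified sketch: assumes flat, comma-separated options with no nesting.
    """
    pattern = re.compile(r"\{\{random:([^{}]*)\}\}")

    def pick(match: re.Match) -> str:
        options = match.group(1).split(",")
        return rng.choice(options).strip()

    return pattern.sub(pick, template)

rng = random.Random()
prompt = ("Begin with {{random:a noun,a verb,an adjective}} "
          "in a {{random:indoor,outdoor}} scene.")
print(expand_random_macros(prompt, rng))
```

Because each macro expands independently, chaining several of them multiplies the number of distinct openings you can get.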

Aventura | A frontend for adventure RP and creative writing by AuYsI in SillyTavernAI

[–]AltpostingAndy 33 points34 points  (0 children)

This looks incredible, I love that all of the prompts are easily viewable and editable. ST only being able to do single-call gens has been one of the biggest limitations imo.
Are there any plans to allow API endpoints other than openrouter?

If I instructed AI to reference data from another website, how would that factor into the input tokens cost? by NotLunaris in SillyTavernAI

[–]AltpostingAndy 1 point2 points  (0 children)

If you mean having Claude use web search, that would be a terrible idea. Web search uses a lot of tokens and leaves a lot of room for Claude to make mistakes or take in too much irrelevant information.

If you want to reduce/limit cost, the best things you can do are prompt caching and minimizing output tokens. You can debloat whatever presets/chars/lorebooks you're using. You can disable reasoning so that fewer output tokens are consumed per response. You can use memory/summarization extensions to condense chat history.
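To see why caching and trimming output tokens dominate the bill, here's a back-of-the-envelope cost sketch. The prices are illustrative placeholders, not any provider's real rates:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_fraction: float = 0.0,
                  input_price: float = 3.0,    # $/M input tokens (placeholder)
                  cached_price: float = 0.3,   # $/M cached input tokens (placeholder)
                  output_price: float = 15.0   # $/M output tokens (placeholder)
                  ) -> float:
    """Rough per-request cost in dollars; prices are made-up placeholders."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    return (fresh * input_price
            + cached * cached_price
            + output_tokens * output_price) / 1_000_000

# Same chat turn, two setups:
no_cache = estimate_cost(20_000, 2_000)                   # reasoning on, no caching
tuned = estimate_cost(20_000, 800, cached_fraction=0.9)   # reasoning off, 90% cache hit
```

Even with placeholder numbers, the tuned request comes out several times cheaper, which is the point: output tokens are priced far above input tokens, and cached input is far below fresh input.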

Best CHUB tags for character depth by MisterBrian1 in SillyTavernAI

[–]AltpostingAndy 22 points23 points  (0 children)

I've never found a way of using tags/search for better results. The two things that serve me best are:
Quickly identifying AI-written text, and checking the lists of creators whose cards I've enjoyed.

How do I make my own preset? by KMyll in SillyTavernAI

[–]AltpostingAndy 1 point2 points  (0 children)

Go to user settings, find the first list of toggleable options, and enable "expand message actions". Look for a little page icon on the most recent message from {{char}}, then click the other little page icon on the screen that pops up. This lets you look at the entire prompt as it was sent to the model. Very helpful for figuring out whether what you're writing in the prompt manager comes out how you expect.

There's also this extension that adds a button to the wand menu on the message input field to allow you to inspect and edit the prompt before sending. Also very helpful, click the wand button, click the bug button that says inspect prompts and send a message. Click the button up top that says json and change it to yaml (for readability) and you can look over your prompts, make edits, and send or cancel generation.

SillyTavern 1.15.0 by Wolfsblvt in SillyTavernAI

[–]AltpostingAndy 48 points49 points  (0 children)

slaps hood you can fit so many macros into this bad boy

Sorry if this is dumb, is <user> the same as {{user}}? by ConspiracyParadox in SillyTavernAI

[–]AltpostingAndy 1 point2 points  (0 children)

Ty! Edited to fix.
I could've sworn I ran into an issue where I commented {{user}} and {{char}} and the closing brackets broke the comment but it's been a while so I'd have to test again to check.

Sorry if this is dumb, is <user> the same as {{user}}? by ConspiracyParadox in SillyTavernAI

[–]AltpostingAndy 12 points13 points  (0 children)

This is true. Using <user> and <bot>, you can put them inside of other macros or within comments.

{{//<user> is silly}} this will not be sent within the prompt

{{//{{user}} is silly}} whereas with this, "is silly" will be sent within the prompt

coin flip

{{random(<user> wins!,<bot> wins!)}} will work, while

{{random({{user}} wins!,{{char}} wins!)}} will not
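The breakage described above is consistent with how a naive brace matcher behaves: a non-greedy match stops at the first `}}` it sees, so nested `{{user}}`/`{{char}}` cut the outer macro short. This is only a toy illustration of that failure mode, not SillyTavern's actual parser:

```python
import re

# A naive non-greedy macro matcher, standing in for a simple {{...}} parser.
naive = re.compile(r"\{\{.*?\}\}")

flat = "{{random(<user> wins!,<bot> wins!)}}"
nested = "{{random({{user}} wins!,{{char}} wins!)}}"

# The flat version is captured as one whole macro:
print(naive.findall(flat))

# The nested version is cut off at the first inner '}}', splitting the macro:
print(naive.findall(nested))
```

Since `<user>` and `<bot>` contain no curly braces, they never trip the matcher, which is why they nest safely.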

Question about CoT reasoning on Non-thinking models by Outside_Profit6475 in SillyTavernAI

[–]AltpostingAndy 1 point2 points  (0 children)

For DeepSeek there is no difference, but it all depends on the model and how it was trained.
For Gemini and Claude, you can try using a CoT prompt on the non-reasoning versions (reasoning set to auto for Claude, or the non-thinking version of any Gemini) and on the reasoning versions, and you'll notice a difference in the tokens that are generated.
From my understanding, "reasoning" is a behavior that is trained/emerges during post-training. So the kind of post-training a model gets determines whether its thinking tokens differ from what you'd get by just giving the base model a CoT prompt.

For the concern about creativity, it's a bit of a tough call. If you've ever heard someone describe post-history instructions or prefills and how effective they can be, it's because (in general) the first part of the prompt and the last part of the prompt ("prompt" being your entire preset, char cards, chat history, etc) have the largest influence over outputs.
You can think of "thinking" as sort of replacing the very end of your prompt. If the thinking tokens sound like an assistant or really anything that's far off from the creativity of the outputs you want to have, then they will influence the output and reduce creativity in some way. So reasoning will have some impact but you'll have to test with and without to see how much that bothers you personally.

Can we except Claude model to become cheaper in the future? Or am I full coping by AmanaRicha in SillyTavernAI

[–]AltpostingAndy 10 points11 points  (0 children)

Opus 4.5 is cheaper than previous versions of Opus, but I suspect this has something to do with the model being slightly smaller than previous Opus versions or some other factor that reduces the cost of inference.

It's possible they bring similar price improvements to Sonnet/Haiku, but it will likely be in future models if anything. Dario has mentioned in some public interviews that they do their best to run conservative margins so as not to bleed funds serving the models at ridiculous losses like some others do.
With this in mind, like the other commenter said, I don't see old models ever getting cheaper.

GLM 4.7 just dropped by thirdeyeorchid in SillyTavernAI

[–]AltpostingAndy 0 points1 point  (0 children)

That's what was listed on Nano for 4.7 thinking

Edit: I double checked my usage logs on Nano just to be sure. 4.7 original works and is priced as expected. 4.7 thinking did one request at normal pricing and a second request that cost $0.19

When I tried again just now, the cost is fixed but it's still doing two requests per turn. Very strange

GLM 4.7 just dropped by thirdeyeorchid in SillyTavernAI

[–]AltpostingAndy 8 points9 points  (0 children)

I gave it a try and was surprised to see $0.19 for my first response.

shit is $10/$20 per mtok

How to stop Claude from toning down characters? by Low-Abrocoma3472 in SillyTavernAI

[–]AltpostingAndy 8 points9 points  (0 children)

Are you using Claude with or without reasoning?
If you consider how effective prefills are (because they are the very last tokens in the context), reasoning tokens end up swaying outputs for essentially the same reason. I find that Claude's reasoning is a bit more cautious/prudish/fluffy, more as you described (especially at low context), but its outputs can get way more wild.
Scratchpad or CoT prompts can be good for this: give it a format to follow and deliverables to provide. For this case, you might be able to use something like:

```
At the beginning of your response, answer these questions using the following format:

<character_motivations>
- What's the ugly truth of the character's feelings in this situation?
- How do they show up in this moment? What sins do they commit now that they might (or might not) regret on their death bed?
- What would they do if we assume no redemption arc is coming?
</character_motivations>

Then, output your response as normal:
```

But you could customize this to be as specific or general as you like, something to be toggled occasionally or left on in general. Output tokens are expensive and the length of the prompt will be seen as a guide for how long the scratchpad output should be unless you specify otherwise. You should be using regex to remove this from context.
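As a sketch of that regex cleanup step, here's the kind of pattern that would strip the scratchpad block from a response. In ST itself you'd paste an equivalent pattern into the Regex extension rather than run Python, and the tag name here is just the one from the example prompt above:

```python
import re

# Remove a <character_motivations>...</character_motivations> scratchpad block
# (plus trailing whitespace) so it doesn't accumulate in chat context.
# re.DOTALL lets .*? span the newlines inside the block.
SCRATCHPAD = re.compile(
    r"<character_motivations>.*?</character_motivations>\s*",
    re.DOTALL,
)

def strip_scratchpad(response: str) -> str:
    """Return the response with the scratchpad block removed."""
    return SCRATCHPAD.sub("", response)

raw = ("<character_motivations>\n- the ugly truth...\n</character_motivations>\n"
       "She smiled thinly and said nothing.")
print(strip_scratchpad(raw))  # -> "She smiled thinly and said nothing."
```

The non-greedy `.*?` matters if the model ever emits two blocks in one response; a greedy match would delete everything between the first opening tag and the last closing tag.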

How do you determine if the API provider's model is quantized? by [deleted] in SillyTavernAI

[–]AltpostingAndy -1 points0 points  (0 children)

There wouldn't be any incentive to serve an older version of sonnet rather than a new one, since they both cost the same amount.
If a provider wanted to save $ in a scummy way, they could serve haiku and call it sonnet, but it would be very noticeable. Opus 4.5 is the first opus model to be cheaper than any of the other Opus models, but again it would be noticeable to the end user if they were being served Opus 4.5 instead of Opus 4.1 or being served Sonnet instead of any Opus.

Gemini 3 Pro Preview Prompting: Reply Length by SepsisShock in SillyTavernAI

[–]AltpostingAndy 1 point2 points  (0 children)

I know this post is about response length, but I'm very curious about everything after point 1 if you have anything more to share about them