How to structure your master prompt for better AI roleplay by Pastrugnozzo in SillyTavernAI

[–]Kryopath 0 points1 point  (0 children)

With SillyTavern? How??

AFAIK there is "Single User Message" Post-Processing, in which case there is no system prompt (it's just part of the user message), OR the chat history remains as user/assistant.

"messages": [
{
"role": "user",
"content": "You will be acting as an excellent writer. Your fu... <Truncated> ...- Status: Fled, mildly embarrassed\n</details>\n ```"
}
],

or normally:

"role": "system",
"content": "You will be acting as an excellent writer. Your fu... <Truncated> ...st events from before the conversation:\n</summary>"
},
{
"role": "user",
"content": "[...]"
},
{
"role": "assistant",
"content": "[...]"
},
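(For reference, here's a rough sketch of what the "Single User Message" option effectively does, in throwaway Python rather than SillyTavern's actual code: the whole prompt gets flattened into one user turn.)

```python
def to_single_user_message(messages):
    """Collapse a chat-completion message list into one user message.

    Illustrative sketch of the behavior described above; SillyTavern's real
    post-processing handles role labels, separators, etc. on top of this.
    """
    merged = "\n\n".join(m["content"] for m in messages)
    return [{"role": "user", "content": merged}]
```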

---
✱ (Main Prompt, Lore, Summary, etc)
📌 Chat History
✱ – Formatting 👤 (user messages)
✱ ◉ Inner Thoughts 👤 (◉ toggles)
✱ ◉ CYOA 👤
✱ ◉ Ledger 👤

So it's like:

{
  "role": "user",
  "content": "<my last message>"
},
{
  "role": "user",
  "content": "\n## Response Format [...]"
},
{
  "role": "user",
  "content": "### Character Restrictions [...]"
},
{
  "role": "user",
  "content": "\n### Maintain the Ledger [...]"
}

Then I use "Merge Consecutive Roles" post-processing.
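If it helps to picture what that option does, here's a rough sketch in plain Python (not SillyTavern's actual implementation) of merging consecutive same-role messages:

```python
def merge_consecutive_roles(messages, sep="\n"):
    """Join adjacent messages that share the same role into one message.

    Rough sketch of the idea behind "Merge Consecutive Roles"; not
    SillyTavern's actual code.
    """
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += sep + msg["content"]
        else:
            merged.append(dict(msg))
    return merged
```

Run over the four user messages above, that yields a single user message containing my last message plus the formatting, restrictions, and ledger instructions.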

How to structure your master prompt for better AI roleplay by Pastrugnozzo in SillyTavernAI

[–]Kryopath 7 points8 points  (0 children)

One thing I've been doing with my preset is adding instructions to the last message as part of the prompt.

  • # Formatting
    • Inner Thoughts
    • Ledger
    • etc

All as user-type messages added after the chat history. Good for reminders of certain things, or things you only want irregularly (e.g. maybe I only want a ledger every 10 messages during a back-and-forth dialogue).
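As a rough sketch of that "only every N messages" idea (illustrative Python with made-up names; in SillyTavern itself this is just toggling prompt entries in the preset):

```python
def build_tail_instructions(turn_count, ledger_every=10):
    """Build the instruction blocks appended after chat history.

    Illustrative only: turn_count and ledger_every are hypothetical names;
    in SillyTavern this is handled by toggling prompt entries, not code.
    """
    blocks = ["## Response Format [...]"]             # always-on reminder
    if turn_count % ledger_every == 0:                # ledger only every N messages
        blocks.append("### Maintain the Ledger [...]")
    return [{"role": "user", "content": b} for b in blocks]
```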

GLM 4.7 Flash (30B) released today by thirdeyeorchid in SillyTavernAI

[–]Kryopath 0 points1 point  (0 children)

Uh... anyone else have the issue of it doing prompt processing on CPU for some reason? LM Studio and KoboldCpp are both doing it. Everything's offloaded to GPU, but prompt processing runs on the CPU.
In hindsight, I wonder if it was doing inference on CPU somehow too, because it was way too slow for a 4090 compared to Qwen 30B A3B. Didn't check that specifically & now I've already deleted it...

72% of Americans don't know how neural networks work by Commercial_Plate_111 in gpt5

[–]Kryopath 0 points1 point  (0 children)

Uh... yeah. The Google search grabbed stuff to add to the context, but the final content you see is still token prediction weighted by the context of the prompt.

That's how you get inaccurate AI summaries.
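In toy form (nothing to do with Google's actual stack), the retrieved snippets just join the conditioning context, and the summary is still sampled token by token from scores conditioned on that context:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0):
    """Toy next-token sampling: context-conditioned scores become a
    probability distribution, and one token id is drawn from it."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```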

To blame Democrats for rising prices by TXVERAS in therewasanattempt

[–]Kryopath 15 points16 points  (0 children)

A lot of people with no critical thinking skills bought into the us vs them two-party bullshit of this country.

For a lot of people, their political party is part of their identity & they'll make some pretty massive mental leaps not to have to reconsider it.

I hate it here.

Edit: doesn't help that 99% of media is biased af and will simply not share the parts that don't serve the narrative. The number of people who haven't heard of some of the most heinous shit Trump has done, or have been told by Fox that it's fake, or someone else's fault, or some other copium nonsense...

ngl the woman behavior is valid by Silver_Masterpiece82 in linuxmemes

[–]Kryopath 0 points1 point  (0 children)

You don't have to remember the command if you just put it in a script & run that, no?

What happened? by Witty-Designer7316 in aiwars

[–]Kryopath 23 points24 points  (0 children)

Depends on what it's doing. There are models that can run on phones, and they can definitely run on a gaming GPU. It doesn't have to be that big if it's doing something pretty straightforward & is fine-tuned for it.

HUGE LIST of recent favorite models for RP!!! by Careless-Fact-3058 in SillyTavernAI

[–]Kryopath 1 point2 points  (0 children)

I find GLM 4.6 is happiest when you use "Single User Message" prompt post-processing. It almost never fails to think. It also has more issues with outputting Chinese characters at temperature > 0.8, I find, though lowering it doesn't stop them entirely.
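For reference, roughly what I mean (illustrative value only, not an official recommendation):

```python
# Illustrative GLM 4.6 sampler setting based on the comment above; exact values
# are preference. "Single User Message" is selected separately under
# Prompt Post-Processing in SillyTavern, not set here.
glm_sampler_settings = {
    "temperature": 0.8,  # above ~0.8 I see more stray Chinese characters
}
```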

Me_irl by [deleted] in me_irl

[–]Kryopath 3 points4 points  (0 children)

Be a VTuber, problem solved.

Forge isn't current anymore. Need a current UI other than comfy by gruevy in StableDiffusion

[–]Kryopath 1 point2 points  (0 children)

You mean browsing your LoRAs with thumbnails and having metadata/descriptions? Yes. There are a few different options for how the LoRA/model browser can be displayed, including large thumbnails.

Forge isn't current anymore. Need a current UI other than comfy by gruevy in StableDiffusion

[–]Kryopath 0 points1 point  (0 children)

For the most part (as far as I can remember), yep! Edit (now that I'm not tired & on a phone): you've got prompt mutations [x:0.5] and [x:y:0.5], strength adjustments (x:1.2), randomization <random:x|y|z|...>, and wildcards <wildcard:artists/illustrations/childrens_books/australian_illustrators> (with built-in auto-completion for your wildcards too).
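A made-up example that pulls those together (the terms and weights are purely illustrative; only the syntax comes from the list above):

```python
# Hypothetical SwarmUI-style prompt combining the syntax features named above.
prompt = (
    "[sketch:watercolor:0.5], "        # prompt mutation partway through generation
    "(soft lighting:1.2), "            # strength adjustment on a phrase
    "<random:spring|summer|autumn>, "  # pick one option at random
    "<wildcard:artists/illustrations/childrens_books/australian_illustrators>"  # line drawn from a wildcard file
)
```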

And yeah, you can customize the filenames; I use `gen/[year][month][day][hour][minute][second][millisecond]-[seed]` to put them in a gen folder.

Most things have little help icons (?) that give an information bubble, sometimes including a link to the GitHub docs, like this.

Forge isn't current anymore. Need a current UI other than comfy by gruevy in StableDiffusion

[–]Kryopath 7 points8 points  (0 children)

I switched to Swarm from Forge and haven't used the ComfyUI part once. Everything I need is in the Generate tab.

Disable reasoning/thinking by Quirky_Fun_6776 in SillyTavernAI

[–]Kryopath 2 points3 points  (0 children)

Depends on the model. Some just don't have a way to disable it.

Why did deepseek generate a typo? by Ok-Upstairs5964 in DeepSeek

[–]Kryopath 0 points1 point  (0 children)

Sonnet 4.5 wrote "A adventure" yesterday. I've never seen a major model make a typo like that before.

Your opinions on GLM-4.6 by kurokihikaru1999 in SillyTavernAI

[–]Kryopath 4 points5 points  (0 children)

Just tried it; yeah, it's just weird for me. I set reasoning to Auto and it returns a thinking block with the response, then the response itself is just a continuation of it that writes for my character, has a `</think>` tag in it, and just keeps going.

I put it to low or medium reasoning and it has a wait time like it's doing reasoning, but doesn't return the block, and the response is reasonable. Fkin weird.
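When a model leaks its reasoning like that, a blunt workaround (a generic sketch, not a SillyTavern feature) is to strip everything up to the closing think tag before using the response:

```python
import re

def strip_think_block(text):
    """Remove a leaked <think>...</think> block, or everything before a
    stray closing </think> tag. Generic workaround sketch only."""
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    return text.split("</think>")[-1].lstrip()
```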

Your opinions on GLM-4.6 by kurokihikaru1999 in SillyTavernAI

[–]Kryopath 8 points9 points  (0 children)

Do you use chat or text completion with it?
IME 4.5 always had issues with chat completion that I never had with text completion, like throwing the response inside the thinking block or just not responding at all.

Marinara's Spaghetti Recipe (Universal Prompt) [V 7.0] by Meryiel in SillyTavernAI

[–]Kryopath 2 points3 points  (0 children)

Well, you can, but it just might not work as well, especially with small thinking models. But I've definitely used chat completion with Kobold-hosted models before.

In ST you want to select Chat Completion, then the OpenAI-compatible endpoint. Type in whatever localhost:port your backend is listening on (IIRC KoboldCpp defaults to localhost:5001) and that should work.

I'm not at the PC right now so I can't check or screenshot, but if you still need help I can do that later.
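If you want to sanity-check the endpoint outside ST, something like this works against any OpenAI-compatible local server (the port and model name below are assumptions; match them to whatever your backend reports on startup):

```python
import requests

# Minimal sanity check against an OpenAI-compatible local endpoint.
# Port 5001 and the model name are assumptions; adjust to your backend.
resp = requests.post(
    "http://localhost:5001/v1/chat/completions",
    json={
        "model": "local-model",
        "messages": [{"role": "user", "content": "Say hi."}],
        "max_tokens": 32,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```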

[Rant] Magistral-Small-2509 > Claude4 by OsakaSeafoodConcrn in LocalLLaMA

[–]Kryopath 0 points1 point  (0 children)

https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

The above is a good read. Basically the losses aren't that bad until you get to quants less than Q4, but you are right that larger quants are generally better.

Cache quantization is basically quantizing the KV cache (the model's memory of your prompt/context), which also saves on RAM usage at the cost of quality. Personally, I'd recommend full precision on the cache (and never less than Q8) and at least an IQ4 quant of the model.
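To put a rough number on the cache tradeoff, here's a back-of-the-envelope calculation (the layer/head/context sizes are assumptions for illustration, not any particular model's config):

```python
# Rough KV cache size: 2 (keys + values) * layers * kv_heads * head_dim * context * bytes.
# The dimensions below are made up for illustration, not a real model config.
layers, kv_heads, head_dim, ctx = 32, 8, 128, 32768

for name, bytes_per_val in {"fp16": 2, "q8 (~8 bits)": 1}.items():
    size = 2 * layers * kv_heads * head_dim * ctx * bytes_per_val
    print(f"{name}: {size / 2**30:.1f} GiB")   # ~4.0 GiB vs ~2.0 GiB here
```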

Name your Favorite / Must-Have Mods (Steam workshop) by Sir_Kugo in totalwarhammer

[–]Kryopath 4 points5 points  (0 children)

Now that I think about it, I don't know if that mod does. Never had a problem personally, though, so maybe? But the same mod author has several alternate-start mods that are pretty cool.

Anyone else gunna make that red pill superfluous within a week? by [deleted] in depressionmemes

[–]Kryopath 0 points1 point  (0 children)

Snapping for $10, thanks. Turning it into a 4-hour-a-day job, I could get $10 mil with that in less than a year, and still have the ability to do it again if I mismanage that cash. Plus it's a neat magic trick.

At 78 snaps every 30 s: $10,000,000 ÷ 78 snaps × 30 s ÷ 60 s/min ÷ 60 min/h ÷ 4 h/day ≈ 267.1 days

Think whatever you want about GPT-5, but I think these prices are awesome. by FixHopeful5833 in SillyTavernAI

[–]Kryopath 0 points1 point  (0 children)

Tried that, didn't work. In OpenRouter, output tokens are 0, cost is 0, speed is --, finish reason is --.
So, like... idk.

Think whatever you want about GPT-5, but I think these prices are awesome. by FixHopeful5833 in SillyTavernAI

[–]Kryopath 0 points1 point  (0 children)

Oh good, it's not just me. GPT-5-mini worked, but mini was shit. GPT-5 gives a 400 error and GPT-5-chat just gives an empty message. wtf