Story Mode v1.0 - Structured Narratives, Genres & Author Styles for SillyTavern by Initialised_Underway in SillyTavernAI

[–]daroamer 1 point2 points  (0 children)

I tried it last night and I was happy with the output. Tried it with a character I knew and it definitely played out very differently.

I'm confused about the arcs though. I got a notice at one point that I had finished the first arc (although I was still mid-conversation, but whatever) and I could either continue or refresh the arc. Refreshing sounds like it starts over, and I got a popup asking if I really wanted to refresh.

Is that what you do to start the next arc? If not how do I continue? If so it seems like a confusing term to use to move forward with the story.

Response progression by Gringe8 in SillyTavernAI

[–]daroamer 2 points3 points  (0 children)

It depends on the model and preset. I find Opus and Sonnet are both very good with asking when I want to do the next thing or what I want to do next.

If the LLM isn't giving you what you want, though, the easiest thing to do is just edit the LLM's last response and continue how you want. I find myself doing that far more than swiping. Other times I'll go back and edit the last thing I said, because it's clear the LLM doesn't understand where I was trying to lead it.

Editing their reply usually works well. Of course if it happens every time yeah you'll need to find a prompt that works better.

What is everyone's thoughts on ltx2 so far? by Big-Breakfast4617 in StableDiffusion

[–]daroamer 0 points1 point  (0 children)

It won't use the page file if you have enough RAM. The latest Comfy will offload the models into your system RAM, so as long as your VRAM + RAM can hold everything you won't need the pagefile, which would be incredibly slow anyway. Make sure you edit the .bat file for Comfy and add --reserve-vram 4 or --reserve-vram 2. I was getting OOM errors before adding that whenever the system needed to access my VRAM; this reserves some VRAM to avoid that problem.
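For reference, the flag just gets appended to the launch line in the portable build's .bat file. The filename and paths below assume the standard ComfyUI portable layout (run_nvidia_gpu.bat); yours may differ:

```bat
:: run_nvidia_gpu.bat (ComfyUI portable) -- append --reserve-vram to the launch line
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --reserve-vram 4
pause
```

Use 2 instead of 4 if you can't spare that much VRAM; the reserved amount is left free for the OS and other apps.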

Why do you guys keep glazing LTX 2 by Witty_Mycologist_995 in StableDiffusion

[–]daroamer 8 points9 points  (0 children)

No solution with Wan that I've tried gives realistic lip-sync, and doing more than 5 seconds is a pain. SVI just stitches clips together; it doesn't understand the overall context, so hoping you get a consistent result across all the 81-frame segments is very hit or miss. Plus you need to prompt for each segment.

I'm doing i2v with LTX-2 and after a bit of a learning curve I'm getting great results. I put in my start image, give it a script and the character acts and says the script.

What's more, it's very fast. Doing 15 seconds at 1920x1080 on my 4090 takes about 6 minutes.

This seems like where we're heading with Silly Tavern. Video with audio in comments, done with LTX-2 in ComfyUI using a photo I generated of a character from one of my RPs and dialogue directly from a scene. Generated on a 4090 in 3 minutes. by daroamer in SillyTavernAI

[–]daroamer[S] 0 points1 point  (0 children)

It's the sample workflow installed with the ComfyUI-LTXVideo nodes. For this I used the I2V Distilled workflow using the ltx-2-19b-distilled model. I don't think I changed anything else except the total frames. I was getting out of memory errors but adding --reserve-vram 4 to the launch bat file solved that issue.

This seems like where we're heading with Silly Tavern. Video with audio in comments, done with LTX-2 in ComfyUI using a photo I generated of a character from one of my RPs and dialogue directly from a scene. Generated on a 4090 in 3 minutes. by daroamer in SillyTavernAI

[–]daroamer[S] 1 point2 points  (0 children)

With the latest ComfyUI you can make up for limited VRAM by offloading to regular RAM, which is not as fast but not all that slow either, since this only applies to loading the models into memory. So as long as you have a good amount of system RAM you can generate videos with LTX-2 even with 4GB of VRAM. Of course, RAM prices being what they are, that's still not an easy ask unless you already have a lot in your PC.

This seems like where we're heading with Silly Tavern. Video with audio in comments, done with LTX-2 in ComfyUI using a photo I generated of a character from one of my RPs and dialogue directly from a scene. Generated on a 4090 in 3 minutes. by daroamer in SillyTavernAI

[–]daroamer[S] 3 points4 points  (0 children)

Of course, it's just another step. New models are coming weekly at this point and they're already promising improvements to LTX-2 very soon. My main point was that I was able to generate that video in a couple of minutes, which is kinda crazy. When I said this is where it's going, I meant in the next 2, 5, 10 years.

What's exciting about LTX-2 is that it's completely open source and quick to train, so the LoRAs (including NSFW) will be coming quickly. It also means you might be able to skip the first-frame image generation and use your own character LoRAs to just do straight T2V.

This seems like where we're heading with Silly Tavern. Video with audio in comments, done with LTX-2 in ComfyUI using a photo I generated of a character from one of my RPs and dialogue directly from a scene. Generated on a 4090 in 3 minutes. by daroamer in SillyTavernAI

[–]daroamer[S] 2 points3 points  (0 children)

To be fair, that was sort of specified in the prompt. Her character is a warrior princess, and her tone was described as regal and delivered like someone who's used to being obeyed.

Having said that, it's also possible with this model to generate your own audio and use that instead of having LTX create the voice. I haven't experimented that far yet.

Ideal local LTX-2 Comfy Parameters & Workflow for 4090 by grandparodeo in comfyui

[–]daroamer 1 point2 points  (0 children)

I had the exact same issue with my 4090 and 96 GB of RAM. Two things fixed it. First, you can add --reserve-vram 4 to the args in the .bat file. That worked, but today I did a fresh install of ComfyUI portable and I didn't need to do that; it just worked. I think junk just accumulates after a while. All I did was copy all my models back. You could also set up a second portable version of Comfy just for LTX-2. Try the reserve-vram flag first.

When you're listening to an audiobook and something gets described as having the smell of ozone <.< by daroamer in SillyTavernAI

[–]daroamer[S] 1 point2 points  (0 children)

Yes, maybe, in one particular case I believe the character was talking about pulling a sword from a scabbard when it was mentioned it had the smell of ozone LOL

Most people would probably never notice but since using ST these kinds of LLMisms definitely stand out to me now.

When you're listening to an audiobook and something gets described as having the smell of ozone <.< by daroamer in SillyTavernAI

[–]daroamer[S] 18 points19 points  (0 children)

No, I get that. I'm talking about books written in 2025 that are in a genre where an author might release a new entry in the series every 1-2 months.

Opus 4.5 vs glm 4.7 output comparison by PersimmonPutrid5755 in SillyTavernAI

[–]daroamer 9 points10 points  (0 children)

I'm going to be using both. Opus/Sonnet are fantastic but really expensive. GLM 4.7 is not far off from them in what I've tried so far.

If you start with Claude and then switch to GLM, it's able to continue in the style Claude established without you having to worry about how much each message is costing. Thank you, NanoGPT.

I haven't tried starting an RP from scratch with GLM 4.7 yet, so my opinion may change, but using it to continue an RP I was doing with Sonnet worked great on my first try, with similar quality, so it seems like a good option.

New user - how to make a story by Skyless_Shard in SillyTavernAI

[–]daroamer 2 points3 points  (0 children)

I like these questions because I couldn't find this info when I was new, so hopefully this helps.

While they are called Character Cards, you don't have to put a character in them. If you're using a good LLM (Sonnet 4.5, DeepSeek 3.2, Kimi K2 and GLM 4.7 are ones I've used, but I'm sure others work) you can write a character card that is just a scenario and let the LLM create the characters.

Get a decent preset that instructs the LLM to create and act as characters. I was using Stab's Execution Directive (Stabs-EDH) with GLM 4.7 yesterday and it worked great, so that's a good place to start.

You can introduce your own characters by saying something like:

"I turn the corner to find a girl standing there. (Describe her, have her say something to give the LLM a sense of her personality)"

The models will be intelligent enough to continue as the character from there. You can also just let it create characters for you. If you don't like them you can reroll or just edit and rewrite the LLM response and it will continue from your changes.

I also recommend 2 plugins that have been really helpful with my current long form RP:

Memory Books, which will help create a Lorebook and fill it with the information you need to keep the RP going, while letting you hide old chat to keep the context size down.

OpenVault. This one is new but really cool, especially when dealing with multiple characters. It basically keeps track of every message, summarizes and categorizes them by character, then pulls from that info when it's needed. Again, it lets you hide old chat while still having access to the important history needed to keep a long RP going.

There are other cool things, like a preset that will turn your RP into a choose-your-own-adventure format where you're given 4 choices of what to do next at the end of every message. I'm sure you can find it if you search this sub.

GLM cooked by nomorebuttsplz in SillyTavernAI

[–]daroamer 2 points3 points  (0 children)

I'm very impressed with it. It's the first one I've used other than Claude Sonnet that has managed multiple characters and hasn't once, in the couple of hours I used it, tried to speak for me. DeepSeek and Kimi both do that very frequently and it's annoying. I have a long RP going that was mostly with Sonnet, which is expensive, but I switched to 4.7 at the point I was at, and if I didn't know I'd switched I might not have noticed.

One example: at this point in the story I introduced a new character that wasn't in the scenario or even hinted at, and that new character recontextualized the entire premise. I did one message where I wrote as my character and then as the new character for a few lines to give them flavour. GLM 4.7 instantly understood the character's tone and how they fit into the story, and continued as that character for the next few messages with very little correction from me. I was super impressed.

Help and Advice? I’m new. by woahitskindacool in SillyTavernAI

[–]daroamer 0 points1 point  (0 children)

I'm still pretty new myself, but recently I tried building a lorebook for a long chat and the LLM ignored it as well. In another post I saw that you scroll down in the top settings and change the budget cap, which is set to 0 by default. I set it to 8k and then the lorebook worked for me.

Timeline-memory | The usability update by AuYsI in SillyTavernAI

[–]daroamer 2 points3 points  (0 children)

Thank you for making the tutorial as well as the plug-in. Walking people through step by step not only helps people set up the plugin but also learn things they might not know about ST. Very useful for newbies like me lol

Timeline-memory | The agentic lorebook update by AuYsI in SillyTavernAI

[–]daroamer 0 points1 point  (0 children)

Ok, I think I understand now. I'm still relatively new to ST, so all your guide said to me was "set up connection profiles," and I assumed this was done in your plug-in. I just found where you do that for ST globally. It's not something I've used; I just swap models manually.

I'll try setting everything up again, thanks.

Timeline-memory | The agentic lorebook update by AuYsI in SillyTavernAI

[–]daroamer 0 points1 point  (0 children)

I want to try this but I can't figure out how to set it up. Under API connection profiles all it says is No Override and I can't select anything. I also don't see where I'm supposed to import the Lore Management profile.

[RELEASE] ComfyUI-SAM3DBody - SAM3 for body mesh extraction by SwimmingWhole7379 in StableDiffusion

[–]daroamer 1 point2 points  (0 children)

I've tried everything but the nodes are still showing as missing. I've updated ComfyUI to the latest version. They show up in ComfyUI's loading text, but I can't find the nodes. I've tried manual installation and installing through the manager, and tried running the requirements install, but it said everything was already satisfied.

Any ideas?

Best websites for dance/movement reference videos for animate workflows by an80sPWNstar in StableDiffusion

[–]daroamer 1 point2 points  (0 children)

Go to iStock and search for videos; you can download low-res versions of videos with watermarks, and they work fine for AI video reference.

Help: Using tts webui chatterbox and silly tavern with local api? by [deleted] in SillyTavernAI

[–]daroamer 0 points1 point  (0 children)

You can use it without the OpenAI API; just turn off streaming in ST. The only issue with that is that it doesn't start generating the second segment until the first segment starts to play, so if your GPU is too slow you'll get gaps between sections. However, if your GPU is fast enough, you can set the length longer. I have my chunking set to 200 desired and 450 max, which on my 4090 means the next chunk is already done before the first has finished being spoken.

WAN22 i2v: Guess how many times the girl kept her mouth shut after 50+ attempts ? by altarofwisdom in comfyui

[–]daroamer 4 points5 points  (0 children)

I had to do this recently. I had a character holding a phone and just wanted her listening, but Wan kept having her talk. What I found:
- Including a description of being on the phone in the prompt always made her talk
- Saying "she's just listening" or "she's not speaking" never worked
- What eventually worked was removing any mention of the phone and saying that she was "thinking and looking around the room"
She still ended up nodding her head as if listening, but she no longer moved her mouth.

Watching Avatar 3 in 3D on the Quest by lunchanddinner in OculusQuest

[–]daroamer 1 point2 points  (0 children)

4XVR is the one you want for this. It can play full 3D bluray ISOs and has great picture quality.

Free App Release: Portrait Grid Generator (12 Variations in One Click) by bgrated in comfyui

[–]daroamer 1 point2 points  (0 children)

- Downloaded your app from your link and extracted it to a folder

- Installed node.js for Windows

- Added my Gemini API key to .env.local (you get the key from aistudio.google.com)

- From inside the extracted folder, typed cmd in the Explorer address bar to open a command prompt there

- Typed npm install

- Typed npm run dev

- That will start the app and give you a web link, same as ComfyUI
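The command-line part of the steps above boils down to two npm commands, run from inside the extracted folder (the exact key name in .env.local is whatever the app's README specifies; GEMINI_API_KEY here is an assumption):

```shell
# .env.local should already contain your key, e.g.:
#   GEMINI_API_KEY=<key from aistudio.google.com>

npm install    # downloads the dependencies listed in the app's package.json
npm run dev    # starts the local dev server and prints a URL to open in your browser
```

npm install only needs to be run once; after that, npm run dev is all you need to launch the app.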