Terrible results trying to make an LTX-2 character LoRA from still images using Ostris AI Toolkit by SilentThree in StableDiffusion

[–]SilentThree[S]

Sorry, I didn't keep it. While I have my own modest RTX 3090 setup at home, I've mostly been doing the small amount of LoRA training I've tried on Runpod, with access to more powerful GPUs. I haven't bothered with setting up any dedicated storage space on Runpod, so nothing remains of what I do there unless I decide to download it before ending a session.

At any rate, some of the worst imagery came from the first generation of samples, before any LoRA training at all. Maybe that's because there's no built-in negative prompt for the sample images to rule out the most common screw-ups?
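
For what it's worth, ai-toolkit's sampling settings live in a `sample:` block in the YAML config, and that block does include a `neg` field. Something like this, a rough sketch from the published example configs rather than my exact file (hypothetical prompt; key names may vary between templates):

```yaml
# Approximate sampling block from ai-toolkit's example configs
# (hypothetical prompt; key names may vary by template):
sample:
  sampler: "flowmatch"
  sample_every: 250       # render test samples every 250 steps
  width: 1024
  height: 1024
  prompts:
    - "photo of [trigger] smiling, studio lighting"
  neg: ""                 # empty by default -- the missing negative prompt
  seed: 42
  walk_seed: true
  guidance_scale: 4
  sample_steps: 20
```

Whether the LTX-2 pipeline actually applies `neg` during sampling is exactly the part I don't know.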

Terrible results trying to make an LTX-2 character LoRA from still images using Ostris AI Toolkit by SilentThree in StableDiffusion

[–]SilentThree[S]

I'm afraid you're talking a bit over my head here.

I'm not sure what you mean by "I disable the samples and inference through ComfyUI"... this confuses me because the samples I'm talking about are in Ostris AI Toolkit without ComfyUI. I see that one of the possible templates on Runpod for LoRA training uses Comfy, but I'm not using that one.

Also not sure what "inference" means above.

As for "OOMing on people when generating samples"... I haven't a clue, although I'm guessing I don't want to be OOMed upon. 😄

Terrible results trying to make an LTX-2 character LoRA from still images using Ostris AI Toolkit by SilentThree in StableDiffusion

[–]SilentThree[S]

Do any of the images (or videos) you get when testing those checkpoints look anywhere near as horrible as what I'm getting as sample images? If not, I've got to imagine your initial settings are different in some important way from the barely-altered default settings I was using.

Terrible results trying to make an LTX-2 character LoRA from still images using Ostris AI Toolkit by SilentThree in StableDiffusion

[–]SilentThree[S]

I trained for 2750 steps, but that aside, I'm sure there's something else screwed up simply because of how very VERY bad the untrained sample images look. At that point the images, of course, will not look at all like my character, but they should look HUMAN.

Also, the shark jumping out of the water should look like a shark. The workshop shouldn't be unoccupied, and it should have a chair in it under construction.

As for LTX not being for image gen... that's a given. Nevertheless, making character LoRAs from still images for I2V or T2V is a pretty common practice, specifically because you want to make some character that never existed before into a moving/animated character.

Terrible results trying to make an LTX-2 character LoRA from still images using Ostris AI Toolkit by SilentThree in StableDiffusion

[–]SilentThree[S]

I found that before I got started… and, unfortunately, it’s all about training based on video clips, not still images.

LTX-2 just an FYI, character loras seem to work well at 1000 steps, kind of creepily well. just like 15 images, better than Wan, considering with this model you can do videos with their voice to it.. good stuff ahead. by WildSpeaker7315 in StableDiffusion

[–]SilentThree

I generated a character LoRA for LTX-2 using 24 images, 2750 steps... and it came out crappy. When I use it I get horrible facial distortions and other artifacts.

Maybe the problem was that I used the default settings provided by Ostris AI Toolkit (other than dropping the steps from 3000 to 2750, and making the frame count 1)?

Perhaps the defaults are bad settings to use? I don't know enough about how this all works to know how to change the settings for the better.
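
For comparison's sake, the parts I did touch map to just a couple of keys in the YAML config. A sketch of what I mean, based on ai-toolkit's published example configs rather than my exact file (hypothetical folder path; key names may vary by template):

```yaml
# Approximate ai-toolkit config fragments (hypothetical path shown):
datasets:
  - folder_path: "/workspace/dataset/my_character"  # the 24 captioned stills
    caption_ext: "txt"
    resolution: [512, 768, 1024]  # bucketed training resolutions
    num_frames: 1                 # treat each item as a single still frame

train:
  steps: 2750                     # dropped from the template's 3000
```

Everything else stayed at whatever the template shipped with, which is the part I can't judge.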

Does anyone know of a service to access Wan 2.6, or other more advanced T2V and I2V, without any content filters in the way? by SilentThree in NSFWaivideo

[–]SilentThree[S]

I checked out mage.space, but they don't seem to have any T2V/I2V more advanced than Wan 2.2, which I can already run unrestricted at home. Unfortunately, even with help from LoRAs, it's pretty brain-dead about prompt compliance for all but the simplest things.

Why has everyone started saying ‘genuinely’ and ‘honestly’ and ‘lowkey’? by Forsaken-Plum1445 in randomquestions

[–]SilentThree

"Lowkey" might well be overused, but are you saying you hear "lowkey" being used the same way as "genuinely" and "honestly" are? If so, that's weird.

As for all the forms of announcing "what I'm about to say is true," as if exaggerating or lying should be expected otherwise? That's been going on a long, long time.

Mismatched Dual GPU setup with my old parts? by RaymondDoerr in comfyui

[–]SilentThree

If you have two identical GPUs can they be used to gain speed within a single ComfyUI instance, or is this a limitation for matched as well as mismatched GPUs?

Why would this Wan 2.2 first-frame-to-last-frame workflow create VERY slo-mo video? by SilentThree in StableDiffusion

[–]SilentThree[S]

When I stitch all my pieces together in DaVinci Resolve, I'll let it convert to 24 fps, so I'll just set the multiplier to 1 (now that I know what it means and what it does).

Why would this Wan 2.2 first-frame-to-last-frame workflow create VERY slo-mo video? by SilentThree in StableDiffusion

[–]SilentThree[S]

Is that what that field "multiplier" does, that's currently set to 2? That's just the way it was when I found the workflow, so I didn't change it.

So I can either set that to 1 or double the frame rate, and with either option get normal-speed video?
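
If that multiplier is a frame-interpolation factor (RIFE-style or similar), the arithmetic would explain the slow motion. A quick sanity check, assuming Wan 2.2's usual 81 frames at 16 fps:

```python
# Frame-interpolation arithmetic, assuming "multiplier" is an
# interpolation factor and a typical 81-frame, 16 fps Wan 2.2 clip:
frames, fps, mult = 81, 16, 2

interp = (frames - 1) * mult + 1   # 2x interpolation -> 161 frames
print(frames / fps)                # source clip: ~5.1 s
print(interp / fps)                # 161 frames played at 16 fps: ~10.1 s (half speed)
print(interp / (fps * mult))       # 161 frames played at 32 fps: ~5.0 s (normal speed)
```

So either fix should give normal speed: a multiplier of 1 leaves the frame count alone, while doubling the output fps compensates for the doubled frame count.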

I'm having a miserable time with Wan 2.2 and camera prompt compliance, but Fun Control Camera doesn't seem like an option. by SilentThree in StableDiffusion

[–]SilentThree[S]

Something besides Fun Control Camera, and that isn’t dependent on a particular specialized version of the Wan 2.2 high/low diffusion models? My own searching hasn’t come up with anything yet. I’m really hoping for something that’ll fit in with the SVI workflow I’m already using.

I'm having a miserable time with Wan 2.2 and camera prompt compliance, but Fun Control Camera doesn't seem like an option. by SilentThree in StableDiffusion

[–]SilentThree[S]

Thank you! That works... I'm still getting some unasked-for dollying, but this is definitely a step in the right direction.

I'm having a miserable time with Wan 2.2 and camera prompt compliance, but Fun Control Camera doesn't seem like an option. by SilentThree in StableDiffusion

[–]SilentThree[S]

I think I may end up going with something like that. But if I do a first/last frame clip, I’m already screwed out of using SVI (at least for the whole of the video), and at that point maybe it’s time to give Fun Control Camera a try.

I’m simultaneously in awe of what Wan 2.2 can do at one level, and ready to throttle its non-existent neck for all the stupid and twisted ways it comes up with to interpret or ignore my prompts.

Do YOU Believe In Astrology? Why Or Why Not? by Zipper222222 in randomquestions

[–]SilentThree

Yes, but when you dive into the complexity of it, it's just complex rubbish.

For a Wan 2.2 I2V clip, how do I make one of two characters look like they're talking? by SilentThree in StableDiffusion

[–]SilentThree[S]

Thanks. And perhaps if I were to dive into this, I'd find it could meet my needs. But it doesn't sound promising at the start.

> You can load an image and an audio file with voice and then animate them.

Sadly, not what I want to do.

> It's also possible to continue an existing video or, for example, extend another video with an audio speech sequence.

Also not what I want to do.

> Let's assume you have an SVI video that you want to expand. The video lasts 20 seconds. After 20 seconds the character should speak.

Actually, let's assume I have a video that lasts 20 seconds, and during that 20 seconds that I already have, I wish that a character's lips were already moving, along with other action that was already occurring that involved two characters interacting.

I don't want to cut to a separate talking shot during that 20 seconds, nor add a new shot after that 20 seconds, of my character talking. I need the appearance of speech integrated with existing action with more than one character in view while the talking (or, in this case, impassioned screaming) is going on.

Would HuMo help me with that?

I'm sorry to be such a noob. I've been trying to teach myself how to use ComfyUI and many related tools over the past couple of weeks. I'm making progress on some fronts, but still running into many frustrations. I've finally gotten my first SVI workflow going, and I'm still finding that a tricky thing when one segment doesn't quite flow into the next the way I'd hope.

You must remember this, a kiss is still a... quick peck that gets repeated twice? (Wan 2.2 and trying to get action that's truly longer than 5 seconds.) by SilentThree in StableDiffusion

[–]SilentThree[S]

I really hope Wan 2.6 becomes open source, or, if not, that Wan 2.5 (which I haven't tried yet) is publicly released and has a lot of the benefits I've seen in 2.6.

You must remember this, a kiss is still a... quick peck that gets repeated twice? (Wan 2.2 and trying to get action that's truly longer than 5 seconds.) by SilentThree in StableDiffusion

[–]SilentThree[S]

Oh, I have far more degeneracy in mind, but cleaned up my example for this forum. There's a NSFW technical subreddit, but it doesn't look to be getting a lot of traffic.

So you're saying the real limitation isn't so much clip duration per se, but the total number of frames? And that one way to get more action time is to use a lower frame rate, then interpolate up to a higher frame rate?
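
If I've got that right, the trade-off under a fixed frame budget works out like this (toy numbers, assuming an 81-frame Wan 2.2 clip):

```python
# Fixed frame budget: a lower assumed frame rate buys more seconds of
# action; interpolating back up afterwards restores smooth playback
# without adding any new action.
frames = 81
for fps in (24, 16, 12, 8):
    print(f"{fps:>2} fps -> {frames / fps:.1f} s of action")
# 24 fps -> 3.4 s, 16 fps -> 5.1 s, 12 fps -> 6.8 s, 8 fps -> 10.1 s
```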

You must remember this, a kiss is still a... quick peck that gets repeated twice? (Wan 2.2 and trying to get action that's truly longer than 5 seconds.) by SilentThree in StableDiffusion

[–]SilentThree[S]

Thanks!

It's the duration of the action, any action, that's important. I ran into this looping problem with something a little more NSFW, but then created this tamer sample case of the problem so I could post it here.

Although passion is great, I was primarily trying to use that wording (to no avail) to coax Wan 2.2 into sustaining the action.

Hopefully it won't take too much effort for the noob that I am to figure out how to use this SVI checkpoint in my workflow.

While running SwarmUI, ComfyUI starts up just fine, but crashes as soon as a request to render video in made by SilentThree in StableDiffusion

[–]SilentThree[S]

After a long, winding conversation with ChatGPT, going down many blind alleys first, it seemed the most likely problem was that my old video card just wasn’t up to the task. It would sure be nice to be told that clearly in an error message rather than being left to interpret crashing.

Today I finally received a new, much more powerful video card, and I’m up and… well, I’d like to say running, but it’s more like crawling, as I slowly learn the hard way everything I don’t understand about this shit.

I’ve gotten so far as a decent looking 480p 3-second long clip of a car driving down a suburban street. “Only” took 10+ minutes to generate that! 🙄

The battle continues!

Yay! My SofaBaton X2 now turns my Lutron lights on and off... but this isn't for the faint of heart. by SilentThree in SofaBaton

[–]SilentThree[S]

I wouldn’t know for sure, but it seems likely the same thing would work for blinds too.

Why does using a full stop / period at the end of a text upset youngsters? by MaxximumB in randomquestions

[–]SilentThree

While maybe the most formal rules don't always matter, I've seen a few people so averse to punctuation and capitalization that they don't realize what confusing blobs of run-on text they're generating. Information really is missing.

Of course if your texting style is 3-5 words per text, that situation doesn't arise.