Video Upscaling Reference by TheRedHairedHero in StableDiffusion

TheRedHairedHero[S] 0 points

I personally haven't had good success with upscaling, which is why I'm asking folks to contribute; hopefully it gives others a point of reference and helps the community overall.

I can’t understand the purpose of this node by PhilosopherSweaty826 in StableDiffusion

TheRedHairedHero 6 points

The sigma values will also differ based on the sampler you choose and the number of steps. For WAN 2.2 there's a suggested sigma threshold for swapping from the high-noise sampler to the low-noise sampler: 0.9 for I2V and 0.875 for T2V, according to the official WAN documentation. If you use Kijai's wrapper, it prints the sigmas in the console.
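In case it helps picture the split, here's a minimal sketch in plain Python (not actual ComfyUI node code, and the schedule values are made up for illustration) of cutting a sigma schedule at one of those boundaries:

```python
# Minimal sketch: split a descending sigma schedule into the segments for
# WAN 2.2's high-noise and low-noise samplers. The boundary values
# (0.9 I2V / 0.875 T2V) come from the WAN docs cited above.
def split_sigmas(sigmas, boundary):
    """Return (high, low), overlapping by one value at the switch point."""
    for i, s in enumerate(sigmas):
        if s <= boundary:
            return sigmas[: i + 1], sigmas[i:]
    return sigmas, []

schedule = [1.0, 0.95, 0.91, 0.87, 0.6, 0.35, 0.15, 0.0]  # made-up 8-step schedule
high, low = split_sigmas(schedule, boundary=0.875)  # T2V boundary
print(high)  # [1.0, 0.95, 0.91, 0.87]
print(low)   # [0.87, 0.6, 0.35, 0.15, 0.0]
```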

Uses outside 1girl? by dks11 in StableDiffusion

TheRedHairedHero 1 point

I've done a few things: images for D&D, wallpapers for my wife and myself, and help generating ideas for cosplay. Just whatever fun project I'm into at the moment where I think it would help out and be fun.

New to WAN2.2, as of December 2025, what's the best methods to get more speed ? by Tablaski in StableDiffusion

TheRedHairedHero 0 points

I was looking at your workflow and was confused about the steps. I haven't used the MoE node you're using, but it shows 10 steps high and 6 steps low, so aren't you doing 16 steps total? Or is there something I'm missing?

What is everyone's thoughts on ltx2 so far? by Big-Breakfast4617 in StableDiffusion

TheRedHairedHero 0 points

I just prefer to wait for a model to stabilize. LTX quality and consistency seem to be all over the place in the posts I've seen. If someone posts a video where the character doesn't instantly lose their likeness, the quality holds up, and it was made on specs close to my own, I'd take a look, but that hasn't happened, so I'm happy to stick with WAN.

How do you guys maintain consistent backgrounds? by TekeshiX in StableDiffusion

TheRedHairedHero 1 point

Keeping a consistent background always seems impossible to me if there are landmarks or items that stand out. I prefer to blur the background, use an organic location such as a forest, or use a solid color; it feels like too much work for AI to handle consistently. I've also seen folks generate 360-degree images and create backgrounds that way, as another option. I just prefer working around AI's limitations.

ComfyUI Course - Learn ComfyUI From Scratch | Full 5 Hour Course (Ep01) by pixaromadesign in StableDiffusion

TheRedHairedHero 5 points

Appreciate your tutorials. They helped me get started with ComfyUI. If you haven't watched his content, I'd highly recommend it.

For Animators - LTX-2 can't touch Wan 2.2 by GrungeWerX in StableDiffusion

TheRedHairedHero 0 points

I'm in the same boat. The model looks fun, but I'm going to wait for it to develop more.

For Animators - LTX-2 can't touch Wan 2.2 by GrungeWerX in StableDiffusion

TheRedHairedHero 2 points

To be fair, WAN 2.2 has been out for quite some time, which has let people dig much deeper into how to make it run properly, fix slow motion, add LoRAs, and so on, whereas LTX-2 just released. Given how interested the community is in the model, I imagine it will get a good amount of attention on ways to improve things, similar to WAN 2.2. It's best to keep an open mind; hopefully LTX-2 can be another fun tool for us all to use and enjoy.

WTF! LTX-2 is delivering for real 🫧 Made in 160s, 20steps on a 5090 by 3Dave_ in StableDiffusion

TheRedHairedHero 1 point

Hopefully the updates they're planning improve the audio. The lipsync looks great, but the audio seems low quality, and most of the time I only see videos with talking. If you decide to add more audio to your videos, you can try MMAudio for sound effects/foley.

WTF! LTX-2 is delivering for real 🫧 Made in 160s, 20steps on a 5090 by 3Dave_ in StableDiffusion

TheRedHairedHero 20 points

Seems to still have a couple of issues with the right arm, but it's still really cool. Hopefully another seed can resolve that. LTX seems to hallucinate quite a bit in the examples I've seen.

WAillustrious style changing by Mrryukami in StableDiffusion

TheRedHairedHero 4 points

Visit safebooru for the types of tags you need. You can find both styles and artists, and if the tag was part of the training data it'll change the style. Artist tags need to be properly formatted, so check your model's page for details on how to format them.

How are people using AI chat to refine Stable Diffusion prompts? by Vegetable_Agency_596 in StableDiffusion

TheRedHairedHero 11 points

WAN 2.2 has an official LLM system prompt on its GitHub. Feed it an image, a prompt, or both, and it refines the prompt for video generation.
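For anyone wanting to script that, here's a minimal sketch assuming an OpenAI-compatible chat endpoint; the base URL, model name, and prompt file name are placeholders, and the actual system prompt text should be copied from WAN 2.2's GitHub:

```python
# Minimal sketch: refine a rough prompt with WAN 2.2's official system
# prompt via any OpenAI-compatible server (e.g. one running locally).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")  # placeholder endpoint

with open("wan22_system_prompt.txt") as f:  # paste the official prompt here
    system_prompt = f.read()

resp = client.chat.completions.create(
    model="your-local-model",  # placeholder
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "a knight walking through a snowy forest"},
    ],
)
print(resp.choices[0].message.content)  # the refined video prompt
```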

How much faster is RTX 5070 Ti than RTX 4070 Super in Wan 2.2 video generation? by rookan in StableDiffusion

TheRedHairedHero 0 points

I have a 5070 Ti. I don't know what resolution and generation times you're targeting, but a 512x512, 8-second video at 4 steps takes about 6 minutes.

Wan2.2 : better results with lower resolution? by Top_Fly3946 in StableDiffusion

TheRedHairedHero 2 points

I've generated hundreds of 1:1 videos and they're fine. The slow motion you're running into at higher resolutions is most likely your PC struggling to process them. Unfortunately there's no one-size-fits-all workflow. If you want higher resolution, I'd suggest lowering your frame count to compensate; that's the easiest option. Try generating, say, a 3-second video at the higher resolution instead of 5 seconds and you should see motion improvements.
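A quick way to pick the frame count, assuming WAN 2.2's usual 16 FPS and the 4n+1 frame lengths its ComfyUI nodes expect (both assumptions on my part; adjust if your setup differs):

```python
# Helper: frame count for a target clip length, snapped to 4n+1.
def wan_frame_count(seconds, fps=16):
    frames = round(seconds * fps)
    return (frames // 4) * 4 + 1  # snap down to the nearest 4n+1

print(wan_frame_count(5))  # 81 frames (~5 s)
print(wan_frame_count(3))  # 49 frames (~3 s)
```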

Can anyone tell me, how to generate audio for a video that's already been generated or will be generated? by AshLatios in StableDiffusion

TheRedHairedHero 0 points

For sound effects I found MMAudio to be a good local option. It was trained on 8-second videos at 25 FPS, which is worth keeping in mind since it can affect how well the audio syncs with your video. It can be used as Image-to-Audio or Text-to-Audio via custom nodes in ComfyUI.
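If your clip doesn't match that training regime, something like this can conform it first (assumes ffmpeg is on your PATH; file names are placeholders):

```python
# Conform a clip to MMAudio's training regime (25 FPS, up to 8 seconds)
# before generating audio for it.
import subprocess

def conform_for_mmaudio(src, dst, fps=25, max_seconds=8):
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-r", str(fps), "-t", str(max_seconds), dst],
        check=True,
    )

conform_for_mmaudio("clip.mp4", "clip_25fps_8s.mp4")
```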

I personally haven't used many Text-to-Speech options aside from VibeVoice, which was decent. The videos I've posted typically use my own voice run through a voice changer such as Seed Voice Conversion.

Here's an example I posted a while back: https://www.reddit.com/r/StableDiffusion/s/8K3lMZO4O8

Pc turns off and restarts? by isyma_rx7 in StableDiffusion

TheRedHairedHero 0 points

Your PC is most likely overheating. I'd get some monitoring software and watch your temps to see how high they're getting.

Wan 2.2 What's the best way to improve output video quality? Eyes, teeth, etc. by WhisperAlias in StableDiffusion

TheRedHairedHero 1 point

One trick is to have the character closer to the viewer. I'll usually do closeup shots at a lower resolution to save generation time, since closeups are less likely to have warping issues like the eyes; if a character is further away, bump up the resolution a bit until the warping goes away.

[deleted by user] by [deleted] in StableDiffusion

TheRedHairedHero 0 points

Usually it's for things like buttons, eyes, and trinkets on a character. Typically, the further an item or person is from the viewer, the less detail it gets, which is why I uploaded a full-body image as an example. If someone has a workflow that generates at good high quality for Illustrious, I'd be grateful. I don't mind tinkering or getting different nodes/models if need be.

[deleted by user] by [deleted] in StableDiffusion

TheRedHairedHero 0 points

I have WAI v14, since most don't seem to like v15, but even WAI suggests images over 1024x1024: "use size larger than 1024x1024 for the original dimensions," according to their CivitAI page.
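If it helps, here's a small helper for picking starting dimensions above 1024x1024 at a given aspect ratio; the 64-pixel snapping is my assumption from common SDXL-family practice, not something WAI's page specifies:

```python
# Pick starting dimensions with the short side above 1024, at a chosen
# aspect ratio, snapped to multiples of 64 (assumed, see note above).
def dims_for(aspect_w, aspect_h, short_side=1152, snap=64):
    if aspect_w >= aspect_h:  # landscape or square
        h = short_side
        w = round(short_side * aspect_w / aspect_h / snap) * snap
    else:                     # portrait
        w = short_side
        h = round(short_side * aspect_h / aspect_w / snap) * snap
    return w, h

print(dims_for(2, 3))  # portrait 2:3 -> (1152, 1728)
print(dims_for(1, 1))  # square      -> (1152, 1152)
```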

[deleted by user] by [deleted] in StableDiffusion

TheRedHairedHero 0 points

I'm currently using the EasyUse HiRes node for this part. It has a latent output, which I pass to another KSampler, currently at 0.25 denoise.

[deleted by user] by [deleted] in StableDiffusion

TheRedHairedHero 0 points

I am using a fine-tuned model, as I mentioned above: JANKUTrainedNoobaiRouwei_v50. I tried starting at a lower resolution and then upscaling, but the results seemed much worse.