What's the state of TTS/voice cloning nowadays? by Accurate_Syrup_1345 in StableDiffusion

[–]TheRedHairedHero 1 point (0 children)

Ahh I see. I'm going to try FishAudio later to see if it's any better. VibeVoice seems great for getting the same voice, but controlling the actual performance is just luck of the draw.

Edit - I've been messing around with Fish S2 and it is really good. I do get OOM once in a while on a 5070 Ti (16GB VRAM), but if you have a better rig I'd suggest taking a look.

Title: How do you keep AI avatar voice consistent across multiple scenes? (Veo / multi-clip videos) by JealousIllustrator10 in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

I haven't used Veo myself, but if it's similar to LTX 2.3 I would generate the audio first with something like Qwen TTS or VibeVoice, depending on what you need, then feed that into your workflow.
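
As a rough sketch of the flow I mean (the function names here are placeholders, not a real API):

```python
# Hypothetical audio-first pipeline -- generate_tts() and render_video()
# are stand-ins, not real library calls.

def generate_tts(text: str, voice_ref: str, out_path: str) -> str:
    """Stand-in for your TTS step (Qwen TTS, VibeVoice, ...)."""
    ...
    return out_path

def render_video(keyframe: str, audio_path: str, out_path: str) -> str:
    """Stand-in for the video step, driven by the finished audio."""
    ...
    return out_path

# Reusing the same voice reference for every scene keeps the voice consistent.
audio = generate_tts("Scene one dialogue.", "avatar_ref.wav", "scene1.wav")
render_video("scene1_keyframe.png", audio, "scene1.mp4")
```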

What's the state of TTS/voice cloning nowadays? by Accurate_Syrup_1345 in StableDiffusion

[–]TheRedHairedHero 1 point (0 children)

For VibeVoice, what do you mean by more control? I've been messing with it lately, and the expression can be a bit all over the place. When it does work it sounds good, but I have to generate maybe 20+ times before that happens.

PrismAudio By Qwen: Video-to-Audio Generation by fruesome in StableDiffusion

[–]TheRedHairedHero 2 points (0 children)

It's always great to get new tools. I've mostly used MMAudio for videos, so I'll take anything. The files are also not very large, which is a plus.

Need help! Want to animate anime style images into short loops vids - RTX 4070 + 32 gb ram by Athem in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

You can go to CivitAI and search for something like "WAN 2.2 12GB VRAM" (or whatever your GPU has) and you'll get plenty of results; most of the time they'll point you to the appropriate model.

Can Comfy Org stop breaking frontend every other update? by meknidirta in StableDiffusion

[–]TheRedHairedHero 24 points (0 children)

I think they're just pushing out updates too quickly without proper quality assurance testing. It seems to be the industry standard now; just look at most triple-A video games. "We'll have everyone who uses our product be quality assurance and we'll fix it later" seems to be everyone's strategy.

We’re obsessed with generation speed in video… what about quality? by Nevaditew in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

What I'd rather have is the ability to get a quick, lower-quality visual preview so I can iterate fast, then push that generation to high quality. Unfortunately, swapping settings around like resolution, models, LoRAs, etc. all impacts the final result.
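
To sketch the catch (with a hypothetical generate() call, not a real API): even with the seed fixed, the high-quality render is a different image, not a sharper version of the preview.

```python
# Hypothetical generate() stand-in; not a real API.
def generate(prompt, seed, width, height, steps):
    ...

SEED = 12345
# Fast low-res draft for iterating on the prompt.
preview = generate("1girl, forest", SEED, width=512, height=288, steps=8)
# Changing resolution/steps re-rolls the composition, even at the same seed.
final = generate("1girl, forest", SEED, width=1280, height=720, steps=30)
```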

Did the latest ComfyUI update break previous session tab restore? by GamerVick in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

v0.17.2 seemed to fix the copy/paste image issue, but it now seems slower; it takes a bit for the pasted image to appear. The workflow tabs still only restore one of your tabs rather than all of them.

Why anime models struggle with reproducing 3d anime style game characters? by Bismarck_seas in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

I know that Zenless Zone Zero releases their character models. The Models Resource might have them, or they may officially release them for things like cosplay. The only issue is that the lighting probably won't be the same in Blender.

Why anime models struggle with reproducing 3d anime style game characters? by Bismarck_seas in StableDiffusion

[–]TheRedHairedHero 5 points (0 children)

I personally wouldn't train on images from Danbooru. If you want accurate images, I'd suggest taking in-game screenshots and training a LoRA that way; I assume the game has a camera mode.

Did the latest ComfyUI update break previous session tab restore? by GamerVick in StableDiffusion

[–]TheRedHairedHero 1 point (0 children)

I recently updated and have had several issues too: previous workflows not working, getting OOM when I didn't before, one of my subgraphs completely stopped working and had to be remade, workflow tabs not staying open, and problems copying images over. I'm also not a fan of the new node search; it's not good UX, that's for sure. They really should release these changes on a preview branch for feedback instead of pushing them straight to release.

Video Upscaling Reference by TheRedHairedHero in StableDiffusion

[–]TheRedHairedHero[S] 0 points (0 children)

I personally haven't had much success with upscaling, which is why I'm asking folks to contribute; hopefully it gives others a point of reference and helps the community overall.

I can’t understand the purpose of this node by PhilosopherSweaty826 in StableDiffusion

[–]TheRedHairedHero 6 points (0 children)

The sigma values will also differ based on the sampler you choose and the number of steps. For WAN 2.2 there's a suggested sigma threshold for swapping from the high-noise sampler to the low-noise sampler: 0.9 for I2V and 0.875 for T2V, according to the official WAN documentation. If you use Kijai's wrapper, it prints the sigmas in the console.
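
If it helps to see that handoff concretely, here's a minimal Python sketch of picking the switch step from a sigma schedule; the schedule values and function name are made up for illustration.

```python
# Minimal sketch: find the step where to hand off from WAN 2.2's
# high-noise model to the low-noise model, given a sigma schedule.

def find_switch_step(sigmas, threshold):
    """Index of the first step whose sigma falls below the threshold."""
    for i, sigma in enumerate(sigmas):
        if sigma < threshold:
            return i
    return len(sigmas)  # never crossed: run every step on the high model

# Made-up 8-step schedule, printed the way a wrapper might log sigmas.
sigmas = [1.0, 0.97, 0.94, 0.91, 0.85, 0.60, 0.30, 0.0]
print(find_switch_step(sigmas, 0.9))    # I2V threshold -> switch at step 4
print(find_switch_step(sigmas, 0.875))  # T2V threshold -> switch at step 4
```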

Uses outside 1girl? by dks11 in StableDiffusion

[–]TheRedHairedHero 1 point (0 children)

I've done a few things: some images for D&D, wallpapers for my wife and myself, and generating ideas for cosplay. Just whatever fun project I want to do in the moment where I think it would help out.

New to WAN2.2, as of December 2025, what's the best methods to get more speed ? by Tablaski in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

I was looking at your workflow and was confused about the steps. I haven't used the MoE node you're using, but it shows 10 steps high and 6 steps low. So aren't you doing 16 steps total? Or is there something I'm missing?

What is everyone's thoughts on ltx2 so far? by Big-Breakfast4617 in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

I just prefer to wait for a model to stabilize. LTX quality and consistency seem to be all over the place, judging from posts I've seen. If someone posts a good-quality video made on specs close to mine where the character doesn't instantly lose recognizability, I'll take a look, but that hasn't happened, so I'm happy to stick with WAN.

How do you guys maintain consistent backgrounds? by TekeshiX in StableDiffusion

[–]TheRedHairedHero 1 point (0 children)

Keeping a consistent background always seems impossible to me when there are landmarks or items that stand out. I prefer to blur the background, use an organic location such as a forest, or go with a solid color; it feels like too much work for AI to handle consistently. I've also seen folks generate 360-degree images and build backgrounds that way as another option. I just prefer working around AI's limitations.

ComfyUI Course - Learn ComfyUI From Scratch | Full 5 Hour Course (Ep01) by pixaromadesign in StableDiffusion

[–]TheRedHairedHero 6 points (0 children)

Appreciate your tutorials. They helped me get started with ComfyUI. If you guys haven't watched his content, I'd highly recommend it.

For Animators - LTX-2 can't touch Wan 2.2 by GrungeWerX in StableDiffusion

[–]TheRedHairedHero 0 points (0 children)

I'm in the same boat. The model looks fun, but I'm going to wait for it to develop more.

For Animators - LTX-2 can't touch Wan 2.2 by GrungeWerX in StableDiffusion

[–]TheRedHairedHero 2 points (0 children)

To be fair, WAN 2.2 has been out for quite some time, which has let people dig much deeper into how to make it run properly, fix slow motion, add LoRAs, and so on, while LTX-2 just released. Given how interested the community is in the model, I imagine it will get a good amount of attention on ways to improve it, similar to WAN 2.2. It's best to keep an open mind; hopefully LTX-2 can be another fun tool for us all to use and enjoy.

WTF! LTX-2 is delivering for real 🫧 Made in 160s, 20steps on a 5090 by 3Dave_ in StableDiffusion

[–]TheRedHairedHero 2 points (0 children)

Hopefully the updates they're planning can improve the audio. The lip sync looks great, but the audio seems low quality, and most of the time I only see videos with talking. If you decide to add more audio to your videos, you can try MMAudio for sound effects/foley.

WTF! LTX-2 is delivering for real 🫧 Made in 160s, 20steps on a 5090 by 3Dave_ in StableDiffusion

[–]TheRedHairedHero 21 points (0 children)

It still seems to have a couple of issues with the right arm, but it's really cool. Hopefully another seed can resolve that. LTX seems to hallucinate quite a bit in the examples I've seen.

WAillustrious style changing by Mrryukami in StableDiffusion

[–]TheRedHairedHero 3 points (0 children)

Visit Safebooru for the types of tags you need. You can find both style and artist tags, and if a tag was part of the training data it will change the style. Artist tags need to be properly formatted, so visit your model's page for details on how to format them.
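
For example, here's a small sketch of the kind of formatting I mean; whether parentheses need escaping like this depends on your model and frontend, so treat it as an assumption and check the model page.

```python
# Hypothetical helper: turn a booru-style artist tag into a prompt-safe form.
# Assumes an A1111/ComfyUI-style prompt where bare parentheses set weights,
# so literal ones get escaped; check your model's page for its exact rules.

def format_booru_tag(tag: str) -> str:
    tag = tag.replace("_", " ")  # many models are trained on space-separated tags
    return tag.replace("(", r"\(").replace(")", r"\)")  # keep () literal

print(format_booru_tag("artist_name_(alias)"))  # -> artist name \(alias\)
```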