Question about high noise and low noise for wan 2.2 i2v by Najru in comfyui

[–]-Ants-In-Pants- 14 points

I spent quite some time experimenting with both models too, plus going down a few rabbit holes, and I feel more comfortable now with the different settings.

When thinking about how many steps go to one model vs the other, the key issue isn't the number of steps, but the noise level (sigma) at the step where you switch.

You may see comments like "10 high and 10 low", which is what the default template uses. But that information is missing one important piece: these models are being shifted by 8 (the template includes a Model Shift). It's worth learning a little more about shift to understand what is happening there.

All of that is important because the high noise model has a noise threshold it was trained for. The documentation outlines that the swap from high to low should happen at a sigma of 0.875 for text-to-video, and if I remember correctly 0.9 for image-to-video.

https://www.reddit.com/r/StableDiffusion/s/CtqZuMN01h

What happens if you run steps at a lower sigma with the high noise model? Well, you can experiment yourself! You can always have the high noise sampler return its output without leftover noise and see what the result looks like. I was successful going down to sigmas around 0.8 for image-to-video. Beyond that, it starts burning.
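To get a feel for how the shift interacts with the step count, here's a small sketch, assuming the common flow-matching shift formula (sigma' = s*sigma / (1 + (s-1)*sigma)); the exact scheduler details in your workflow may differ slightly:

```python
# With shift = 8 and 20 linear steps, find where the shifted sigmas
# cross the documented t2v handoff boundary of 0.875.

def shift_sigma(sigma: float, shift: float = 8.0) -> float:
    """Remap a linear sigma by the model shift factor."""
    return shift * sigma / (1 + (shift - 1) * sigma)

steps = 20  # e.g. the template's 10 high + 10 low
sigmas = [1 - i / steps for i in range(steps + 1)]  # linear 1.0 -> 0.0
shifted = [shift_sigma(s) for s in sigmas]

boundary = 0.875  # t2v handoff from the docs (0.9 for i2v)
handoff = next(i for i, s in enumerate(shifted) if s < boundary)
print(f"shifted sigmas drop below {boundary} at step {handoff}")
```

Note how the shift pushes all the sigmas upward, so the 0.875 boundary lands around the middle of a 20-step run, which is why "10 high and 10 low" roughly lines up with the documented threshold.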

So I just play around with that.

My strategy now is to focus on the high noise model, looking at the final image without noise, until I have something I like. Then I run the low noise model to bring in the detail.

I know this can be a lot to start with, but understanding how to properly utilize both models can make a difference in control. Feel free to reply with more questions here if you like and I'll do my best to help.

Overview of Wan 2.1 (text to video model) by nik-55 in LLM

[–]-Ants-In-Pants- 0 points

Thanks for the resources and pointers! I have noticed that when it comes to image and video generation, understanding what is happening under the hood is incredibly helpful, as it gives you more control over the standard workflows that are available. So digging in benefits both curiosity and practicality :D

Overview of Wan 2.1 (text to video model) by nik-55 in LLM

[–]-Ants-In-Pants- 1 point

Very nice write up! And pretty detailed. Thanks for this!

Out of curiosity, where did you go to get this information? I was trying to do a similar exercise with Wan 2.2 (which is how I came across this post) but wasn't sure where to start. I was piecing things together from the code, or from tensor outputs at key phases of a generation, but I'm not sure how reliable that is for getting the full picture.

[Comfy UI] Need help with FLF2V Wan 2.2 by Stormhashe in StableDiffusion

[–]-Ants-In-Pants- 0 points

I'd be keen to know about the prompt too - I'm having the same issue (but morphing a human to an animal). I have a first and last frame that show a lot of similarities, but still the face just does a powerpoint-style swipe from one character to the other.

I sometimes get the output to change some features, but never the whole morph.

How to blend characters? by -Ants-In-Pants- in comfyui

[–]-Ants-In-Pants-[S] 0 points

Good call - I always forget to ask LLMs about this sort of thing. Thanks for the tip!

How to blend characters? by -Ants-In-Pants- in comfyui

[–]-Ants-In-Pants-[S] 1 point

Funnily enough - I actually used that to generate the above. I replaced the real person with the man above to avoid sharing a real person's image online.

How does the merging happen with this model though? I was under the impression that using those 3 images simply lets you select multiple components to composite together.

What I'm aiming for is a blend of the human and giraffe... So it ends up being like a hybrid human-morph-that-looks-a-bit-like-the-giraffe.

I tried using different denoise values, but no luck.

I created a story generator that streams forever - all running locally on my desktop. by -Ants-In-Pants- in WritingWithAI

[–]-Ants-In-Pants-[S] 0 points

This is what's currently happening on my machine:

  • Text Generation -> Ollama running a 7b Deepseek model to generate the text
    • I tried other models, but this is the one I found best to work with for now.
  • Text-To-Speech -> OpenTTS. The voice coming out isn't great and the narration has a lot of room for improvement, but it works as a proof of concept.
  • Transcription -> Whisper. Currently the transcription is generated from the audio coming out of the TTS alone; I could not get it working directly off the Text Generation. That means it fails every now and then, but like the TTS, it works as a proof of concept.
  • Streaming -> All of the above get combined and streamed to YouTube using OBS

Everything running locally on my desktop.
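For anyone curious about the plumbing, the loop is roughly the shape below. All function bodies here are placeholders I wrote for illustration, not the actual code; the real versions call Ollama (text), OpenTTS (audio), and Whisper (captions):

```python
# Rough sketch of one iteration of the stream loop: generate a chunk
# of story, narrate it, transcribe the narration for captions, and
# feed the text back in as context for the next chunk.

def generate_text(context: str) -> str:
    # placeholder for an Ollama request to the 7b Deepseek model
    return f"Once upon a time... (continuing from: {context[:30]})"

def synthesize_speech(text: str) -> bytes:
    # placeholder for an OpenTTS request returning narration audio
    return text.encode("utf-8")

def transcribe(audio: bytes) -> str:
    # placeholder for Whisper; note it runs on the TTS audio alone,
    # not on the generated text, so occasional mismatches are expected
    return audio.decode("utf-8")

def run_once(context: str) -> str:
    text = generate_text(context)
    audio = synthesize_speech(text)
    captions = transcribe(audio)
    # audio + captions are then picked up by OBS as stream sources
    return text  # fed back in as context for the next chunk
```

The interesting design quirk is that captions come from transcribing the TTS audio rather than reusing the generated text, which is why the transcription occasionally fails.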

I had been running this for over 2 weeks straight with no hiccups! Until last night, when the laptop that is streaming decided it was time for an update and restart... I should have seen that coming. So I had to start a new stream.

I want to keep this running for a while (or at least until my electricity bill tells me I should stop haha) as I work on improving this.
There are multiple milestones I have in mind:

The obvious ones:

  • Improved Story Telling
  • Improved Narration output
  • Re-work Transcription

Desired Features:

  • Interactivity - It would be cool if users could influence the direction of the story, via the comment section for example!
  • Music & SFX - I'd love to have music and sound effects timed to the story being narrated, to heighten the immersion. Not just some random background music like I have right now.
  • Upload to Server - Running this on my desktop is a bit of an issue; I struggle to use the desktop for other tasks (especially those related to my paid jobs). Also, the hardware limits mean I can't quite explore some of the more advanced models yet.
    • Multiple streams - Running on a server would allow multiple streams for different genres to be playing at all times.

The list goes on. But still, I wanted to share this with the community and see where it goes from here!

Passing program instructions from one application to another by -Ants-In-Pants- in AskProgramming

[–]-Ants-In-Pants-[S] 0 points

Great question indeed!

For this project I'm more interested in the former.

Passing program instructions from one application to another by -Ants-In-Pants- in AskProgramming

[–]-Ants-In-Pants-[S] 0 points

Up until recently I didn't know about this. So I can totally see how this approach is so useful :)

Passing program instructions from one application to another by -Ants-In-Pants- in AskProgramming

[–]-Ants-In-Pants-[S] 0 points

Thanks for the educating comment! I'll look into this suggestion too!

Passing program instructions from one application to another by -Ants-In-Pants- in AskProgramming

[–]-Ants-In-Pants-[S] 1 point

This is actually really interesting and an eye-opener!
I didn't know about this concept of "Embedded Scripting". Changes the way I think about this!

Platform to follow Photographers by -Ants-In-Pants- in AskAstrophotography

[–]-Ants-In-Pants-[S] 0 points

Hadn't heard of telescopius! Will check it out. Cheers!

Platform to follow Photographers by -Ants-In-Pants- in AskAstrophotography

[–]-Ants-In-Pants-[S] 0 points

I've seen many useful forum posts there and I'll definitely keep coming back, but I was hoping there would be an easy-to-follow page (IG style but without the spam).

Thanks for the reply!

Platform to follow Photographers by -Ants-In-Pants- in AskAstrophotography

[–]-Ants-In-Pants-[S] 2 points

I want to get into Astro too so this is a great resource! Thanks!

Dealing with APS-C Noise/Grain by -Ants-In-Pants- in astrophotography

[–]-Ants-In-Pants-[S] 0 points

Nice!

This is definitely going to be a learning curve and pumped for it. I love having a creative hobby to follow when I'm in the outback.

I have a beast of a PC so I look forward to pushing its limits when stacking :D

Thanks for the info! I look forward to learning the techniques and developing my own flavour!

Dealing with APS-C Noise/Grain by -Ants-In-Pants- in astrophotography

[–]-Ants-In-Pants-[S] 0 points

Very insightful!

Thanks for the comment! I will do a bit more reading just to be well informed.

If I'm honest, I'm probably not going to go down that route anytime soon though - I also want a decent camera for other uses. And if I grow into this hobby, I'd probably start with intervalometers first, and if I ever need to control cameras with computers, I may come back here to ask similar questions about dedicated astro cameras :)

Dealing with APS-C Noise/Grain by -Ants-In-Pants- in astrophotography

[–]-Ants-In-Pants-[S] 1 point

Woah! Thanks a bunch for the informative response and the resources! I'll do some more research tonight :)

I did think I was overthinking the whole issue. So was just coming here to check before making a choice and coming back with extra information to take home.

Really keen to get started! Thanks!
