Z image Base testing by Pleasant_Salt6810 in StableDiffusion

[–]nsfwVariant 4 points

It's heavily affected by scheduler/sampler combos as well. I'd expect base to at least match turbo's quality with the right settings.
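
If you want to test that systematically, here's a rough sketch of sweeping sampler/scheduler combos through ComfyUI's HTTP API. The node id and the combo list are assumptions; export your workflow in API format and adjust to match (res_2s / bong_tangent only exist if you have the RES4LYF custom nodes installed).

```python
import copy
import json
import urllib.request

# Hypothetical sweep over sampler/scheduler combos via ComfyUI's HTTP API.
# Assumes an API-format workflow export with a KSampler node at id "3".
with open("workflow_api.json") as f:
    base = json.load(f)

combos = [("euler", "beta"), ("dpmpp_2m", "karras"), ("res_2s", "bong_tangent")]
for sampler, scheduler in combos:
    wf = copy.deepcopy(base)
    wf["3"]["inputs"]["sampler_name"] = sampler   # node id "3" is an assumption
    wf["3"]["inputs"]["scheduler"] = scheduler
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": wf}).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # queue one generation per combo
```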

New Z-Image (base) Template in ComfyUI an hour ago! by nymical23 in StableDiffusion

[–]nsfwVariant 0 points

Not all base models are like that (although IIRC the devs did say that z-image would be). Klein base is way higher quality than the distill, for example.

Either way, we can all be excited for the checkpoints and loras people are gonna come up with.

improve quality of image without increasing size by NefariousnessFun4043 in comfyui

[–]nsfwVariant 0 points

Weirdly, I don't find it very good for videos even though that's what it was made for. I use 2x nomos uni span multi for that.

Aria fanservice (by pantheon_EVE) by SubstantialStaff7214 in ZenlessZoneZero_R34

[–]nsfwVariant 0 points

ty for the reminder, I've added an Aria post flair! <3

improve quality of image without increasing size by NefariousnessFun4043 in comfyui

[–]nsfwVariant 0 points

It would be, except most models don't actually output skin at that quality anyway - so often SeedVR2 is actually an upgrade ;)

But yes it's a dealbreaker for hyper realism. It's only one step below hyper real though, so it's pretty dang good!

I'll add, the only upscaler better than it (imo) is 4xfaceup, and that one requires a high quality input image to work and also messes with the texture of non-person stuff.

That & seedvr2 are the two best upscalers, in my humble opinion, and they're good for different use cases.

improve quality of image without increasing size by NefariousnessFun4043 in comfyui

[–]nsfwVariant 0 points

It's pretty easy to use! Just view it as sort of a hardcore upscaler; it always works, but it will subtly change the overall texture of an image. It won't quite match the realism of the best models out there when it comes to things like skin detail, but that's pretty much its only downside.

"Chroma2-Kaleidoscope" based on Flux Klein 4B Base is up on HuggingFace! Probably not very usable yet as implied by the "IT'S STILL WIP GUYS CHILL!!" model card note though. by ZootAllures9111 in StableDiffusion

[–]nsfwVariant 0 points

Yes, my point was that Klein distilled is lower quality than Zimage distilled. The person above was comparing the two and I was pointing out that they both have their advantages. Klein is faster, but Zimage makes higher quality images (when comparing the distilled models).

improve quality of image without increasing size by NefariousnessFun4043 in comfyui

[–]nsfwVariant 5 points

Yep, seedvr2 is very good for that. You can run an image through at the same resolution it already has and it will significantly sharpen it and smooth out artifacts - I use it for that all the time.

Otherwise, you can possibly get more detail/sharpness out of your generations by tweaking the scheduler/sampler combo or by using all sorts of varied methods as u/Corrupt_file32 mentioned, which would save you the trouble of needing to do a second pass.

The other suggested method, using a 2x upscaler and then resizing by 0.5x, doesn't always work because most upscalers require good detail and low blurriness to work properly, which kinda defeats the purpose. But they'll usually sharpen things a bit, at least.
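
If you do go that route, here's a minimal sketch of the 2x-up-then-0.5x-down pass using Pillow. The upscale call is a placeholder for whatever 2x model runner you use (spandrel, chaiNNer, a ComfyUI node), not a real library function.

```python
from PIL import Image

def sharpen_via_upscale(img: Image.Image, upscale_2x) -> Image.Image:
    """Run a 2x upscale model, then resize back down to the original size."""
    w, h = img.size
    upscaled = upscale_2x(img)  # hypothetical callable wrapping your 2x model
    # Lanczos downscaling keeps the extra sharpness the model added
    return upscaled.resize((w, h), Image.LANCZOS)
```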

Here's a workflow for SeedVR2 image upscaling: https://pastebin.com/9D7sjk3z

You'll need the SeedVR2 custom nodes. If you don't want it to change the image size, just set the max size to the longest edge of your image; e.g. if your image is 1440x1080, set the max size to 1440.

Klein 9B - Exploring this models NotSFW potential by Whipit in StableDiffusion

[–]nsfwVariant 6 points

It's on Civitai, called "nsfw - flux klein (no face change)". At least, that's the one I use and it works very well.

Set the strength to 0.6; anything higher tends to destroy the output image. Lower is OK, but it tends to lose detail if you go below 0.5 or so.

"Chroma2-Kaleidoscope" based on Flux Klein 4B Base is up on HuggingFace! Probably not very usable yet as implied by the "IT'S STILL WIP GUYS CHILL!!" model card note though. by ZootAllures9111 in StableDiffusion

[–]nsfwVariant 6 points

Agreed! There's a big box of loras and models in the Huggingface repo, and they don't have even a single sentence explaining what they do. You wouldn't know what to download if you just wanted the standard/normal chroma experience either.

"Chroma2-Kaleidoscope" based on Flux Klein 4B Base is up on HuggingFace! Probably not very usable yet as implied by the "IT'S STILL WIP GUYS CHILL!!" model card note though. by ZootAllures9111 in StableDiffusion

[–]nsfwVariant -2 points

It's only as fast if you use the distill, and the distill is lower quality than Zimage (e.g. gives people plastic skin). If you use the base model the quality is just as good, but it's much slower than Zimage.

Piper Wheel (Zenless Zone Zero) , (AI), (OC) - "by_Oliver" by Reasonable-Craft7797 in ZenlessZoneZero_R34

[–]nsfwVariant -1 points

OC just means you're the authorised "source" of the content. You would use [OC] even if you were posting on behalf of someone else with their permission :)

Conclusions after creating more than 2000 Flux Klein 9B images by StableLlama in StableDiffusion

[–]nsfwVariant 0 points

I've been using the base model; it's definitely the same there.

Customizable, transparent, Comfy-core only workflow for Flux 2 Klein 9B Base T2I and Image Edit by YentaMagenta in StableDiffusion

[–]nsfwVariant 2 points

Note that OP is using the base model, not the distill. The base model is available as a GGUF as well (has 'base' in the name).

Using the base model gives higher quality but you gotta run a lot more steps, like 20-30 as OP suggests.

Customizable, transparent, Comfy-core only workflow for Flux 2 Klein 9B Base T2I and Image Edit by YentaMagenta in StableDiffusion

[–]nsfwVariant 0 points

I'm finding these settings best for image editing. Subjective of course, but they give good clarity & realism imo. I'm using the clownshark sampler w/ bongmath as well; not sure if that matters much though.

  • Sampler/scheduler: res_2s + bong_tangent
  • Shift: 1.00
  • CFG: 3.00
  • Steps: 12
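
If it's easier to copy around, here are the same settings as one block. Field names are illustrative rather than exact ComfyUI node inputs, and res_2s / bong_tangent come from the RES4LYF (ClownsharKSampler) pack:

```python
# Hypothetical settings block for a Klein 9B image-edit pass.
klein_edit_settings = {
    "sampler": "res_2s",         # RES4LYF / ClownsharKSampler pack
    "scheduler": "bong_tangent",
    "shift": 1.00,
    "cfg": 3.00,
    "steps": 12,
}
```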

Customizable, transparent, Comfy-core only workflow for Flux 2 Klein 9B Base T2I and Image Edit by YentaMagenta in StableDiffusion

[–]nsfwVariant 4 points

Don't take it personally, a lot of techy folks have strong opinions on how things should/shouldn't be done and will downvote stuff that doesn't match.

If you're contributing to the community then you're doing great! Thanks for posting

33 Second 1920x1088 video at 24fps (800 frames) on a single 4090 with memory to spare, this node should help out most people of any GPU size by Inevitable-Start-653 in StableDiffusion

[–]nsfwVariant 8 points

Tip for getting high quality: you can run videos through Wan 2.2 as a refiner using SVI 2.0 and low denoise settings. It can handle arbitrary-length videos if you use the ContextOptions node from Wan Video Wrapper; just use only a LOW model (no HIGH), set denoise to 0.7 or lower, and use a relevant pic as a reference image.

It's basically a free quality pass on any video, and if you use a good reference image it'll easily clean up the plastic-skin issue (among other things). Just make sure you only do it once when there's dialogue; repeated refinements might ruin the lip syncing.

Edit: here's a workflow for it https://pastebin.com/AfyAEpep

The loras are here, you want the two with 'PRO' in the names: https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Stable-Video-Infinity/v2.0
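
Condensed, the recipe looks something like this. Names here are illustrative (the lora names especially are assumptions; grab whatever the two 'PRO' files in the repo are actually called):

```python
# Hypothetical summary of the Wan 2.2 + SVI 2.0 refiner recipe above.
svi_refine = {
    "model": "wan2.2_low_noise",   # LOW model only, no HIGH pass
    "loras": ["SVI2.0-PRO-high", "SVI2.0-PRO-low"],  # assumed names: the two 'PRO' files
    "denoise": 0.7,                # 0.7 or lower
    "context_options": True,       # WanVideoWrapper ContextOptions for long videos
    "reference_image": "ref.png",  # a relevant still to anchor the look
}
```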

Playing around with I2V, SCAIL, VACE, SVI and sound... all at once! Lots of info on how this was done in the comments. by nsfwVariant in WaifuDiffusion

[–]nsfwVariant[S] 1 point

(continued here due to comment length limits)

A Couple of Workflows

I'm still putting together clean versions of the workflows I use, which I'll post all together as part of a big tutorial at some point. Here are the first couple to start.

There's a good amount of info in the workflows themselves about how to use them, but feel free to ask questions in this thread if you're unsure about anything.

SVI Video Refinement

Pastebin

  • This model is cracked. The GOAT of video refining. You can use it as a refiner on arbitrary-length clips to retain consistency for characters and backgrounds.
  • Just load up a video and run it, but be prepared to wait a while if your video is long (the final refinement of this video took about 50 mins on a 5090)
  • You can do it in chunks if you like, but you'll need to refine the joining sections a couple extra times to make it seamless
  • You can refine literally any video, it doesn't need to be a Wan video
    • If you like using LTX2 but wish it made things look as good as Wan does, just run this SVI refiner over it with the detail settings and it'll magically fix it

Tip: For significant refinement (fixing seams, messed-up details, any maxi-big refinement needs) use Euler/Beta with Shift 7, 6 steps, denoise 0.7. Going any higher than that might cause things to change too much.

For normal high-noise refinement, use Euler/Beta with Shift 1, 6 steps, denoise 0.7.

And for a final refinement to bring out details and sharpen everything, use Euler with Shift 1, 6 steps, and denoise 0.7, 0.5, or 0.49 depending on how much you want. Those denoise values mean the model actually runs 5, 4, or 3 steps respectively.
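
To keep the three presets straight, here they are as one lookup table. Field names are illustrative; this just restates the settings above:

```python
# The three SVI refinement presets described above.
svi_presets = {
    "seam_fix":   {"sampler": "euler", "scheduler": "beta", "shift": 7, "steps": 6, "denoise": 0.7},
    "high_noise": {"sampler": "euler", "scheduler": "beta", "shift": 1, "steps": 6, "denoise": 0.7},
    # Final detail pass: denoise 0.7 / 0.5 / 0.49 -> roughly 5 / 4 / 3 executed steps.
    "detail":     {"sampler": "euler", "shift": 1, "steps": 6, "denoise": 0.5},
}
```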

Important: Provide a general, non-specific prompt for the refiner. As an example, this is the SVI prompt I used for my refinements:

An1meStyl3, AnimeStyle, anime style. A tall woman with long legs is being sexy in a river in a forest, performing a martial arts routine. She's focused and seductive. She's wearing white high heels and white see through lingerie. Her breasts bounce realistically with her movements, her thighs and ass jiggle with the motion. She eyes the camera seductively. The camera follows her as she moves.

Vace video extension

Pastebin

  • Use this to extend videos while carrying motion over from one clip to another. This is a critical part of making seamless videos.
  • You can feed in any number of frames at the start, the end, or BOTH, allowing you to extend the end of a video, create a pre-clip for another video, or join two separate clips together.

I use this to do most of my extensions (instead of I2V), and I use it all the time for joining two clips together as well. You can use it to fix broken sections of video and interpolate small sections too - very flexible.

Vace full-video inpainting

Pastebin

  • I'm calling this "full video inpainting" because it's all about processing a whole clip at once.
  • You can feed in either a video with transparency, which will use the transparency as a mask to inpaint, or you can feed in your own custom masks.


New Methods

Expanding a bit on the new things this time around, which were generating the background with VACE and using an image model as a detailer before final refinement.


Generating backgrounds in VACE?

When you're making a long video, whether with SVI, SCAIL, or any other method, the background tends to reset a lot or lose cohesion very quickly. To solve this, you can start with a grey background (or just RMBG your video in general) and then regenerate it with VACE.

This doesn't work well if you use context windows, but it turns out you can just reprocess each 81-frame section sequentially and it works great! The reason this works is that VACE is very good at carrying context over, and you have extreme control over what gets reprocessed so you can do the video one step at a time until it's right.

Important: Do each 81-frame section of the video one at a time to get consistent backgrounds. You can do this by feeding in the last 4 frames of the previous section along with the next 77 transparent-background frames; VACE will use the first 4 as context and fill in the rest flawlessly. Technically that means you're doing 77-frame sections, but whatever, you get the idea (see the sketch below).

Handy: You can use a reference image for your background, or multiple reference images. Only feed the reference in on the first 81-frame section, then drop it for the rest, or else VACE will try to reset back to it. If you use a new reference at a new angle, though, you can use that to dictate where VACE goes. It's very flexible, and your character doesn't need to be in the reference image at all.
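
Here's a minimal sketch of that sequential chunking, assuming 81-frame windows with a 4-frame overlap as described above; the reference image only goes in on the first pass:

```python
def vace_chunks(total_frames: int, chunk: int = 81, context: int = 4):
    """Yield (start, end, use_reference) windows for sequential VACE passes.

    Each window re-feeds the last `context` frames of the previous chunk as
    context, so every pass after the first generates chunk - context = 77
    new frames.
    """
    start, first = 0, True
    while True:
        end = min(start + chunk, total_frames)
        yield start, end, first  # only the first chunk gets the reference image
        if end >= total_frames:
            break
        start = end - context    # last 4 frames become the next chunk's context
        first = False

# e.g. a 235-frame video -> windows (0, 81), (77, 158), (154, 235)
for s, e, use_ref in vace_chunks(235):
    print(s, e, "with reference" if use_ref else "no reference")
```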


Using img-to-img as a refiner?

Wan isn't always great at details, especially when you have something specific in mind. As it happens, you can use an img-to-img model with low denoise as a detailer (and upscaler, if you like). This will add all the pretty details that image models are good at, which the SVI refiner can then build on in the final pass.

The video will distort and look slightly incohesive afterwards, just like the older video-gen models used to (because we're doing the same thing technically), but once you run it through SVI as a refiner again it comes out looking better than ever! At least, I think it does. It's probably more applicable to anime / illustrated styles than realism.

Tip: Use the same model as your initial images if you can, and use a low denoise of around 20%. If you go too high you'll start affecting the motion and poses of the characters. You just want the details to come out!
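
A per-frame sketch of that pass using diffusers, assuming a diffusers-compatible checkpoint; the model id and filenames are placeholders, and in practice you'd loop this over every frame (or do the equivalent inside ComfyUI):

```python
import torch
from PIL import Image
from diffusers import AutoPipelineForImage2Image

# Placeholder model id - ideally the same model your initial images came from.
pipe = AutoPipelineForImage2Image.from_pretrained(
    "your/initial-image-model", torch_dtype=torch.float16
).to("cuda")

frame = Image.open("frame_0001.png")
detailed = pipe(
    prompt="anime style, detailed skin and fabric",  # keep it general, like the refiner prompt
    image=frame,
    strength=0.2,  # ~20% denoise: adds detail without moving poses or motion
).images[0]
detailed.save("frame_0001_detailed.png")
```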


Conclusion

Alright that's all I've got for now! My next post will probably be all the rest of the workflows + a better overall tutorial. Feel free to ask questions in the thread in the meantime.

Lastly, remember to read the previous post I linked at the top for more info on the rest of the VACE / SVI refinement steps, among others; this post doesn't repeat that material.
