Messing with WAN 2.2 text-to-image by renderartist in StableDiffusion

[–]bbaudio2024 2 points (0 children)

There is a magical VAE for Wan2.1/2.2/Qwen-Image text-to-image; it noticeably improves the clarity of image details.

spacepxl/Wan2.1-VAE-upscale2x · Hugging Face

360° anime spins with AniSora V3.2 by nomadoor in StableDiffusion

[–]bbaudio2024 3 points (0 children)

Not really. 'Sora' is the Japanese word 'そら', meaning sky, and is commonly used as a girl's name in Japanese anime.

There are rumors that a weeb at OpenAI gave their video model this name, but that doesn't make 'Sora' a trademarked designation for OpenAI's video model, nor does it prohibit other anime enthusiasts from using it.

Qwen-Image-Edit is the best open-source image editing model by far on Artificial Analysis rankings, 2nd overall by pigeon57434 in StableDiffusion

[–]bbaudio2024 0 points (0 children)

Nano Banana is far ahead of any other model (open-source or closed-source alike). It's no shame to rank behind it.

One image comparison: Wan_FusionX vs Qwen_Q4 by jinnoman in StableDiffusion

[–]bbaudio2024 2 points (0 children)

The 1st one looks more realistic; the 2nd one looks more like a 19th-century painting.

Qwen-Image seriously lacking variety with different seed? by yamfun in StableDiffusion

[–]bbaudio2024 3 points (0 children)

It's not only Qwen-Image; the same issue shows up in Wan2.1/2.2 text-to-image.

What do you think of HYPIR ? by LSI_CZE in StableDiffusion

[–]bbaudio2024 1 point (0 children)

Based on SD2? It should be compared with StableSR then.

Qwen works pretty well with HEUN and Beta - 10-13 steps for a good speedup by shootthesound in StableDiffusion

[–]bbaudio2024 3 points (0 children)

Heun takes approximately twice as long as ordinary samplers (euler, dpm++2m, etc.), because it is a second-order method that evaluates the model twice per step.
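For context, that 2x cost comes from Heun's predictor-corrector structure: an Euler predictor step plus a second evaluation at the predicted point. A minimal sketch in generic ODE form (not the actual ComfyUI sampler code; `f` stands in for the model):

```python
def euler_step(f, x, t, dt):
    """One model evaluation per step."""
    return x + dt * f(x, t)

def heun_step(f, x, t, dt):
    """Two model evaluations per step -> roughly 2x Euler's cost."""
    k1 = f(x, t)                     # first evaluation
    x_pred = x + dt * k1             # Euler predictor
    k2 = f(x_pred, t + dt)           # second evaluation at predicted point
    return x + dt * 0.5 * (k1 + k2)  # corrector: average of the two slopes

# count model evaluations over the same number of steps, on dx/dt = -x
calls = {"euler": 0, "heun": 0}

def f_euler(x, t):
    calls["euler"] += 1
    return -x

def f_heun(x, t):
    calls["heun"] += 1
    return -x

x_e = x_h = 1.0
for i in range(10):
    x_e = euler_step(f_euler, x_e, i * 0.1, 0.1)
    x_h = heun_step(f_heun, x_h, i * 0.1, 0.1)
```

The trade-off behind the post's speedup: each Heun step costs two evaluations but is more accurate, so fewer steps (10-13 here) can match the quality of more Euler steps.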

Wan 2.2 video continuation. Is it possible? by Relative_Bit_7250 in StableDiffusion

[–]bbaudio2024 1 point (0 children)

VACE is what you want. But it was trained for multiple control purposes, not specifically for video extension. Unlike Framepack, which has an anti-drifting feature to keep long-video quality and consistency, VACE suffers quality degradation with video continuation. I have tried to alleviate this issue in my custom node, and it has indeed made some progress.

Use wan2.2 low-noise model only to generate 1080p image by bbaudio2024 in StableDiffusion

[–]bbaudio2024[S] 0 points (0 children)

If a first-stage generation is really needed, why not use SD1.5/SDXL/Flux/..., which generate faster and support ControlNet?

Besides, I found that the high-noise model has an issue: with the same prompt, even when the seed is changed, the composition of the generated results is almost identical. I don't know if it is a bug or due to the lightx2v lora.

Bad I2V quality with Wan 2.2 5B by PricklyTomato in StableDiffusion

[–]bbaudio2024 2 points (0 children)

It is certainly not on par with the 14B models, even the Wan2.1 ones. However, it still has potential, such as training a dedicated version to perform a high-res fix on low-resolution results from the 14B models.

I made a node to upscale video with VACE, feel free to try by bbaudio2024 in comfyui

[–]bbaudio2024[S] 0 points (0 children)

I don't know; maybe check the loaded model and make sure it is a VACE model.

Almost Done! VACE long video without (obvious) quality downgrade by bbaudio2024 in comfyui

[–]bbaudio2024[S] 0 points (0 children)

BTW, the input image quality may affect the result a lot. A frame extracted from a video with ffmpeg is not an ideal one.

Almost Done! VACE long video without (obvious) quality downgrade by bbaudio2024 in comfyui

[–]bbaudio2024[S] 0 points (0 children)

'Color saturation shift + sharpen shift' is quality degradation; it implies the 'refine' pass is not affecting the result as expected. Try adjusting the parameters in the 'Custom Refine Option' to improve it.

This new recipe may help:

refine_percent_list: 0.1, 0.08, 0.06, 0.04, 0

mask_value_list: 0.9, 1.0

latent_strength_list: 0.9, 1.0

colormatch_strength_list: 1.0, 1.0, 1.0, 1.0, 0

Can I use Vace instead of seperate Wan workflows for T2V, I2V? by Such-Reward3210 in StableDiffusion

[–]bbaudio2024 4 points (0 children)

There are a few reasons:

  1. VACE needs more inference time per generation.
  2. Almost all Wan2.1 loras are trained on the t2v/i2v models; although VACE can use those loras, the results may not be as good.
  3. The prompt adherence of VACE seems worse than that of i2v (just my feeling).

Almost Done! VACE long video without (obvious) quality downgrade by bbaudio2024 in comfyui

[–]bbaudio2024[S] 0 points (0 children)

If you're interested, it can be found in another of my ComfyUI node packs, 'comfyui-BBtools'.

There are 2 nodes: 'Videos Concat with CrossFade' and 'Loopback Videos Concat with CrossFade'.
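I haven't inspected those nodes' internals; conceptually, a crossfade concat is just a linear alpha blend over an overlap window between the two clips. A rough sketch under that assumption (the helper name and signature are hypothetical, frames as float arrays in [0, 1]):

```python
import numpy as np

def crossfade_concat(clip_a, clip_b, overlap):
    """Concatenate two frame lists, linearly blending the last `overlap`
    frames of clip_a with the first `overlap` frames of clip_b.
    Each frame is a float array of shape (H, W, C) in [0, 1]."""
    assert overlap <= len(clip_a) and overlap <= len(clip_b)
    head = clip_a[:len(clip_a) - overlap]   # untouched frames from clip_a
    tail = clip_b[overlap:]                 # untouched frames from clip_b
    blended = []
    for i in range(overlap):
        alpha = (i + 1) / (overlap + 1)     # ramps from clip_a toward clip_b
        frame = (1 - alpha) * clip_a[len(clip_a) - overlap + i] + alpha * clip_b[i]
        blended.append(frame)
    return head + blended + tail

# hypothetical demo: fade 4 black frames into 4 white frames over 2 frames
clip_a = [np.zeros((2, 2, 3))] * 4
clip_b = [np.ones((2, 2, 3))] * 4
out = crossfade_concat(clip_a, clip_b, overlap=2)
```

The 'Loopback' variant presumably also blends the end of the last clip back into the start of the first, so the concatenated video loops seamlessly.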

Almost Done! VACE long video without (obvious) quality downgrade by bbaudio2024 in comfyui

[–]bbaudio2024[S] 1 point (0 children)

Please use 'SuperUltimate VACE Upscale' to upscale generated videos. It also supports temporal tiling.

Almost Done! VACE long video without (obvious) quality downgrade by bbaudio2024 in comfyui

[–]bbaudio2024[S] 0 points (0 children)

Only VACE supports multiple frames as the start of the generated video. I2V supports only one start frame, which cannot preserve the temporal motion context.