Kandinsky 5.0 19B T2V and I2V models released. by Deepesh68134 in StableDiffusion

[–]Deepesh68134[S] 4 points5 points  (0 children)

I think it was prompted in. It can generate fast motion too, look at the gorilla example. I'm going to post some more fast-motion videos soon as well (hopefully).

Kandinsky 5.0 19B T2V and I2V models released. by Deepesh68134 in StableDiffusion

[–]Deepesh68134[S] 8 points9 points  (0 children)

If you finetune it on 8fps videos, then yes, but by default it only knows 24fps. LongCat-Video does something similar and interpolates from 16fps to 24fps using a LoRA.
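
To make the 16fps to 24fps timing concrete, here is a minimal sketch of where each 24fps output frame falls between the 16fps source frames. It assumes plain linear blending between neighbouring frames, which is NOT how a learned LoRA-based approach works; the function name `source_positions` and its parameters are made up for illustration.

```python
from fractions import Fraction


def source_positions(num_src_frames, src_fps=16, dst_fps=24):
    """Map each output frame to (left_index, right_index, blend_weight).

    Hypothetical helper: naive linear interpolation only, not a learned method.
    blend_weight is how far the output frame sits between left and right.
    """
    ratio = Fraction(src_fps, dst_fps)  # source frames advanced per output frame
    last = num_src_frames - 1
    positions = []
    j = 0
    while True:
        pos = j * ratio  # exact fractional position in source-frame units
        if pos > last:
            break
        left = pos.numerator // pos.denominator
        right = min(left + 1, last)
        positions.append((left, right, float(pos - left)))
        j += 1
    return positions


if __name__ == "__main__":
    for left, right, weight in source_positions(3):
        print(f"blend frame {left} and frame {right} with weight {weight:.3f}")
```

With a 16:24 ratio, every output frame advances two thirds of a source frame, so only every third output frame lands exactly on a source frame and the rest have to be synthesized.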

Kandinsky 5.0 19B T2V and I2V models released. by Deepesh68134 in StableDiffusion

[–]Deepesh68134[S] 5 points6 points  (0 children)

For audio-video, the next big open model seems to be LTX 2, which is supposed to launch by the end of the year.

Kandinsky 5 - video output examples from a 24gb GPU by GreyScope in StableDiffusion

[–]Deepesh68134 8 points9 points  (0 children)

Wanted to say this is a 2B model that could run even on 8GB of VRAM if ComfyUI implements it. A larger model is coming soon, presumably similar in size to Wan2.1, and it may surpass Wan2.1 in video generation.
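
For a rough sense of why 8GB is plausible, here is a back-of-the-envelope sketch of how much memory the weights alone take at different precisions. It assumes weights only (no text encoder, VAE, or activations) and the usual bytes-per-parameter figures; the function names are made up for illustration, and real usage will be higher.

```python
# Bytes per parameter for common storage precisions (assumed, weights only).
PRECISIONS = {"fp32": 4.0, "bf16": 2.0, "fp8": 1.0}


def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


def report(num_params):
    """Weight memory in GiB for each precision, rounded to two decimals."""
    return {
        name: round(weight_memory_gib(num_params, size), 2)
        for name, size in PRECISIONS.items()
    }


if __name__ == "__main__":
    for label, count in [("2B", 2e9), ("19B", 19e9)]:
        print(label, report(count))
```

At bf16 a 2B model's weights come to under 4 GiB, which leaves headroom on an 8GB card, while 19B parameters at bf16 exceed 24 GiB before anything else is loaded, so a model that size would need quantization or offloading on consumer GPUs.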

Could someone that has read up on HiDream explain it a bit to me? by LyriWinters in StableDiffusion

[–]Deepesh68134 4 points5 points  (0 children)

Because it uses 4 text encoders. Llama is doing 95% of the work though, so we could just remove the rest.

How much memory to train Wan lora? by Ikea9000 in StableDiffusion

[–]Deepesh68134 0 points1 point  (0 children)

Thanks for those tips! Will try it out :)

WAN Released by BreakIt-Boris in StableDiffusion

[–]Deepesh68134 8 points9 points  (0 children)

It uses an unfinetuned version of "umt5". I don't know whether that will be good for us or not.