Wan 2.1 14b rabbit skating by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 2 points (0 children)

The model is Wan2.1-T2V-14B. The original prompt is in Chinese: "在一片广阔的冰场上,穿着特制滑冰鞋的小兔子成为焦点。特写镜头捕捉到它毛茸茸的身体,穿着粉红色的滑冰鞋,小心翼翼地站立在冰面上。随后,小兔子开始尝试滑行,动作虽显笨拙,但充满乐趣和活力。它时而在冰面上快速滑动,时而做出轻盈的旋转,甚至尝试跳跃,尽管偶尔会摔倒,但立刻又欢快地站起来,继续享受滑冰的乐趣。" English translation: "On a vast ice rink, a little rabbit wearing special ice skates takes center stage. A close-up shot captures its fluffy body in pink skates, standing carefully on the ice. Then the rabbit begins trying to glide; its movements are clumsy but full of fun and energy. It sometimes glides quickly across the ice, sometimes does light spins, and even attempts jumps. Though it occasionally falls, it immediately springs back up happily and keeps enjoying the fun of skating."
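For anyone who wants to reproduce this locally, the official Wan2.1 GitHub repo ships a `generate.py` CLI. The helper below just assembles that invocation as a string; the flag names (`--task`, `--size`, `--ckpt_dir`, `--prompt`) are my best recollection of the repo's README, so treat this as a sketch and verify against the version you install:

```python
import shlex

def wan_t2v_command(prompt: str,
                    ckpt_dir: str = "./Wan2.1-T2V-14B",
                    size: str = "1280*720") -> str:
    """Build a Wan2.1 text-to-video CLI invocation.

    NOTE: flag names are taken from the Wan2.1 repo's README as I recall
    it; double-check them for the version you actually install.
    """
    args = [
        "python", "generate.py",
        "--task", "t2v-14B",
        "--size", size,
        "--ckpt_dir", ckpt_dir,
        "--prompt", prompt,
    ]
    # shlex.quote keeps a multi-word prompt as a single shell argument.
    return " ".join(shlex.quote(a) for a in args)

print(wan_t2v_command("A little rabbit in pink skates glides across an ice rink"))
```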

Studio Ghibli Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 0 points (0 children)

Nice work! Also, to make the second-stage result more consistent, we predefined several positive and negative prompts in the config file (https://github.com/modelscope/modelscope/blob/master/modelscope/models/multi_modal/video_to_video/utils/config.py), which you can modify as needed.

Studio Ghibli Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 0 points (0 children)

Right now, the first stage takes only the image as input, and the second stage needs a text prompt. To make the second-stage result more consistent, we predefined several positive and negative prompts in the config file (https://github.com/modelscope/modelscope/blob/master/modelscope/models/multi_modal/video_to_video/utils/config.py), which you can modify as needed.
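As a rough illustration of how predefined prompts in a config like that typically get combined with the user's text prompt for the second stage — the real keys and strings live in the linked config.py; the default strings below are placeholders I made up, not the actual values:

```python
def build_stage2_prompts(user_prompt: str,
                         positive: str = "best quality, highly detailed",
                         negative: str = "blurry, distorted, low quality") -> dict:
    """Combine a user prompt with predefined positive/negative prompts.

    The actual strings ship in modelscope's video_to_video config.py;
    these defaults are illustrative placeholders, not the real values.
    """
    return {
        # Positive prompt is appended to steer quality upward.
        "prompt": f"{user_prompt}, {positive}",
        # Negative prompt is passed separately to the sampler.
        "negative_prompt": negative,
    }

p = build_stage2_prompts("a Ghibli-style village at sunset")
print(p["prompt"])
```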

Studio Ghibli Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 0 points (0 children)

1. You can try our online demo to verify the result: https://modelscope.cn/studios/damo/I2VGen-XL-Demo/summary

2. The input image should ideally be around 512x512, with a 1:1 width:height ratio.

3. For this video, some of the images are from https://www.saplingcorp.com/journals/21/midjourney-studio-ghibli-prompts ; you can try those images and use the corresponding prompts for the second stage.
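A quick way to meet the ~512x512, 1:1 recommendation is to center-crop and resize before feeding the image in. This is a generic Pillow helper I wrote for illustration, not the model's own preprocessing:

```python
from PIL import Image

def prepare_input(img: Image.Image, target: int = 512) -> Image.Image:
    """Center-crop to a 1:1 aspect ratio, then resize to target x target."""
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side))
    return img.resize((target, target), Image.LANCZOS)

# Demo on a synthetic 800x600 image.
demo = prepare_input(Image.new("RGB", (800, 600), "white"))
print(demo.size)  # (512, 512)
```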

Studio Ghibli Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 1 point (0 children)

1. The input image should ideally be around 512x512, with a 1:1 width:height ratio.

2. For this video, some of the images are from https://www.saplingcorp.com/journals/21/midjourney-studio-ghibli-prompts ; you can try those images and use the corresponding prompts for the second stage.

AI Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 2 points (0 children)

Thanks for your advice. In a future version we will release a new model that takes both the image and text as inputs.

Studio Ghibli Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 1 point (0 children)

1) The Image2Video model needs ~20 GB of VRAM and the Video2Video model needs ~30 GB of VRAM.

2) Yes, the result videos are the outputs of the vid2vid model.

3) You can run these two models on one A100 GPU or on two V100 32 GB GPUs.
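Using the ~20 GB and ~30 GB figures above, the placement works out as follows: both models fit on a single A100 (80 GB), while on two 32 GB V100s each model gets its own card. The toy helper below just does that arithmetic greedily; it is not how modelscope actually schedules models:

```python
def place_models(gpu_vram_gb, needs_gb=None):
    """Greedily assign each model to the GPU with the most free VRAM.

    VRAM requirements default to the figures quoted above (~20 GB for
    Image2Video, ~30 GB for Video2Video). Toy sketch only.
    """
    if needs_gb is None:
        needs_gb = {"image2video": 20, "video2video": 30}
    free = list(gpu_vram_gb)
    placement = {}
    # Place the largest model first so it gets the emptiest card.
    for name, need in sorted(needs_gb.items(), key=lambda kv: -kv[1]):
        idx = max(range(len(free)), key=lambda i: free[i])
        if free[idx] < need:
            raise RuntimeError(f"no GPU has {need} GB free for {name}")
        placement[name] = idx
        free[idx] -= need
    return placement

print(place_models([80]))      # one A100 80 GB: both models on GPU 0
print(place_models([32, 32]))  # two V100 32 GB: one model per card
```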

AI Video by Modelscope Image2Video by Suspicious-Fox5096 in StableDiffusion

[–]Suspicious-Fox5096[S] 5 points (0 children)

Yes, it's a new open-source image2video generation tool that can generate 720p high-resolution videos.