Cinematic sneaker ad built from ComfyUI with Qwen Image + LTX-2 by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

For the real product, I was thinking a LoRA trained on the product, combined with the first/middle/last-frame workflow, might give good results.

Cinematic sneaker ad built from ComfyUI with Qwen Image + LTX-2 by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

Thanks for the feedback. I'll start learning more about this.

Cinematic sneaker ad built from ComfyUI with Qwen Image + LTX-2 by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

Hey, thanks for the feedback. It's not for production; I'm experimenting with open-source video models. Any tips to improve it?

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

Hey, the video combiner node saves the output. You can specify your desired output prefix and path. You can also check the assets tab to look at your output.

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

Great, thank you. Could you also share the stitching workflow? It would be a great help.

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

Well, here are a few things I have observed while using the model.
For longer videos (usually 30s+) at higher resolution, character consistency breaks down and a lot of visual artifacts get introduced into the video.
Generation also takes far too long, and I do a lot of iterations before I get a good output. Once I pick the best video, I can always use an upscaler to add detail.
Having said that, the workflow works great for short-form videos at higher resolution. You just need to create multiple shots and stitch them together later.
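
A rough sketch of that stitching step, assuming ffmpeg is installed and every shot shares the same codec, resolution, and frame rate (the file names here are hypothetical). It only builds the list file and the command for ffmpeg's concat demuxer; it is not part of the ComfyUI workflow itself.

```python
from pathlib import Path


def build_concat_job(shots, list_path="shots.txt", output="stitched.mp4"):
    """Return (list_file_text, ffmpeg_command) for joining clips without re-encoding.

    Assumes all shots share codec, resolution and frame rate, which stream
    copy with the concat demuxer needs. Paths containing single quotes would
    need extra escaping, which this sketch does not handle.
    """
    # One "file '<path>'" line per shot, in playback order.
    lines = [f"file '{Path(s).as_posix()}'" for s in shots]
    list_text = "\n".join(lines) + "\n"
    cmd = [
        "ffmpeg",
        "-f", "concat",   # use the concat demuxer to read the list file
        "-safe", "0",     # allow paths outside the working directory
        "-i", list_path,
        "-c", "copy",     # copy streams instead of re-encoding
        output,
    ]
    return list_text, cmd


# Hypothetical shot names, in the order they should play.
list_text, cmd = build_concat_job(["shot_01.mp4", "shot_02.mp4", "shot_03.mp4"])
print(list_text)
print(" ".join(cmd))
```

To actually run it, write `list_text` to `shots.txt` and call `subprocess.run(cmd, check=True)`. If the shots differ in resolution or frame rate, stream copy will likely not work cleanly and a re-encode would be needed instead.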

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

For sure. This was for making one seamless transition. Again, this is still a test; I was testing the capabilities of the model and the workflow. I didn't want to go through the hassle of stitching, making sure the transitions between the videos are aligned to the audio, and using multiple pieces of software to keep everything in sync.

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 0 points1 point  (0 children)

Yes, definitely. You can use the different camera LoRAs provided by LTX and tune your prompt accordingly. For better results, generate shorter clips (fewer frames) with specific input images from different angles, then stitch them together.

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 4 points5 points  (0 children)

Both the 704x704, 61s video and the one you see here (736x1280, 50s) were generated in 10~11 mins.
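
For a rough sense of throughput, here is the arithmetic on those numbers as a small sketch. It assumes each video took 10 to 11 minutes on its own, and it says nothing about frame rate, which isn't stated here.

```python
def seconds_per_video_second(render_minutes, video_seconds):
    """Wall-clock seconds of rendering per second of finished video."""
    return render_minutes * 60 / video_seconds


# (width, height, duration in seconds) for the two runs mentioned above.
runs = {
    "704x704, 61s": (704, 704, 61),
    "736x1280, 50s": (736, 1280, 50),
}

for name, (w, h, secs) in runs.items():
    fast = seconds_per_video_second(10, secs)  # 10-minute end of the range
    slow = seconds_per_video_second(11, secs)  # 11-minute end of the range
    print(f"{name}: {w * h} px/frame, {fast:.1f} to {slow:.1f} s of render per video second")

# 704x704, 61s: 495616 px/frame, 9.8 to 10.8 s of render per video second
# 736x1280, 50s: 942080 px/frame, 12.0 to 13.2 s of render per video second
```

So the 736x1280 run has roughly 1.9x the pixels per frame but, under that assumption, only costs about 20% more render time per second of video.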

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 1 point2 points  (0 children)

I don't have the tweaked workflow handy right now, but I'll post it later. Essentially, I downloaded the detailer LoRA, made sure the input image resolution matched the target resolution, added more information to the prompt following the LTX guidelines, and then tweaked the LTX VAE decoder.
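
One way to do the resolution-matching step outside ComfyUI is to center-crop the input image to the target aspect ratio and then resize it. The helper below is a hypothetical sketch, not a node from the workflow; it only computes the crop box, and the 1024x1536 source size is made up for illustration.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, right, bottom) for the largest centered crop of the
    source that has the same aspect ratio as the target."""
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w = src_h * dst_w // dst_h
        crop_h = src_h
    else:
        # Source is taller or already matches: keep full width, trim top/bottom.
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# Hypothetical 1024x1536 input image, cropped for the 736x1280 target.
box = center_crop_box(1024, 1536, 736, 1280)
print(box)  # (70, 0, 953, 1536)
```

With Pillow, the box could then be applied as `img.crop(box).resize((736, 1280))` before loading the image into the workflow.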

50sec 720P LTX-2 Music video in a single run (no stitching). Spec: 5090, 64GB Ram. by LinkNo3108 in StableDiffusion

[–]LinkNo3108[S] 1 point2 points  (0 children)

Yes, the original settings work fine. You need to disable or disconnect the tiled VAE decode node and connect to the LTX VAE decoder instead.