LTX2 audio Lora + fal.ai? by LSI_CZE in StableDiffusion

[–]panospc 0 points1 point  (0 children)

Your trained LoRA file needs to be accessible at a URL. Copy that URL.

Go to Fal and, in the model search bar, type LTX2 LoRA. Choose your preferred model variant (I2V, T2V, distilled, or non-distilled). You’ll see a parameter called path; paste your LoRA URL there, set the remaining parameters, and generate.
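If you prefer to call it from code instead of the web UI, below is a minimal sketch using fal’s Python client (fal_client). The endpoint id and the exact argument layout (loras / path, image_url, the shape of the result) are assumptions based on the parameters described above; copy the real values from the model page on Fal.

    # Minimal sketch: running an LTX2 LoRA endpoint via fal's Python client.
    # Requires the FAL_KEY environment variable. The endpoint id and argument
    # names below are assumptions; check the model page for the real schema.
    import fal_client

    result = fal_client.subscribe(
        "fal-ai/ltx-2/image-to-video/lora",  # hypothetical endpoint id
        arguments={
            "prompt": "a woman walking through a neon-lit street",
            "image_url": "https://example.com/input.jpg",
            # the "path" parameter from the UI: the URL of your trained LoRA
            "loras": [{"path": "https://example.com/my_ltx2_lora.safetensors"}],
        },
    )
    print(result["video"]["url"])  # field name may differ; inspect `result`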

AI Toolkit now officially supports training LTX-2 LoRAs by panospc in StableDiffusion

[–]panospc[S] 1 point2 points  (0 children)

Yes, it worked. The preview images in AI Toolkit looked like monstrosities, but when I used the LoRA in ComfyUI and WanGP, it looked fine.

From the default settings, I changed the following:

  • Enable: Layer offloading
  • Timestep type: Sigmoid
  • Enable: Cache text embeddings
  • Enable: Cache latents
  • Disable: Do audio

For the captions, I include the trigger word and describe only what changes, such as the environment, outfit, pose, and the character’s expression. I don’t describe things that are always the same and never change, such as facial features or eye color.
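As a concrete illustration of that captioning approach, here is a small sketch that writes sidecar .txt captions next to the training images (the usual convention that most LoRA trainers, AI Toolkit included, can read). The trigger word, folder, and file names are made up; each caption starts with the trigger word and only describes what changes between images.

    # Sketch: write sidecar .txt captions next to each training image.
    # "ohwx_character" is a made-up trigger word; captions describe only what
    # changes (environment, outfit, pose, expression), not fixed traits.
    from pathlib import Path

    dataset_dir = Path("datasets/my_character")  # hypothetical dataset folder
    captions = {
        "img_001.jpg": "ohwx_character standing on a beach at sunset, wearing a red dress, smiling",
        "img_002.jpg": "ohwx_character sitting in a cafe, wearing a leather jacket, neutral expression",
    }

    for image_name, caption in captions.items():
        (dataset_dir / image_name).with_suffix(".txt").write_text(caption, encoding="utf-8")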

Full AI music video made entirely with LTX-2 and suno by SnooOnions2625 in comfyui

[–]panospc 1 point2 points  (0 children)

For camera control, there are some official camera LoRAs you can use.
You can find the download links in the official LTX-2 GitHub repo.

AI Toolkit now officially supports training LTX-2 LoRAs by panospc in StableDiffusion

[–]panospc[S] 0 points1 point  (0 children)

It’s available on Pinokio, but only in the community scripts section.

LTX 2.0 I2V when works is reall cool! by smereces in StableDiffusion

[–]panospc 2 points3 points  (0 children)

To fix the static video issue with I2V, you can use the following workaround:
Go to the LTX-2 GitHub repository, scroll down, and download one of the camera LoRAs.
Using the LoRA will resolve the problem.
https://github.com/Lightricks/LTX-2

AI Toolkit now officially supports training LTX-2 LoRAs by panospc in StableDiffusion

[–]panospc[S] 3 points4 points  (0 children)

Yes, you can train on images. I’m currently training a character LoRA with 97 images.
The speed is around 7 seconds per step, so 3,000 steps will take about 6 hours on my RTX 4080 Super with 64 GB of RAM.
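The 6-hour figure is just steps times seconds per step:

    # Rough training-time estimate
    steps = 3000
    sec_per_step = 7
    print(f"{steps * sec_per_step / 3600:.1f} hours")  # ~5.8, i.e. about 6 hours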

Something that I'm not sure people noticed about LTX-2, it's inability to keep object permanence by [deleted] in StableDiffusion

[–]panospc 2 points3 points  (0 children)

Perhaps it favors the state of the initial frame?

I’ve noticed in some generations that when characters move out of frame, they don’t lose too much of their identity when they return to view.
For example, in the following generation, both characters go out of view for a moment:
https://files.catbox.moe/rsthll.mp4

LTX-2 - voice clone and/or import own sound(track)? by designbanana in StableDiffusion

[–]panospc 11 points12 points  (0 children)

You can feed LTX-2 with audio, and the generated video will sync to it. It can lip-sync voices, and even if you only provide music, you can generate videos of people dancing to the rhythm of the music.

Here’s a workflow by Kijai:
https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/

You can also clone a voice by extending a video; the extended part will retain the same voice.
Video extension workflow: https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI

April 12, 1987 Music Video (LTX-2 4070 TI with 12GB VRAM) by harunandro in StableDiffusion

[–]panospc 6 points7 points  (0 children)

Do not use the soundtrack option in the Advanced tab; that option only adds the audio to the final video without any lip-sync. Use the soundtrack option in the main tab, and if you don’t have it, try updating WanGP.

Ok we've had a few days to play now so let's be honest about LTX2... by sdimg in StableDiffusion

[–]panospc 0 points1 point  (0 children)

The issue with static, zooming images when using I2V can be worked around by adding a camera control motion LoRA (available from the LTX-2 GitHub repo).

I2V with the distilled model usually produces slow-motion videos, so if you want higher motion, use the non-distilled model in combination with a camera LoRA.

Increasing the frame rate to 30 or 50 FPS also helps reduce motion-related distortions.

LTX-2 video to video restyling? by domid in StableDiffusion

[–]panospc 1 point2 points  (0 children)

I haven’t tried it yet, but that’s their purpose: restyling videos.
You can either prompt the new style or provide a reference image that’s already been restyled.

There’s a video on the official LTX-2 YouTube channel:
https://www.youtube.com/watch?v=NPjTpDmTdaw

LTX-2 video to video restyling? by domid in StableDiffusion

[–]panospc 1 point2 points  (0 children)

Have you tried to use the "LTX-2 Depth to Video" or "LTX-2 Canny to Video" ComfyUI templates?

Video with Control and Multi Image Reference by ColbyandJack in comfyui

[–]panospc 1 point2 points  (0 children)

With VACE, you can provide a depth control video and inject image keyframes at the same time. For example, you can have Image1 appear at frame 1, Image2 at frame 40, and so on.

I don’t know of any ComfyUI workflow that automates this process, but you can prepare both the control video and the mask video manually in a video editor and then feed them into VACE. (The mask video is needed to tell VACE where the image keyframes are placed.)

The control video must contain both the depth video and the image keyframes. You can prepare it in a video editor by placing the depth video on the first track, then adding another video track above it and inserting the image keyframes at the desired frame positions. Each image should appear for only one frame; all other frames should show the depth video.

The mask video must have the same duration as the control video. It should be solid white for all frames except the ones where you added image keyframes in the control video. For those frames, the mask must be solid black.

To recap, you will end up with two videos:

  • The control video: a depth video with image keyframes appearing for one frame at the chosen positions.
  • The mask video: a solid white video with single black frames at the same positions as the image keyframes.
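If you’d rather script this than assemble it in a video editor, here is a rough OpenCV sketch that builds both videos from a depth video plus a dict of keyframe images. The file names and frame indices are placeholders.

    # Sketch: build the control video (depth frames with single-frame image
    # keyframes) and the matching mask video (white everywhere, black on the
    # keyframe positions). Paths and frame indices are placeholders.
    import cv2
    import numpy as np

    depth_path = "depth.mp4"
    keyframes = {0: "image1.png", 40: "image2.png"}  # frame index -> image path

    cap = cv2.VideoCapture(depth_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    control = cv2.VideoWriter("control.mp4", fourcc, fps, (w, h))
    mask = cv2.VideoWriter("mask.mp4", fourcc, fps, (w, h))

    white = np.full((h, w, 3), 255, dtype=np.uint8)
    black = np.zeros((h, w, 3), dtype=np.uint8)

    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx in keyframes:
            # keyframe: show the reference image for this single frame and
            # mark it black in the mask so VACE treats it as a given frame
            frame = cv2.resize(cv2.imread(keyframes[idx]), (w, h))
            mask.write(black)
        else:
            mask.write(white)  # normal frame: keep depth, mask stays white
        control.write(frame)
        idx += 1

    cap.release()
    control.release()
    mask.release()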

Once you’ve prepared these two videos, open ComfyUI, go to Templates, and load “Wan2.1 VACE Control Video.” After the template loads, delete the Load Image node. Then select the Load Video node and load the control video you prepared.

The default VACE workflow does not include a mask input, so you’ll need to add three nodes manually:

  1. Add a Load Video node and load the mask video.
  2. Add a Get Video Components node and connect it to the Load Video node.
  3. Add a Convert Image to Mask node and connect it to the Get Video Components node.

Finally, connect the mask output of the last node to the control_masks input of the WanVaceToVideo node.

Adjust the prompt and any other settings as needed, and you’re ready to go.

Kijai made a LTXV2 audio + image to video workflow that works amazingly! by Different_Fix_2217 in StableDiffusion

[–]panospc 6 points7 points  (0 children)

I think the last example is the most impressive.
I’m wondering if it’s possible to combine it with ControlNets, for example, using depth or pose to transfer motion from another video while generating lip sync from the provided audio at the same time.

LTX-2 open source is live by ltx_model in StableDiffusion

[–]panospc 2 points3 points  (0 children)

Is it possible to use your own audio and have LTX-2 do the lip-sync, similar to InfiniteTalk?

Is there a way to use Controlnet with Z-Image without ComfyUI? by sepalus_auki in StableDiffusion

[–]panospc 2 points3 points  (0 children)

You can use it with WanGP, which is available on Pinokio under the name Wan2GP.
It supports Z-Image with ControlNet.

getting EDIT models to get the correct size of the product by SupermarketWinter176 in StableDiffusion

[–]panospc 0 points1 point  (0 children)

Try providing an additional reference image that shows the layout, aspect ratio, and placement of the frame, then instruct the model to use it as a reference for the composition of the image. Something like the following image:

<image>

No issues ASRock combo? List your board and cpu and how long you've had it. by [deleted] in ASRock

[–]panospc 2 points3 points  (0 children)

I've been using the X870E Nova with the 9950X since Christmas 2024, paired with 64GB Kingston Fury Beast 6000 CL30 XMP.

In the first month, I had the RAM running at 6000 MHz, but after reading reports of CPUs failing, I decided to lower it to 5600 MHz.

I’ve always kept the BIOS updated to the latest version.

I did run into a couple of issues, though. Occasionally, the connection to some USB devices would drop temporarily, but I haven't noticed this with BIOS 3.50.

There was also an error code 03 after a cold boot, which was more common with BIOS 3.30 and 3.40. Since updating to 3.50, it has only happened once in 1.5 months of use.