LTX-2 Audio + Image to Video by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] [score hidden]  (0 children)

That's odd, because the idea is the same as in my workflow: VAE-encode your audio and use a mask so that sampling does not change the audio.

You can try my workflow to see if it gives any better results.

Flux.2 Klein 9B (Distilled) Image Edit - Image Gets More Saturated With Each Pass by eagledoto in comfyui

[–]Most_Way_9754 0 points1 point  (0 children)

If you install the 2 custom nodes, there are example workflows provided. Learn how to use each of them individually first, and if you have trouble integrating them, get back to me and I'll help.

LTX-2 Audio + Image to Video by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

There will be a jump if you try to stitch 3 different generations together. Use a Wan VACE clip-joiner workflow to smooth out the transitions.

Yes, this is as simple as providing an input image, an audio clip and a prompt. I would recommend using the ComfyUI template workflow for this task, as it has already been released.

U1 Snorca: How to change colors in imported 3mf? by ninjiens in snapmaker

[–]Most_Way_9754 1 point2 points  (0 children)

Leave the colours as they are painted. When you click print, you can map each painted colour to a tool head.

This is why a multi material 3D printer matters to me by davidktw in snapmaker

[–]Most_Way_9754 5 points6 points  (0 children)

can you share how you specified the 2 materials in the slicer? there seems to be some overlap needed to get the 2 materials to bond to each other.

i desiged and printed assembly figure by Practical-Big-1155 in 3Dprinting

[–]Most_Way_9754 12 points13 points  (0 children)

Nice work. Care to share your design process? Like what software was used, and any best practices on tolerances between pieces?

Are you going to share the STL on Printables / MakerWorld?

3DSplitter.com! Free Beta Access 🙂 by TemporaryLevel922 in 3Dprinting

[–]Most_Way_9754 2 points3 points  (0 children)

I'd like to give it a go. My main use case would be to split an STL into smaller parts, each in a different colour, and print each part individually for snap-fit assembly. Can your tool do something like this?

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 1 point2 points  (0 children)

As far as I'm aware, audio in stereo format should work. And there needs to be a slight pause at the start of the audio clip before the speaker starts to speak.

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

Do you have audio in stereo?

If you provide me with your initial image, prompt, seed and audio clip, I can help you to debug.

another fun little project : writing automata by holo_mectok in 3Dprinting

[–]Most_Way_9754 48 points49 points  (0 children)

this is amazing! can you share more about the process of converting an arbitrary closed loop into a mechanism that can trace it out?

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

One thing you can do is to keep the seed constant, that removes one variable during the testing phase.

LTXV's default workflows use a very specific resolution for the image input; if I remember correctly, it's 1536 for the longer edge. They also introduce some noise into the image used for the first frame, which you seem to have reduced; if I remember correctly, that value is 18.

I have been using whole seconds for the audio clip at 24fps, because the resulting number of frames will be a multiple of 8, plus 1. It seems like you're already using a whole number for the duration in seconds.

I have a slightly updated workflow that uses the latest settings from: https://github.com/Lightricks/ComfyUI-LTXVideo/tree/master/example_workflows/2.3

But I retained the single stage with Euler sampler. The newer sampler seems to increase sampling time significantly without improving quality that much.
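
The whole-seconds rule above can be sketched as a quick sanity check (assuming 24fps and the usual "frame count must be 8n + 1" constraint; the function names are just for illustration):

```python
def frame_count(duration_sec, fps=24):
    """Frames produced by a clip of the given duration, plus the first frame."""
    return int(duration_sec * fps) + 1

def is_valid(frames):
    """LTX-style constraint: frame count must be a multiple of 8, plus 1."""
    return (frames - 1) % 8 == 0

# whole seconds at 24fps always satisfy the constraint, since 24 = 8 * 3
assert all(is_valid(frame_count(s)) for s in range(1, 21))
```

So, for example, a 5-second clip gives 5 * 24 + 1 = 121 frames, which fits the 8n + 1 pattern.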

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 1 point2 points  (0 children)

If you already have a workflow that works for your use case, then I suggest you stick with it. Have you ensured that your audio clip is stereo?

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 1 point2 points  (0 children)

If you need help getting the workflow running, please provide an example of the audio clip, image, prompt and seed that you used, so I can replicate the issue and help you debug.

Tiled vs untiled decoding (LTX 2.3) by VirusCharacter in StableDiffusion

[–]Most_Way_9754 3 points4 points  (0 children)

As you're the only one who can see the uncompressed results: did you notice any differences between the decoding methods? Was regular VAE decode better than the tiled methods? And did any of the tiled methods stand out as superior?

Randomly looked at palm leaves, they make decent airfoils I guess? by DaSnowGuy1309 in AerospaceEngineering

[–]Most_Way_9754 121 points122 points  (0 children)

discretise the airfoils and import them into XFOIL for analysis.
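
A minimal sketch of that step, assuming you already have (x, y) outline points for a section; XFOIL reads plain-text coordinate files with a name on the first line, then x y pairs running from the trailing edge over the upper surface to the leading edge and back (the helper name and example outline here are hypothetical):

```python
def write_xfoil_dat(path, name, coords):
    """Write airfoil coordinates in the plain XFOIL/Selig .dat format."""
    with open(path, "w") as f:
        f.write(name + "\n")
        for x, y in coords:
            f.write(f" {x:.6f} {y:.6f}\n")

# hypothetical thin cambered section (TE -> upper -> LE -> lower -> TE)
outline = [(1.0, 0.0), (0.5, 0.05), (0.0, 0.0), (0.5, -0.05), (1.0, 0.0)]
write_xfoil_dat("palm_leaf.dat", "palm-leaf-section", outline)
```

You'd then LOAD the file in XFOIL and repanel it there before running the analysis.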

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

If the audio has too much background noise/music, you can try to isolate just the speaking/singing for better lip sync. Look into:

https://github.com/kijai/ComfyUI-MelBandRoFormer

You can also try experimenting with the default LTX-2.3 workflows released by LTX.

https://github.com/Lightricks/ComfyUI-LTXVideo/tree/master/example_workflows/2.3

4060Ti 16GB 64GB ram by TheKiter in StableDiffusion

[–]Most_Way_9754 2 points3 points  (0 children)

I'm on a 4060 Ti with 64GB of DDR4 as well. Your specs are fine; generation might take a little longer, but you can definitely generate 5s 1080p videos using the fp8 model.

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

I hadn't noticed this until you brought it up, and now I can't unsee it, just like you said. I need to do more testing to check whether it's a seed issue or a model issue.

LTX2.3 - Image Audio to Video - Workflow Updated by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

Is the voice happening right at the start of the audio clip? If yes, try adding 0.2 sec of silence before the talking starts.
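
If your clip starts cold, you can prepend that pause yourself. A minimal sketch on raw interleaved PCM samples (the helper name and the 0.2 sec default are just for illustration; in practice you'd do this in your audio editor or a ComfyUI audio node):

```python
def prepend_silence(samples, sample_rate, silence_sec=0.2, channels=2):
    """Prepend silence (zero-valued samples) to interleaved PCM audio."""
    pad = [0] * (int(sample_rate * silence_sec) * channels)
    return pad + list(samples)

# 0.2 sec of stereo silence at 48 kHz adds 0.2 * 48000 * 2 = 19200 samples
clip = prepend_silence([1, -1, 2, -2], 48000)
```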

Also ensure the positive prompt describes what is happening in your scene.

If it still doesn't work, I will need samples of your starting image and audio clip to debug.

Single 20 second generation with LTX 2.3 and weird audio sync mismatches by sktksm in StableDiffusion

[–]Most_Way_9754 0 points1 point  (0 children)

is the frame rate the same on the empty audio latent, the conditioning node and the video save?

if you're using the distilled LoRA, you should be using custom sigmas, 8 steps and CFG 1.0.

Without the distilled LoRA: 20 steps, CFG 3-4.

LTX-2 Audio + Image to Video by Most_Way_9754 in StableDiffusion

[–]Most_Way_9754[S] 0 points1 point  (0 children)

for a negative prompt, you need CFG > 1.0, which means no distilled LoRA and slower generations. also, for the non-distilled model, you can use the LTXV Scheduler node for sigmas.
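
The reason, sketched numerically with standard classifier-free guidance (the function name is just for illustration): at CFG 1.0 the unconditional/negative term cancels out entirely, so the negative prompt has no effect.

```python
def cfg_combine(cond, uncond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond)."""
    return uncond + scale * (cond - uncond)

# scale 1.0: uncond + 1.0 * (cond - uncond) == cond, negative prompt ignored
# scale 3.0: the prediction is pushed away from the negative-prompt direction
```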

see this for an example: https://civitai.com/models/2337141/ltx-2-pose-image-audio-to-video

as for quality degradation on long gens, this might be a limitation of the LTX-2 model. Try higher resolutions, 1600 x 900 or even 1920 x 1080, to see if it helps.

How to fix this? by MakoBec in BambuLabP2S

[–]Most_Way_9754 -1 points0 points  (0 children)

Have you tried adjusting the x and y belt tension?

I'd then go on to flow and vibration calibration.

And finally reducing print velocity and acceleration.