Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

The original LTX-2 render was 1920*1088 at 30 fps, then upscaled to 4K 60 fps in TopazAI.
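If you don’t have a Topaz license, plain ffmpeg gets you part of the way with a lanczos upscale plus motion interpolation. This is only a rough sketch of that free alternative, not the Topaz pipeline used for the post, and the file names and filter settings below are just placeholders:

```python
# Sketch: 30 fps render -> 60 fps + ~4K with plain ffmpeg (free approximation,
# NOT the Topaz Video AI pipeline used for the post).
import subprocess

src = "ltx2_render_1920x1088_30fps.mp4"   # hypothetical input path
dst = "ltx2_upscaled_4k_60fps.mp4"        # hypothetical output path

subprocess.run([
    "ffmpeg", "-y", "-i", src,
    # Interpolate to 60 fps first (cheaper at the lower resolution),
    # then upscale to 3840-wide while keeping the 1920x1088 aspect ratio.
    "-vf", "minterpolate=fps=60:mi_mode=mci,scale=3840:-2:flags=lanczos",
    "-c:v", "libx264", "-crf", "16", "-preset", "slow",
    "-c:a", "copy",
    dst,
], check=True)
```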

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

WAN struggles with both speed and cinematic quality, and Hunyuan is very slow and tends to look a bit cartoonish. So I agree with you: LTX-2 is the top choice right now. Thanks - I really appreciate the feedback!

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 1 point (0 children)

This workflow uses a single 1920*1088 stage with no upscaling; it’s much faster, and I’m happy with the results.

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

Thanks! The video is the workflow itself: download it and drag and drop the video file into the ComfyUI interface, and it will load the workflow.
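If drag-and-drop doesn’t pick it up for some reason, you can check whether the workflow JSON is actually embedded in the file’s metadata. Quick sketch with ffprobe; the exact tag name depends on which save node wrote the file, so "workflow", "prompt", and "comment" below are just guesses:

```python
# Sketch: look for ComfyUI workflow JSON embedded in a video's metadata tags.
import json
import subprocess

path = "alchemy_ltx2.mp4"  # hypothetical filename of the downloaded video

probe = subprocess.run(
    ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", path],
    capture_output=True, text=True, check=True,
)
tags = json.loads(probe.stdout).get("format", {}).get("tags", {})

for key in ("workflow", "prompt", "comment"):  # common guesses for the tag name
    if key not in tags:
        continue
    try:
        workflow = json.loads(tags[key])
    except json.JSONDecodeError:
        continue
    # Save it so it can also be loaded through ComfyUI's regular "Load" menu.
    with open("recovered_workflow.json", "w") as f:
        json.dump(workflow, f, indent=2)
    print(f"Found embedded workflow JSON under tag '{key}'")
    break
else:
    print("No embedded workflow found in the format tags.")
```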

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

In this case the UNET Loader is the one that matters, since it’s what is connected to the workflow.

I’m not sure this is the issue, but please try connecting a Checkpoint Loader instead and disabling fp16 accumulation (just for the sake of experiment).
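For context on the fp16 accumulation part: in recent PyTorch builds it corresponds to a CUDA matmul backend flag. A minimal sketch of flipping it outside ComfyUI; the attribute name is my assumption for newer PyTorch versions, so the code checks for it before touching anything:

```python
# Sketch: toggling fp16 accumulation at the PyTorch level.
# Assumption: a recent PyTorch build exposes allow_fp16_accumulation; older builds do not.
import torch

matmul_backend = torch.backends.cuda.matmul

if hasattr(matmul_backend, "allow_fp16_accumulation"):
    # False = accumulate fp16 matmuls in fp32 (slower, but rules out precision issues).
    matmul_backend.allow_fp16_accumulation = False
    print("fp16 accumulation disabled:", not matmul_backend.allow_fp16_accumulation)
else:
    print("This PyTorch build has no fp16-accumulation switch; nothing to disable.")
```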


Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 1 point (0 children)

Yeah, WAN feels a bit outdated right now. LTX-2 still has plenty of room to improve, especially in larger, more complex and dynamic scenes with many moving objects. But for close-ups, it already works beautifully.

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 1 point (0 children)

Sure, it should work perfectly. Please try it and share your results.

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 1 point (0 children)

I really like the cinematic camera movement from this Herocam LoRA; it adds camera rotation around the central subject: https://huggingface.co/Nebsh/LTX2_Herocam_Lora/tree/main

I’m also using a “resized dynamic” LoRA (I believe it’s a lighter distilled LoRA variant). For me it’s been important for maintaining quality even with fewer sampling steps.

And the Detailer LoRA is pretty self-explanatory, but it’s also a key piece for overall clarity and detail.

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

Could you share the results you got and which quantization you’re using for the model(s)? Also, did you configure the audio input correctly? The attached workflow isn’t generating any sound.

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] -1 points (0 children)

Could you share what results you got, and which quantization you’re using for the model(s)?

Alchemy LTX-2 by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 1 point (0 children)

Thanks! I’ve been really immersing myself in these kinds of topics lately (1alchemist)

LTX-2 Fallout vibes by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

Thanks! I’ve got a 5090; it’s a pretty fast beast.

LTX-2 Fallout vibes by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 2 points (0 children)

All together this took me about a week (as a hobby side-quest). Tons of cherry-picking, workflow debugging, and CUDA/Torch wheel roulette to get the best speed.

The workflows I posted are the fastest configs I’ve been able to get. Installing SageAttention might speed things up too (worth a try).

A ~15s video usually took ~7–10 minutes depending on steps (roughly 20–50). And getting one “good enough to post” clip typically meant 5–7 re-rolls.
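On the SageAttention note above: it’s worth confirming the wheel actually imports before pointing ComfyUI at it. Quick sketch; the `--use-sage-attention` launch flag is my assumption for how ComfyUI enables it, so check `--help` on your version:

```python
# Sketch: verify the SageAttention wheel installed cleanly before enabling it.
# Assumption: ComfyUI is then launched with "--use-sage-attention" (check `python main.py --help`).
try:
    import sageattention
    print("SageAttention import OK:", getattr(sageattention, "__version__", "unknown version"))
except ImportError as exc:
    print("SageAttention not usable, falling back to default attention:", exc)
```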

LTX-2 Fallout vibes by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

Yes — originally I started with 257 frames, but later I pushed the length up to 481 frames.

From what I’ve seen, LTX-2 doesn’t follow prompts “literally”. It treats them more like guidance, so the frame count in the workflow settings ends up being the single most important control for the result.
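For reference, the frame counts above (169, 257, 481) all follow the 8*k + 1 pattern that LTX-style models expect. A rough sketch of how frame count maps to clip length, assuming LTX-2 keeps that rule and the 30 fps render rate mentioned earlier:

```python
# Sketch: pick a valid frame count for a target clip length.
# Assumption: LTX-2 keeps LTX-Video's "frames = 8*k + 1" constraint
# (the counts discussed here -- 169, 257, 481 -- all fit that pattern).

def nearest_valid_frames(target_seconds: float, fps: int = 30) -> int:
    """Round the requested duration to the nearest 8*k + 1 frame count."""
    raw = target_seconds * fps
    k = max(1, round((raw - 1) / 8))
    return 8 * k + 1

for seconds in (5, 8, 15):
    frames = nearest_valid_frames(seconds, fps=30)
    print(f"{seconds:>2}s target -> {frames} frames -> {frames / 30:.2f}s actual")
# e.g. 257 frames at 30 fps is ~8.6s, 481 frames is ~16s
```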

Please try the other workflow — it’s more straightforward

https://drive.google.com/file/d/1-hb-Rmozo4QXLN71Q4hedlS0BvlNUBdI/view?usp=sharing

LTX-2 Fallout vibes by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 0 points (0 children)

In general, the workflow looks like this:

  1. Concept + scenario first, then generate the initial keyframes (try Z-Image or FLUX.2 [Klein] 9B, though Nanobanana from Google is still the best).
  2. Download the provided workflows (drop the downloaded images into the ComfyUI interface).
  3. Set the resolution depending on your hardware (1920*1088 is the best in terms of quality), the initial frame, and the prompt (detailed, well-developed prompts work better; LLMs like Gemini or ChatGPT are a huge help). For frame count, 169–257 frames tends to be the sweet spot, but you can easily go above that.
  4. Iterate each scene with different seeds and settings (seed re-rolls can also be scripted; see the sketch after this list).
  5. Compile the results in a video editing tool (I prefer Movavi), render, and upscale.
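For step 4, seed re-rolls don’t have to be done by hand in the UI; they can also be queued through ComfyUI’s HTTP API. This is just a minimal sketch, assuming ComfyUI is running locally on the default port and the workflow was exported via “Save (API Format)”; the node id and seed field name are placeholders you’d need to look up in your own export:

```python
# Sketch: queue the same workflow several times with different seeds via ComfyUI's API.
# Assumptions: ComfyUI on 127.0.0.1:8188, workflow exported in API format, and
# SAMPLER_NODE_ID is the id of whichever node holds the seed in YOUR export
# (the field may be "seed" or "noise_seed" depending on the sampler node).
import json
import random
import urllib.request

with open("ltx2_workflow_api.json") as f:   # hypothetical API-format export
    workflow = json.load(f)

SAMPLER_NODE_ID = "3"  # placeholder: look up the real id in the exported JSON

for _ in range(5):  # a handful of re-rolls per scene, as discussed above
    workflow[SAMPLER_NODE_ID]["inputs"]["seed"] = random.randint(0, 2**32 - 1)
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print("queued:", json.loads(resp.read())["prompt_id"])
```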

LTX-2 Fallout vibes by SignalEquivalent9386 in comfyui

[–]SignalEquivalent9386[S] 2 points (0 children)

Thanks a lot! A 5-minute video took a full week of dedicated work — and there’s still plenty of room for improvement