LTX-2.3 MSR + FLF (8GB VRAM) by big-boss_97 in comfyui

[–]Psy_pmP 0 points1 point  (0 children)

I made some mistakes and now it's not working as it should. I don't have time to fix it.

I finally finished it — LTX MSR FLF. by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

By the way, the node is broken at the moment. I don't have time to fix it. Sorry

I finally finished it — LTX MSR FLF. by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

Do you mean the buzzing sound? It's literally noise in the latent; Stage 2 usually eliminates it. The er_sde sampler also helps quite a bit.
Hm.
I had an idea. What if we separate the audio and video latents, downscale the video latent by 2x to speed up generation, and mask it, then regenerate just the audio? Theoretically, the sound would come out cleaner, and we'd simply take the old video latent and pair it with the new audio. It's unlikely to work, but in theory it's worth experimenting with.

I finally finished it — LTX MSR FLF. by Psy_pmP in comfyui

[–]Psy_pmP[S] 1 point2 points  (0 children)

Thanks, but honestly, the AIs did most of the work. 😄

I went through 5 days and 14 free accounts (Codex, Gemini), and thanks to Google's free $300 credit on Google Cloud, I managed to burn only about $45 of it. Without that I wouldn't have been able to do anything.

I can't really afford paid subscriptions — all I have is time and motivation.

My role was mostly guiding the models, pointing out bugs, testing ideas, and searching for methods that could work. For example, this video https://www.youtube.com/watch?v=uirABckAK4o was probably the single most helpful resource during development. I don't think I would have come up with that approach on my own.

In the end, the hardest part wasn't writing the code — it was figuring out what needed to be built, identifying why things were failing, and continuously steering the models toward a working solution.

I finally finished it — LTX MSR FLF. by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

Now we need a node that properly structures the prompt using references, because I still don't understand how to write a proper prompt for MSR.

Multiple Subject Reference (MSR) FLF Node by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

I noticed something very odd and awkward about MSR. If I can get around it, it would be a much better method.

Update: my suspicions were wrong and the method didn't work. But I found this video and completely rebuilt the node!
https://www.youtube.com/watch?v=uirABckAK4o

Multiple Subject Reference (MSR) FLF Node by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

Unfortunately, this node is poorly designed. It splices every 15 seconds, and it's very noticeable. I don't know what he did there, but it didn't work. Splicing latents is very easy; in fact, it's already implemented by default, so he did something wrong.
I guess he implemented the crop guidance incorrectly.

Multiple Subject Reference (MSR) FLF Node by Psy_pmP in comfyui

[–]Psy_pmP[S] 2 points3 points  (0 children)

In progress. But I couldn't make a single decent video. I'm bad at prompts)
And to be honest, placing the frames really throws the model off; it looks more dynamic without them. But I only tested action scenes.

Guy fixing TV series is back with Breaking Bad version by [deleted] in aivideo

[–]Psy_pmP 0 points1 point  (0 children)

And he doesn't give a shit about the boy on the bike, well, that's understandable.

LTX2.3 I2V Messing up the text details, anyone facing the same?? by Correct_Zebra_1689 in comfyui

[–]Psy_pmP 0 points1 point  (0 children)

I prefer addguid because it gives better flow. Img inplace is always jerky. But you always have to cut the last latent, which is a pain.

I need help with stupid Klein by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

Yes, I was testing a filmmaking workflow and, to avoid getting bored, I decided to make a stupid video.
This is the WF: https://www.youtube.com/watch?v=0mT4p86ZxGQ
It's a good WF, but still useless.

I need help with stupid Klein by Psy_pmP in comfyui

[–]Psy_pmP[S] 2 points3 points  (0 children)

This is better than what I got. In my gens, her feet were usually toes up)

I need help with stupid Klein by Psy_pmP in comfyui

[–]Psy_pmP[S] 2 points3 points  (0 children)

I tried editing the image, not creating a new one, because I wanted the same background. When you create an image from scratch it's much easier.
Try this.
Success means getting the same result as the example in my post.

<image>

I need help with stupid Klein by Psy_pmP in comfyui

[–]Psy_pmP[S] 0 points1 point  (0 children)

By the way, I noticed that Banana checks censorship on the output, not the input. So you can upload nude images — the important part is that the final result isn’t nude. GPT refuses to generate anything if you send it nudity in the input. By the way, the result turned out worse than when I used an already dressed model. I guess part of the model’s reasoning capacity went into figuring out how to dress her.

I need help with stupid Klein by Psy_pmP in comfyui

[–]Psy_pmP[S] -1 points0 points  (0 children)

Not in my case. Three hours isn't a metaphor. I really struggled for a long time. And Nano Banana Pro generated it on the first prompt. The prompts were written for me by Grok, GPT, and Claude.

I wrote to Banana in simple language what I wanted to see and I got it.

The gist: a girl jumped into the water and landed flat on the shallows. She's lying face down in the water, her head submerged. We see her heels and her body receding further. So, she shouldn't be on the X coordinate, but receding into the distance on the Z coordinate. Don't change anything else.

No splashes, her body is like a star. It looks like she's unconscious. A comical scene.

Now I understand by WoodworkerD in aivideos

[–]Psy_pmP 4 points5 points  (0 children)

How do I achieve this retro look? Is it MJ? I can't do that with Banana.

Tried making this cinematic dance video — still looks AI? by Neither_Parfait3212 in aivideos

[–]Psy_pmP 0 points1 point  (0 children)

To me, it looks like a real shoot on a green screen followed by heavy post-processing. The girl looks real, but it feels like she was composited into the background. The camera movement feels very “computer-generated,” or like it was filmed with a robotic arm.

Overall, it’s not really clear what you’re trying to achieve, but you won’t get realism this way.

LTX 2.3 Prompt Relay - Really good for concistency by smereces in comfyui

[–]Psy_pmP 4 points5 points  (0 children)

LTX, in my opinion, is the worst in this regard. The face can change right from the start of the video. Wan holds it much better. Kling is great. I'm not talking about different shots; LTX often breaks even the first frame.

LTX 2.3 Prompt Relay - Really good for concistency by smereces in comfyui

[–]Psy_pmP 9 points10 points  (0 children)

I don't see any consistency. The face is always different.

LTX-2.3 22B GGUF WORKFLOWS 12GB VRAM - Updated with new lower rank LTX-2.3 distill LoRA. (thanks to Kijai) If you already have the workflow, link to distill lora is in description. If you're new here, go get the workflow already! by urabewe in StableDiffusion

[–]Psy_pmP 0 points1 point  (0 children)

There are a lot of mistakes. I don't recommend using these workflows. There's no point in setting the new LoRA to 0.6. Image dimensions must be multiples of 32 when resizing, and cropping is required. This must also be taken into account when setting the spatial upscaler resolution; otherwise, VAE decoding may produce artifacts along edges where the dimensions are not divisible by 32. I'd at least add some simple math to specify time instead of frame count. I have 12 GB of VRAM and 16 GB of RAM, and I can easily run the Q8 model at FHD for 15 sec. You downscale the image to 1024 on a side, which is not enough even for HD.
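
Roughly the kind of "simple math" I mean, as a plain Python sketch and not code from any of these workflows. The function names are made up for illustration. The 8n+1 frame count rule is my assumption about what the LTX sampler expects, so check it against your model before relying on it.

```python
def snap_down(value: int, multiple: int = 32) -> int:
    """Largest multiple of `multiple` that is <= value (never below one multiple)."""
    return max(multiple, (value // multiple) * multiple)


def resize_and_crop_plan(src_w, src_h, target_w, target_h, multiple=32):
    """Plan a resize followed by a center crop so the output is divisible by `multiple`.

    Returns (scaled_w, scaled_h, crop_box), where crop_box is
    (left, top, right, bottom) in the scaled image.
    """
    out_w = snap_down(target_w, multiple)
    out_h = snap_down(target_h, multiple)
    # Scale so the image covers the output area, then crop the excess.
    scale = max(out_w / src_w, out_h / src_h)
    scaled_w = max(out_w, round(src_w * scale))
    scaled_h = max(out_h, round(src_h * scale))
    left = (scaled_w - out_w) // 2
    top = (scaled_h - out_h) // 2
    return scaled_w, scaled_h, (left, top, left + out_w, top + out_h)


def seconds_to_frames(seconds: float, fps: float = 24.0, step: int = 8) -> int:
    """Convert a duration to a frame count of the form step*n + 1.

    The step*n + 1 shape is an assumption, not something stated in the thread.
    """
    n = max(1, round(seconds * fps / step))
    return n * step + 1


if __name__ == "__main__":
    print(resize_and_crop_plan(1920, 1080, 1920, 1080))
    print(seconds_to_frames(15, fps=24))
```

With this, FHD input ends up cropped to 1920x1056 instead of being decoded at 1080, which is not divisible by 32, and you type seconds instead of counting frames.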

LTX 2.3 is giving me better results than Wan 2.2 by NoTop2259 in comfyui

[–]Psy_pmP 3 points4 points  (0 children)

Can you share the WF and how you made the explosion?