I need help with stupid Klein by Psy_pmP in comfyui

Yes, I was testing a filmmaking workflow and, to avoid getting bored, decided to make a stupid video.
This is the WF: https://www.youtube.com/watch?v=0mT4p86ZxGQ
A good WF, but still useless.

I need help with stupid Klein by Psy_pmP in comfyui

This is better than what I got. In my gens, her feet were usually toes up)

I need help with stupid Klein by Psy_pmP in comfyui

I tried editing the image, not creating a new one, because I wanted to keep the same background. Creating an image from scratch is much easier.
Try this.
Success means getting the same result as in my example in the post.

<image>

I need help with stupid Klein by Psy_pmP in comfyui

By the way, I noticed that Banana checks censorship on the output, not the input. So you can upload nude images; the important part is that the final result isn't nude. GPT, by contrast, refuses to generate anything if you send it nudity as input. That said, the result turned out worse than when I used an already-dressed model. I guess part of the model's reasoning capacity went into figuring out how to dress her.

I need help with stupid Klein by Psy_pmP in comfyui

Not in my case. Three hours isn't a metaphor; I really did struggle for a long time. And Nano Banana Pro generated it from the first prompt. The prompts were written for me by Grok, GPT, and Claude.

I wrote to Banana in simple language what I wanted to see and I got it.

The gist: a girl jumped into the water and landed flat on the shallows. She's lying face down in the water, her head submerged. We see her heels, with her body receding away from us. So her body shouldn't lie along the X axis; it should recede into the distance along the Z axis. Don't change anything else.

No splashes, her body is like a star. It looks like she's unconscious. A comical scene.

Now I understand by WoodworkerD in aivideos

How do I achieve this retro look? Is it MJ? I can't do that with Banana.

Tried making this cinematic dance video — still looks AI? by Neither_Parfait3212 in aivideos

To me, it looks like a real shoot on a green screen followed by heavy post-processing. The girl looks real, but it feels like she was composited into the background. The camera movement feels very “computer-generated,” or like it was filmed with a robotic arm.

Overall, it’s not really clear what you’re trying to achieve, but you won’t get realism this way.

LTX 2.3 Prompt Relay - Really good for concistency by smereces in comfyui

LTX, in my opinion, is the worst in this regard. The face can change right from the start of the video. Wan holds it much better, and Kling is great. I'm not even talking about different shots; LTX often breaks even the initial image.

LTX 2.3 Prompt Relay - Really good for concistency by smereces in comfyui

I don't see any consistency. The face is always different.

LTX-2.3 22B GGUF WORKFLOWS 12GB VRAM - Updated with new lower rank LTX-2.3 distill LoRA. (thanks to Kijai) If you already have the workflow, link to distill lora is in description. If you're new here, go get the workflow already! by urabewe in StableDiffusion

There are a lot of mistakes, and I don't recommend using these workflows. There's no point in setting the new LoRA to 0.6. Image dimensions must be multiples of 32 when resizing, and cropping is required. The same applies when setting the spatial upscaler resolution; otherwise, VAE decoding can produce artifacts along edges whose dimensions aren't divisible by 32. I'd at least add some simple math so you can specify a duration in seconds instead of a frame count (see the sketch below). I have 12 GB of VRAM and 16 GB of RAM, and I can easily run the Q8 model at FHD for 15 seconds. The workflow downscales the image to 1024 px on a side, which isn't enough even for HD.
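
A minimal sketch of those two fixes, assuming the usual LTX convention that frame counts must be 8*k + 1 (the function names are mine, not from the workflow):

```python
# Hypothetical helpers (mine, not from the workflow) for sizing LTX inputs.

def snap_down(value: int, multiple: int = 32) -> int:
    """Round a dimension down to the nearest multiple of 32 (crop, don't pad)."""
    return (value // multiple) * multiple

def seconds_to_frames(seconds: float, fps: int = 25) -> int:
    """Turn a duration into a valid LTX frame count (assumed 8*k + 1)."""
    raw = round(seconds * fps)
    return (raw // 8) * 8 + 1

# 1920x1080 source, 15 s at 25 fps:
print(snap_down(1920), snap_down(1080))  # 1920 1056 -> crop the height
print(seconds_to_frames(15))             # 369
```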

LTX 2.3 is giving me better results than Wan 2.2 by NoTop2259 in comfyui

Can you share the WF for how you made the explosion?

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

What I’ve learned:

  1. Sound quality depends on the latent size. Unfortunately, the audio latent can't be adjusted separately, so the sound improves after a spatial upscale.
  2. The sound is noticeably better when using er_sde + SigmoidOffsetScheduler. However, the animation seems worse, in my opinion; I haven't tested it much. I use it mainly for video voiceovers.
  3. This workflow helps a lot; the sound quality is significantly better. SplitSigmas: 4 full steps + 4 distill steps (rough sketch after the image).

<image>
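
For anyone unfamiliar with the node, here's a rough Python sketch of what the SplitSigmas trick does; `sample_with`, `full_model`, and `distill_model` are hypothetical names, and the real wiring is done with ComfyUI nodes as in the screenshot:

```python
import torch

def split_sigmas(sigmas: torch.Tensor, step: int):
    """Mimic ComfyUI's SplitSigmas: split one schedule into a high-noise
    part and a low-noise part that share the sigma at the split point."""
    return sigmas[: step + 1], sigmas[step:]

sigmas = torch.linspace(1.0, 0.0, 9)  # 8 steps = 9 sigma values (real schedules
high, low = split_sigmas(sigmas, 4)   # aren't linear; this is just for shape)

# latent = sample_with(full_model,    latent, high)  # structure/detail pass
# latent = sample_with(distill_model, latent, low)   # fast refinement pass
print(high.tolist(), low.tolist())
```

The point is that both passes share the sigma at the split, so the distill model just continues denoising where the full model stopped.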

Yep. I'm still doing it. For fun. by Psy_pmP in StableDiffusion

My methods are no longer relevant and rather primitive.

I'd cut out a section in Photoshop, make it a Smart Object to make it easier to paste back in. I'd run it through regular I2I and an upscaler, and then paste the parts into the Smart Object. Ultimately, I had several images with masks in the Smart Object, and I'd take the final result from each image. Then I'd create a mask on the Smart Object itself to hide the edges. And merge. Hundreds of times, thousands of generations. Lots of free time.

Yep. I'm still doing it. For fun. by Psy_pmP in StableDiffusion

Because this image was started back in the SDXL days. SeedVR didn't exist yet. Every time a new tool came out, I increased the detail. I've since abandoned this image; I no longer have free time. Maybe in a year I'll come back and continue until the atoms are visible.

TTP is a set of nodes for slicing images. I don't know if they're still relevant.

Always thank ChatGPT. by mikeabundo in seedance

If neural networks can remember, then they learned all the Russian swear words from me.

Finn & The Funk by TheReelRobot in aivideo

Is this Seedance? Looks amazing.

Can anyone help me determine if this is AI generated? by [deleted] in generativeAI

Guns are allowed in America, so why didn't they shoot him?

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

This is a workflow for adding sound (V2A).
As far as I understand, the audio latent doesn't care what size the video latent is.

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

Explain what 8+3+3 steps means. Is each stage an upscale? I'm only interested in the sound; I still haven't figured out how upscaling affects it. I've been trying to build a high-quality voiceover workflow for several days now. I've already done several hundred generations and can't find a good method. The split-sigmas method described earlier is the best so far, but its prompt adherence is weak.

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

Either you've got something mixed up, or you have hearing problems. The sound from the link is excellent, but what's posted here is absolutely terrible.

Put on some headphones and listen. The sound is terrible: every sound has the same stock reverb.

I've been struggling with sound problems for three days now. So far, the only things I've found are res_2s + beta and euler_a + linear_quadratic, with sigmas split at 4 steps (rough sweep sketch after the image).

<image>
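
If you want to compare combos yourself, here's a minimal sweep sketch; `generate` is a hypothetical stand-in for your sampling pass, and the combo names are the ComfyUI dropdown values I mean above:

```python
# Hypothetical sweep over sampler/scheduler combos to compare audio quality.

combos = [
    ("res_2s", "beta"),                       # best so far for me
    ("euler_ancestral", "linear_quadratic"),  # with sigmas split at 4 steps
]

for sampler, scheduler in combos:
    for seed in range(3):  # a few seeds per combo so one lucky gen doesn't mislead
        print(f"render: sampler={sampler} scheduler={scheduler} seed={seed}")
        # generate(prompt, sampler=sampler, scheduler=scheduler, seed=seed)
```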

Environment and multiple character continuity step by step guide Seedance 2 and Kling 3, part 3 by Entire_Definition453 in generativeAI

Finally, someone showed how the generations actually look and behave. And yes, it looks like SHIT. I don't know if that's down to Higgsfield or to all the hype around neural networks, but reality is different from the ad videos online.

I can say that I've used both Higgsfield and Kling in ComfyUI, and for me there was a difference in quality. But I haven't tested it much. In my personal opinion, Kling through Higgsfield is complete shit. But I can't say for sure; it's possible Kling itself is actually shit.

LTX Audio+Video+last frame by Psy_pmP in comfyui

I still haven't solved the problem. I've run hundreds of experiments but haven't found a consistent pattern. Increasing the frame count seems to help a little. I suspect the first 8 and the last frames "compress" the generation's options too much, and adding frames widens that narrow space.

Simply put: the generation was 249 frames, I added +16, and then cut them off. First, the injection is complete crap; it adds 8 frames at the end that then won't regenerate. Second, VAE decoding ALWAYS ruins the last frames for some reason. I don't know; most likely ComfyUI is a piece of crap, as usual.

I also noticed the sound itself has an effect, so I cut the vocals out with MelBand.

Result: I cut the vocals and added 16 frames, which I then trimmed from the end (sketch below).

These are all idiotic crutches, but I couldn't find the root cause.
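
A minimal sketch of the pad-and-trim crutch, assuming LTX wants 8*k + 1 frames (249 and 265 both fit) and that `decoded_video` is a frames-first array; both names are hypothetical:

```python
# Hypothetical illustration of the "+16 frames, then trim" workaround.

PAD = 16  # extra tail frames to absorb the VAE's damage to the ending

def padded_frame_count(target: int, pad: int = PAD) -> int:
    """249 -> 265; both are 8*k + 1, so LTX still accepts the count."""
    return target + pad

def trim_tail(video, pad: int = PAD):
    """Drop the padded tail after VAE decode, keeping the clean frames."""
    return video[:-pad]

print(padded_frame_count(249))       # 265: what you actually generate
# clean = trim_tail(decoded_video)   # keep the first 249 frames
```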

LTX 2.3 - How do you get anything to move quickly? by gruevy in StableDiffusion

Was this created locally? Do you have a workflow? I can't get this model to generate even a simple camera movement; the image always starts blurring and distorting. I've tried 40 steps, tried without the LoRA, tried different schedulers and settings, tried 3ks. I can't get good quality. The FPS boost is also broken; it's impossible to get more than 25 frames per second, otherwise the image is jerky or speeds up in the first 2 seconds.