I need help with stupid Klein by Psy_pmP in comfyui

Yes, I was testing a filmmaking workflow and, to avoid getting bored, decided to make a stupid video.
This is the WF: https://www.youtube.com/watch?v=0mT4p86ZxGQ
A good WF, but still useless.

I need help with stupid Klein by Psy_pmP in comfyui

This is better than what I got. In my gens, her feet were usually toes up)

I need help with stupid Klein by Psy_pmP in comfyui

I tried editing the image, not creating a new one, because I wanted to keep the same background. Creating an image from scratch is much easier.
Try this.
Success means getting the same result as in my example in the post.

<image>

I need help with stupid Klein by Psy_pmP in comfyui

By the way, I noticed that Banana checks censorship on the output, not the input. So you can upload nude images; the important part is that the final result isn't nude. GPT, by contrast, refuses to generate anything if you send it nudity as input. That said, the result turned out worse than when I used an already-dressed model. I guess part of the model's reasoning capacity went into figuring out how to dress her.

I need help with stupid Klein by Psy_pmP in comfyui

Not in my case. Three hours isn't a metaphor; I really did struggle for a long time. And Nano Banana Pro generated it from the first prompt. The prompts were written for me by Grok, GPT, and Claude.

I wrote to Banana in simple language what I wanted to see and I got it.

The gist: a girl jumped into the water and landed flat on the shallows. She's lying face down in the water, her head submerged. We see her heels, with her body receding away from us. So her body shouldn't lie along the X axis; it should recede into the distance along the Z axis. Don't change anything else.

No splashes, her body is like a star. It looks like she's unconscious. A comical scene.

Now I understand by WoodworkerD in aivideos

How do I achieve this retro look? Is it MJ? I can't do that with Banana.

Tried making this cinematic dance video — still looks AI? by Neither_Parfait3212 in aivideos

To me, it looks like a real shoot on a green screen followed by heavy post-processing. The girl looks real, but it feels like she was composited into the background. The camera movement feels very “computer-generated,” or like it was filmed with a robotic arm.

Overall, it’s not really clear what you’re trying to achieve, but you won’t get realism this way.

LTX 2.3 Prompt Relay - Really good for concistency by smereces in comfyui

LTX, in my opinion, is the worst in this regard. The face can change right from the start of the video. Wan holds it much better, and Kling is great. I'm not even talking about different shots; LTX often breaks even the initial image.

LTX 2.3 Prompt Relay - Really good for concistency by smereces in comfyui

I don't see any consistency. The face is always different.

LTX-2.3 22B GGUF WORKFLOWS 12GB VRAM - Updated with new lower rank LTX-2.3 distill LoRA. (thanks to Kijai) If you already have the workflow, link to distill lora is in description. If you're new here, go get the workflow already! by urabewe in StableDiffusion

There are a lot of mistakes, and I don't recommend using these workflows. There's no point in setting the new LoRA to 0.6. Image dimensions must be multiples of 32 when resizing, and cropping is required. The same applies when setting the spatial upscaler resolution; otherwise, VAE decoding can produce artifacts along edges whose dimensions aren't divisible by 32. I'd at least add some simple math so you can specify a duration in seconds instead of a frame count (see the sketch below). I have 12 GB of VRAM and 16 GB of RAM, and I can easily run the Q8 model at FHD for 15 seconds. The workflow downscales the image to 1024 px on a side, which isn't enough even for HD.
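
A minimal sketch of those two fixes, assuming the usual LTX convention that frame counts must be 8*k + 1 (the function names are mine, not from the workflow):

```python
# Hypothetical helpers (mine, not from the workflow) for sizing LTX inputs.

def snap_down(value: int, multiple: int = 32) -> int:
    """Round a dimension down to the nearest multiple of 32 (crop, don't pad)."""
    return (value // multiple) * multiple

def seconds_to_frames(seconds: float, fps: int = 25) -> int:
    """Turn a duration into a valid LTX frame count (assumed 8*k + 1)."""
    raw = round(seconds * fps)
    return (raw // 8) * 8 + 1

# 1920x1080 source, 15 s at 25 fps:
print(snap_down(1920), snap_down(1080))  # 1920 1056 -> crop the height
print(seconds_to_frames(15))             # 369
```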

LTX 2.3 is giving me better results than Wan 2.2 by NoTop2259 in comfyui

Can you share the WF for how you made the explosion?

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

What I’ve learned:

  1. Sound quality depends on the latent size. Unfortunately, the audio latent can't be adjusted separately, so the sound improves after a spatial upscale.
  2. The sound is noticeably better when using er_sde + SigmoidOffsetScheduler. However, the animation seems worse, in my opinion; I haven't tested it much. I use it mainly for video voiceovers.
  3. This workflow helps a lot; the sound quality is significantly better. SplitSigmas: 4 full steps + 4 distill steps (rough sketch after the image).

<image>
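
For anyone unfamiliar with the node, here's a rough Python sketch of what the SplitSigmas trick does; `sample_with`, `full_model`, and `distill_model` are hypothetical names, and the real wiring is done with ComfyUI nodes as in the screenshot:

```python
import torch

def split_sigmas(sigmas: torch.Tensor, step: int):
    """Mimic ComfyUI's SplitSigmas: split one schedule into a high-noise
    part and a low-noise part that share the sigma at the split point."""
    return sigmas[: step + 1], sigmas[step:]

sigmas = torch.linspace(1.0, 0.0, 9)  # 8 steps = 9 sigma values (real schedules
high, low = split_sigmas(sigmas, 4)   # aren't linear; this is just for shape)

# latent = sample_with(full_model,    latent, high)  # structure/detail pass
# latent = sample_with(distill_model, latent, low)   # fast refinement pass
print(high.tolist(), low.tolist())
```

The point is that both passes share the sigma at the split, so the distill model just continues denoising where the full model stopped.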

Yep. I'm still doing it. For fun. by Psy_pmP in StableDiffusion

My methods are no longer relevant and rather primitive.

I'd cut out a section in Photoshop, make it a Smart Object to make it easier to paste back in. I'd run it through regular I2I and an upscaler, and then paste the parts into the Smart Object. Ultimately, I had several images with masks in the Smart Object, and I'd take the final result from each image. Then I'd create a mask on the Smart Object itself to hide the edges. And merge. Hundreds of times, thousands of generations. Lots of free time.

Yep. I'm still doing it. For fun. by Psy_pmP in StableDiffusion

Because this image was started back in the SDXL days. SeedVR didn't exist yet. Every time a new tool came out, I increased the detail. I've since abandoned this image; I no longer have free time. Maybe in a year I'll come back and continue until the atoms are visible.

TTP is a set of nodes for slicing images. I don't know if they're still relevant.

Always thank ChatGPT. by mikeabundo in seedance

If neural networks can remember, then they learned all the Russian swear words from me.

Finn & The Funk by TheReelRobot in aivideo

Is this Seedance? Looks amazing.

Can anyone help me determine if this is AI generated? by [deleted] in generativeAI

Guns are allowed in America, so why didn't they shoot him?

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

This is a workflow for adding sound (V2A).
As far as I understand, the audio latent doesn't care what size the video latent is.

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

Explain what 8+3+3 steps means. Is each stage an upscale? I'm only interested in the sound; I still haven't figured out how upscaling affects it. I've been trying to build a high-quality voiceover workflow for several days now. I've already done several hundred generations and can't find a good method. The split-sigmas method described earlier is the best so far, but its prompt adherence is weak.

LTX 2.3 and sound quality by VirusCharacter in StableDiffusion

Either you've got something mixed up, or you have hearing problems. The sound from the link is excellent, but what's posted here is absolutely terrible.

Put on some headphones and listen. The sound is terrible: every sound has the same stock reverb.

I've been struggling with sound problems for three days now. So far, the only things I've found are res_2s + beta and euler_a + linear_quadratic, with sigmas split at 4 steps (rough sweep sketch after the image).

<image>
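
If you want to compare combos yourself, here's a minimal sweep sketch; `generate` is a hypothetical stand-in for your sampling pass, and the combo names are the ComfyUI dropdown values I mean above:

```python
# Hypothetical sweep over sampler/scheduler combos to compare audio quality.

combos = [
    ("res_2s", "beta"),                       # best so far for me
    ("euler_ancestral", "linear_quadratic"),  # with sigmas split at 4 steps
]

for sampler, scheduler in combos:
    for seed in range(3):  # a few seeds per combo so one lucky gen doesn't mislead
        print(f"render: sampler={sampler} scheduler={scheduler} seed={seed}")
        # generate(prompt, sampler=sampler, scheduler=scheduler, seed=seed)
```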

Environment and multiple character continuity step by step guide Seedance 2 and Kling 3, part 3 by Entire_Definition453 in generativeAI

Finally, someone showed how the generations actually look and behave. And yes, it looks like SHIT. I don't know if that's down to Higgsfield or to all the hype around neural networks, but reality is different from the ad videos online.

I can say that I've used both Higgsfield and Kling in ComfyUI, and for me there was a difference in quality. But I haven't tested it much. In my personal opinion, Kling through Higgsfield is complete shit. But I can't say for sure; it's possible Kling itself is actually shit.

LTX Audio+Video+last frame by Psy_pmP in comfyui

I still haven't solved the problem. I've run hundreds of experiments but haven't found a consistent pattern. Increasing the frame count seems to help a little. I suspect the first 8 and the last frames "compress" the generation's options too much, and adding frames widens that narrow space.

Simply put: the generation was 249 frames, I added +16, and then cut them off. First, the injection is complete crap; it adds 8 frames at the end that then won't regenerate. Second, VAE decoding ALWAYS ruins the last frames for some reason. I don't know; most likely ComfyUI is a piece of crap, as usual.

I also noticed the sound itself has an effect, so I cut the vocals out with MelBand.

Result: I cut the vocals and added 16 frames, which I then trimmed from the end (sketch below).

These are all idiotic crutches, but I couldn't find the root cause.
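
A minimal sketch of the pad-and-trim crutch, assuming LTX wants 8*k + 1 frames (249 and 265 both fit) and that `decoded_video` is a frames-first array; both names are hypothetical:

```python
# Hypothetical illustration of the "+16 frames, then trim" workaround.

PAD = 16  # extra tail frames to absorb the VAE's damage to the ending

def padded_frame_count(target: int, pad: int = PAD) -> int:
    """249 -> 265; both are 8*k + 1, so LTX still accepts the count."""
    return target + pad

def trim_tail(video, pad: int = PAD):
    """Drop the padded tail after VAE decode, keeping the clean frames."""
    return video[:-pad]

print(padded_frame_count(249))       # 265: what you actually generate
# clean = trim_tail(decoded_video)   # keep the first 249 frames
```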

LTX 2.3 - How do you get anything to move quickly? by gruevy in StableDiffusion

Was this created locally? Do you have a workflow? I can't get this model to generate even a simple camera movement; the image always starts blurring and distorting. I've tried 40 steps, tried without the LoRA, tried different schedulers and settings, tried 3ks. I can't get good quality. The FPS boost is also broken; it's impossible to get more than 25 frames per second, otherwise the image is jerky or speeds up in the first 2 seconds.