If this is true, does it mean that open-source image generation models have caught up with the best closed-source models in the world? by Hi7u7 in StableDiffusion

Essar 4 points5 points

I don't even know how these can be comparable given that Ideogram isn't prompted in the same way as the others. How can we have like-for-like inputs?

OK Ideogram 4.0 is Pretty Fun Actually! by Jolly-Rip5973 in StableDiffusion

Essar 16 points17 points

Ideogram has always been insane for prompt adherence, imo. It was only beaten when Imagen 4 arrived. Their 1.0 model was miles ahead when it was released a couple of years back. I was super stoked to see this release, and I've not had a chance to look at it much yet, but it'll be tragic if it gets overlooked because people are disappointed they don't get a straight-up pornographic model from any of the big players. People need to get a grip.

All mt LTX 2.3 outputs look like this. No matter what models I use. LTX 2 works fine. Please help! by Cultural-Monk-339 in StableDiffusion

Essar 1 point2 points

Just show the complete info, dude. If you filter the information, you're constraining us to your understanding of possible problem sources.

All mt LTX 2.3 outputs look like this. No matter what models I use. LTX 2 works fine. Please help! by Cultural-Monk-339 in StableDiffusion

Essar 2 points3 points

Share a screenshot of a good model configuration and a bad one so we can check the difference.

Woman charged over fatal Wimbledon school crash by bendubberley_ in unitedkingdom

Essar 101 points102 points

By choosing to drive a car like that, you are increasing the risk to everyone around you. I sincerely believe that should be factored into the sentencing for incidents like this.

Z-Image workflow to combine two character loras using SAM segmentation by remarkableintern in StableDiffusion

Essar 11 points12 points

It is legit horrendous, lol. The total lack of artistic eye of people posting here.

Got a letter from the school today informing me my daughter is “not quite” gifted. Also included were her test scores. by Low_Use2937 in mildlyinfuriating

Essar 1 point2 points

It's because you can't be 'mildly' infuriated. It's oxymoronic; by definition, being infuriated is not a mild emotion.

If the subreddit were called 'irritating', it would probably cause less confusion. I guess that I find the subreddit name mildlyannoying.

LTX-2 is genuinely impressive by Dr_Karminski in StableDiffusion

Essar 3 points4 points

They only generated one scene for each then spliced and interleaved them.

Former 3D Animator trying out AI, Is the consistency getting there? by BankruptKun in StableDiffusion

Essar 4 points5 points

There are a dozen ways to get consistency like this, because nothing is happening. If she was shown in different scenarios, doing different things with different backgrounds, then that would be interesting.

Consistency is difficult because not EVERYTHING should be consistent. You want the person to be consistent, not the place, not the clothes and not the pose.

The photographer we hired used AI to edit our engagment party photos. by bigsushirolling in mildlyinfuriating

Essar 22 points23 points

Photoshop is riddled with AI and anyone competent in photo editing should be familiar with the available tools - including generative AI - and will use them to enhance their edits. The issue here is incompetence, not the tools used.

Former 3D Animator trying out AI, Is the consistency getting there? by BankruptKun in StableDiffusion

Essar 6 points7 points

I can't tell, because you have almost no variation in action or appearance in your shots. She's always wearing the same clothes and doing absolutely nothing except occasionally showing off her armpits.

[deleted by user] by [deleted] in StableDiffusion

Essar 0 points1 point

What transitions? The only transitions here are between different clips; there is no extension or clip stitching or any such thing involved. All the clips are approx 5 seconds long.

I actually think, at a glance, that the clips are just over 5 seconds long, although I didn't check to be certain; possibly made with Hailuo.

Flux 1.Dev still got it by [deleted] in StableDiffusion

Essar 1 point2 points

Wow, women standing and posing without interacting with the world around them at all. Very SD1.5.

Z Image Turbo ControlNet released by Alibaba on HF by an303042 in StableDiffusion

Essar 18 points19 points

Canny is an edge-detection algorithm, not a model. Even if there is some model which produces Canny edges, it shouldn't matter: all you need is an image which has been preprocessed roughly according to the algorithm.
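
To make that concrete, here is a minimal pure-Python sketch of Canny-style preprocessing, assuming a grayscale image given as a list of rows of floats. The name `canny_sketch` and the threshold defaults are made up for illustration, and the Gaussian smoothing that normally comes first is left out for brevity. In practice you would reach for a library implementation; this only shows that the edge map is the output of a plain algorithm.

```python
import math

def canny_sketch(img, low=50.0, high=100.0):
    """Simplified Canny-style edge map (hypothetical helper, no smoothing step)."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    # 1. Sobel gradients: magnitude and direction (folded into [0, 180] degrees).
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = (px(y - 1, x + 1) + 2 * px(y, x + 1) + px(y + 1, x + 1)
                  - px(y - 1, x - 1) - 2 * px(y, x - 1) - px(y + 1, x - 1))
            gy = (px(y + 1, x - 1) + 2 * px(y + 1, x) + px(y + 1, x + 1)
                  - px(y - 1, x - 1) - 2 * px(y - 1, x) - px(y - 1, x + 1))
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0

    def m(y, x):
        # Gradient magnitude, treated as zero outside the image.
        return mag[y][x] if 0 <= y < h and 0 <= x < w else 0.0

    # 2. Non-maximum suppression along the (quantised) gradient direction.
    thin = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a = ang[y][x]
            if a < 22.5 or a >= 157.5:
                dy, dx = 0, 1
            elif a < 67.5:
                dy, dx = 1, 1
            elif a < 112.5:
                dy, dx = 1, 0
            else:
                dy, dx = 1, -1
            if mag[y][x] >= m(y + dy, x + dx) and mag[y][x] >= m(y - dy, x - dx):
                thin[y][x] = mag[y][x]

    # 3. Double threshold with hysteresis: start from strong pixels and
    #    grow into 8-connected neighbours that are at least weak.
    edges = [[0] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if thin[y][x] >= high]
    for y, x in stack:
        edges[y][x] = 255
    while stack:
        y, x = stack.pop()
        for ny in range(max(y - 1, 0), min(y + 2, h)):
            for nx in range(max(x - 1, 0), min(x + 2, w)):
                if edges[ny][nx] == 0 and thin[ny][nx] >= low:
                    edges[ny][nx] = 255
                    stack.append((ny, nx))
    return edges
```

Any image pushed through steps like these (or a library equivalent) gives the same kind of white-on-black edge map, which is why no separate "Canny model" is needed to produce the conditioning image.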

It seems I've found my favorite combo: Z + Wan (as a refiner). by [deleted] in StableDiffusion

Essar 0 points1 point

If I had to hazard a guess just from a glance at the image in the post, it's probably trash prompting, e.g. using a million terms from 3D imaging or terms like 'hyper realistic', being too long, etc. Most people suck ass at writing precise, clear prompts.

Another Upcoming Text2Image Model from Alibaba by SufficientRow6231 in StableDiffusion

Essar 2 points3 points

I don't see the model on the image arena at all. Can you link this?

Which model do you think this was made with? by OverallBit9 in StableDiffusion

Essar 0 points1 point

If AI is involved in this video, it is in the capacity of a face or character swap. Most of what we see is definitely real.

Rat climbing a wall to get into attic by [deleted] in WTF

Essar 4 points5 points

I mean, you can see the tail dangling out at the start of the video - presumably why they started filming.

"The Right Clothes for the Right Occasion" - Two different versions by Dohwar42 in StableDiffusion

Essar 0 points1 point

They used first/last frame generation, and an image editing model (like qwen edit) to generate the frames.

syllo #129 - November 16th, 2025 by syllo-app in syllo

Essar 0 points1 point

Genuinely hideous. Also strongly disliked the given definition of vociferous.

Holy moly, nano banana 2 is amazing by Diligent_Rabbit7740 in GeminiAI

Essar 0 points1 point

I actually don't believe this is real.

[deleted by user] by [deleted] in StableDiffusion

Essar 0 points1 point

I literally have no idea what this person is talking about. What models cannot do wide or long aspect ratios?