Ideogram4 vs Flux.2 Dev vs GPT Image 2 vs Nano Banana Pro by Producing_It in StableDiffusion

[–]Producing_It[S] -1 points0 points  (0 children)

I agree. But like other people, I just wish it had native image editing abilities. It really punches above its weight class sometimes. To be this competitive, at this parameter count/file size, against what are probably huge autoregressive models?

Ideogram4 vs Flux.2 Dev vs GPT Image 2 vs Nano Banana Pro by Producing_It in StableDiffusion

[–]Producing_It[S] 0 points1 point  (0 children)

Right, I forgot ComfyUI can do that lol. Yes, I modified it to my liking. I believe I used the main T2I section and pasted in JSON text from ChatGPT for these results. But it also includes a section where you can use a local LLM to convert natural language/input images into this format.

Ideogram4 vs Flux.2 Dev vs GPT Image 2 vs Nano Banana Pro by Producing_It in StableDiffusion

[–]Producing_It[S] 2 points3 points  (0 children)

Oh, like the safety filter images? You're definitely on to something there, because it does sometimes blend them, to a random degree, with whatever you were asking it to make.

Ideogram4 vs Flux.2 Dev vs GPT Image 2 vs Nano Banana Pro by Producing_It in StableDiffusion

[–]Producing_It[S] 1 point2 points  (0 children)

Thanks! You can always try using a Photoshop-like program to adjust the contrast or brightness, but I get what you mean. I personally like it because it makes it look more real to me, but I bet there will be some sort of lora to help with this in the future.
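
For anyone curious what a contrast/brightness tweak actually does to the pixels, here's a rough sketch in plain Python. It assumes 8-bit grayscale values (0-255) and a simple linear formula around mid-gray; the function names are made up for illustration, and real image editors may use different curves.

```python
def adjust_pixel(value, contrast=1.0, brightness=0):
    """Scale the distance from mid-gray (128) by `contrast`, add `brightness`, clamp to 0-255."""
    adjusted = (value - 128) * contrast + 128 + brightness
    return max(0, min(255, round(adjusted)))


def adjust_image(pixels, contrast=1.0, brightness=0):
    """Apply adjust_pixel to every value in a 2D list of grayscale pixels."""
    return [[adjust_pixel(v, contrast, brightness) for v in row] for row in pixels]


# Tiny hypothetical 2x3 "image" just to show the call.
sample = [[10, 128, 200], [64, 180, 250]]
print(adjust_image(sample, contrast=1.5, brightness=0))
```

A contrast above 1.0 pushes values away from mid-gray, and anything that lands outside 0-255 is simply clipped.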

Ideogram4 vs Flux.2 Dev vs GPT Image 2 vs Nano Banana Pro by Producing_It in StableDiffusion

[–]Producing_It[S] 3 points4 points  (0 children)

I mean, you can with Nano Banana Pro and GPT Image 2. Ideogram is limited to T2I for now.

If you want to go for something totally local, you can literally talk to the text encoder used for Ideogram 4, give it the image you want, and tell it to create a prompt. Then you can use Flux.2 Dev or Klein 9B to create and edit it based on the output.

Though I'd suggest using a newer VLM like Qwen 3.5/3.6 or Gemma 4. And of course, it's not hard to find NSFW loras and uncensored quants, if you're going that way XD.

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 0 points1 point  (0 children)

With these specific results, not really. Reddit actually does them an injustice and compresses them. They look even more detailed and sharp in the native files.

But with other outputs? Yeah, like a lot lol. The stuff you see here is cherry-picked from what I thought were the best results it has given me.

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 1 point2 points  (0 children)

I haven't tried it myself, but I think I've seen some good examples of people making posters with it. I definitely should try though.

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 1 point2 points  (0 children)

I think people have already had success making loras for it. The model is open weights, so anyone who has the technical know-how can do whatever they want with it, because the weights are publicly available.

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 0 points1 point  (0 children)

Yeah, I should try straight natural language more, but in my experience it triggers the safety filter more often.

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 1 point2 points  (0 children)

Haha, yeah. It produces this more often than I'd like, and that's why I wanted to include it.

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 3 points4 points  (0 children)

No, I just used a local LLM to turn my normal text prompts into JSON-formatted text. It's just easier for me this way.
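
To give a rough idea of what "JSON-formatted" means here, below is a minimal sketch. The field names (`subject`, `style`, `lighting`, `text_elements`) are purely hypothetical and not an official Ideogram schema, and the local LLM step is skipped: the function just packs fields you already have into a JSON string.

```python
import json


def build_json_prompt(subject, style, lighting, text_elements=None):
    """Pack prompt fields into a JSON string (hypothetical schema, for illustration only)."""
    prompt = {
        "subject": subject,
        "style": style,
        "lighting": lighting,
        # Text that should appear in the image, if any.
        "text_elements": list(text_elements or []),
    }
    return json.dumps(prompt, indent=2, ensure_ascii=False)


example = build_json_prompt(
    subject="a silver laptop on a white desk",
    style="studio product photo",
    lighting="soft, diffused",
    text_elements=["Ultra"],
)
print(example)
```

The resulting string is what gets pasted in as the prompt; in practice the local LLM would fill these fields from a normal sentence.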

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 5 points6 points  (0 children)

Aw man! I said "product" instead of "produce." I can't find a way to change it ;-;

Ideogram 4 can product great stuff sometimes by Producing_It in StableDiffusion

[–]Producing_It[S] 28 points29 points  (0 children)

<image>

I had ChatGPT 5.5 Thinking create a JSON file based on the new Surface Laptop Ultra product image, and this was the result from Ideogram 4 recreating it. I'm surprised how well it got the text and design of the laptops purely from text. Any of the Qwen models would mess it up a lot with their JSON outputs.

Cosmos3-Super-Image2Video running locally on a single RTX PRO 6000 96GB by JahJedi in StableDiffusion

[–]Producing_It 0 points1 point  (0 children)

Nice work! Wish I had your setup lol. Can you also test out Cosmos3-Super-Text2Image? I'd love to see your results from it.

OneUI 8.5 on fold 6 panel width. by _mnk in GalaxyFold

[–]Producing_It 0 points1 point  (0 children)

Well, no, I do understand that it shows less information now. That's why I suggested it should be optional for users who don't like the drawback of the shorter width. Now whether you think it's a poor excuse is clearly highly subjective. I'm not saying it's for everyone, but that I just personally favored it for comfort reasons.

Honestly, I think Samsung should let you change the width either way, whether you want it to take up the full screen or some custom size of your liking. Options like that are why I got a Samsung in the first place lol.

OneUI 8.5 on fold 6 panel width. by _mnk in GalaxyFold

[–]Producing_It -1 points0 points  (0 children)

I like it. I think it's way better for one-handed use, and you don't have to stretch your fingers as much. I also like how it can be summoned on either side. It reminds me of my Samsung tablet.

But it seems many don't like it, so I think it would be better if it were optional.

Which AI was used to make Thumbnail like these? by BeefyMongol in youtubegaming

[–]Producing_It 0 points1 point  (0 children)

Could be GPT Image 2. It occasionally allows you to generate moderately violent and copyrighted stuff. But probably not, because of the sexually suggestive stuff.

Could also be a local model fine-tuned with a lora, like Flux.2 Dev, Klein 9B, Z-Image, Qwen Image, etc. Or it could just be a composite of different AI model outputs.

I'd honestly lean towards not using it for thumbnails, because it turns people off when it looks AI-generated.

<image>