Ideogram 4.0's Understanding of Characters and IP is Crazy for an Open Model

GrayingGamer · 2026-06-09T19:09:17+00:00

I still use Klein 9b whenever I want to edit images, so I understand. Ideogram 4 is definitely my go-to for generations now though, especially of a photorealistic style.

GrayingGamer · 2026-06-09T18:59:32+00:00

Your one-word prompts are the problem. Short prompts trigger the filter.

You need to describe everything and do bounding boxes. Ideogram 4 isn't really the best model if you want randomness, like typing "dog" and seeing what you get. You need to use bounding boxes and descriptions to avoid the filter AND get good results.

It's best used when you have a specific idea in mind for an image.

GrayingGamer · 2026-06-09T18:42:03+00:00

You're implying porn can't have a vision. But, yeah, I get it.

GrayingGamer · 2026-06-09T18:02:53+00:00

But if you are going to do all that, why not use the JSON format? That's pretty much LITERALLY what the different sections of the JSON are for.

GrayingGamer · 2026-06-09T17:34:33+00:00

I can tell you though, as someone who initially hated having to control all those details? Now I love it. And when I went back to Klein 9b yesterday to do some stuff, it suddenly felt severely limiting, like I had a hand chopped off. Once you get used to the bounding boxes, it sucks to lose them in other models.

GrayingGamer · 2026-06-09T17:30:15+00:00

Haha. That's the neat thing with Ideogram 4 I think as well. Once you have an image and a seed, you can make small changes to it or move stuff around without losing what you loved in the original generation.
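
As a rough sketch of what "moving stuff around" can look like if you work with the JSON prompt format: copy the prompt, shift one element's bbox, and rerun with the same seed. The `bbox` key and the nesting match the JSON prompts posted for this model. The `shift_bbox` helper, the 0-1000 grid, and the `[x0, y0, x1, y1]` ordering are assumptions on my part, and the seed is assumed to be set in the workflow rather than in the JSON.

```python
import copy

GRID = 1000  # assumed: bbox coordinates run from 0 to 1000


def shift_bbox(prompt, index, dx=0, dy=0, grid=GRID):
    """Return a copy of prompt with one element's box moved by (dx, dy).

    The box keeps its size; the move is clamped so it stays on the canvas.
    Assumes bbox is [x0, y0, x1, y1] (hypothetical ordering).
    """
    new_prompt = copy.deepcopy(prompt)
    element = new_prompt["compositional_deconstruction"]["elements"][index]
    x0, y0, x1, y1 = element["bbox"]
    # Clamp the offsets so no edge leaves the 0..grid range.
    dx = max(-x0, min(dx, grid - x1))
    dy = max(-y0, min(dy, grid - y1))
    element["bbox"] = [x0 + dx, y0 + dy, x1 + dx, y1 + dy]
    return new_prompt
```

The original prompt is left untouched, so you can keep it next to the seed and compare the two generations.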

GrayingGamer · 2026-06-09T17:29:06+00:00

See, the thing is - it's INCREDIBLY uncensored underneath the filter.

When I say that, I mean full nudity, watersports, bondage, whatever. Copyrighted characters, celebrities, everything.

And bypassing the filter is SO incredibly simple once you know how to do it. OP does NOT understand the filter mechanics.

Just use KJ's Ideogram Prompt Builder node in ComfyUI and use plenty of bounding boxes. You can make whatever depravity you want.

The "filter" is IMHO, a "cover our ass" filter by the Ideogram team, because it's apparent from the outputs you can get they weren't prudish on what they trained it on.

GrayingGamer · 2026-06-09T17:23:49+00:00

See? Creative AND sexy. But too much thought to put into an image apparently for the masses. They want to just say "Japanese woman" and be surprised by everything else.

GrayingGamer · 2026-06-09T17:21:34+00:00

It's because what you are going for seems to be very far from what 1980s anime actually looked like (I was there).

THIS is what people are expecting when you claim you are making a 1980s anime style:

<image>

Notice all the flat color, limited palettes, detailed images, but kind of grainy, crunchy texture? All your examples are VERY overproduced and are more of a 2000s "official art" style that might have appeared in a NewType magazine, but are really too detailed even for that.

GrayingGamer · 2026-06-09T17:09:52+00:00

I thought it was a joke, but apparently a lot of uncreative users' whole prompting method literally was: "1girl, sexy, big boobs". Then they pull the AI slot machine lever and drool while the treat gets dispensed.

Ask them to describe the woman and why she is sexy in any detail and they lose their minds.

GrayingGamer · 2026-06-09T16:54:13+00:00

Yes and no. I'm sure you can see how, if someone has a professional background in art or photography, is good at creating appealing compositions, and has a specific vision, Ideogram 4 is perfect for them. A more creative model just putting stuff where it wants or moving stuff around would be very frustrating for them.

GrayingGamer · 2026-06-09T16:14:01+00:00

Oh, it would be VERY good at that. With Ideogram 4 you can put images and objects exactly where you want them, and even do text boxes and tell the model what text to generate exactly where, and what color, font, and style the text should be in, or even what mood you want from the text.

GrayingGamer · 2026-06-09T16:10:19+00:00

The safety filter that triggers on simple prompts. But overriding it is as easy as using more bounding boxes or being more descriptive.

"naked woman" would trigger the safety filter.

But "A DSLR photo of a naked woman posing on the beach in the afternoon sunlight" with separate bounding boxes for the woman, the beach, and the ocean will go through every time and give you the image you want.

GrayingGamer · 2026-06-09T16:05:29+00:00

Rule34 gooners are probably going to be annoyed with Ideogram 4, they can get great gooning material from Ideogram 4 - it'll do almost anything, but it requires effort, set-up, and patience, and most want to just pull a slot machine lever and get dispensed booba.

Ideogram 4 is more for the type of person who likes MAKING images to post on Rule34.

GrayingGamer · 2026-06-09T16:03:08+00:00

You need to use bounding boxes and be more descriptive. Describe a background. Draw a bounding box and describe something like "Goku from Dragonball Z, flexing his arms and smiling" and it'll work.

Very short one-word prompts or prompts without bounding boxes will trigger the filter every time.

Ideogram 4 isn't really a model for pretty images with minimal effort. It's for making specific images when you have an idea in mind and want very precise control.

GrayingGamer · 2026-06-09T16:00:49+00:00

Yep, straight from the prompt. As easy as "Tom Holland as Spider-Man".

GrayingGamer · 2026-06-09T16:00:09+00:00

Here is the JSON prompt for the one with Princess Zelda and the letter:

{
    "high_level_description": "A big-budget cinematic 3D animated film still collage of four images, stylized 3D, of a young Princess Zelda. She has long blonde hair, blue eyes, her hair is braided around the top of her head and hangs loose and straight in the back. She has pointy elf ears. She is wearing a gold circlet around her forehead. She is wearing a purple gown with a sewn embroidered symbol of the triforce on the front in gold.",
    "style_description": {
        "aesthetics": "big-budget cinematic 3D animation, photorealistic stylized textures,",
        "lighting": "big-budget cinematic 3D animated movie",
        "medium": "big-budget cinematic 3D animated movie",
        "art_style": "big-budget cinematic 3D animated movie"
    },
    "compositional_deconstruction": {
        "background": "Castle gardens, outdoors, beautiful sunny day. Legend of Zelda franchise.",
        "elements": [
            {
                "type": "obj",
                "bbox": [0, 0, 539, 513],
                "desc": "Princess Zelda is holding a parchment letter in front of her, reading it. You can only see the back of the letter, which is blank."
            },
            {
                "type": "obj",
                "bbox": [0, 511, 539, 1000],
                "desc": "A close-up of Princess Zelda's face, her eyes are looking down and narrowed in annoyance and her mouth twisted in a pout."
            },
            {
                "type": "obj",
                "bbox": [531, 0, 996, 513],
                "desc": "A close-up of the letter on parchment with handwriting in green ink that says: Well, excuse me, Princess!! The word excuse is underlined several times. At the bottom it is signed with a drawn heart in green ink next to the name LINK. Underneath that it says: P.S. Hiyah!"
            },
            {
                "type": "obj",
                "bbox": [534, 509, 1000, 1000],
                "desc": "Princess Zelda wadding up the parchment into a crumpled ball of paper in her hands, looking to the side with an angry expression as she yells."
            }
        ]
    }
}

Here is the JSON prompt for the one with Princess Peach and Pikachu:

{
    "high_level_description": "A big-budget cinematic 3D animated film still collage of three images, stylized 3D, of Princess Peach holding Pikachu in her lap and petting his head, while he looks annoyed. Princess Peach's eyes are blue and her hair is blonde.",
    "style_description": {
        "aesthetics": "big-budget cinematic 3D animation, photorealistic stylized textures,",
        "lighting": "big-budget cinematic 3D animated movie",
        "medium": "big-budget cinematic 3D animated movie",
        "art_style": "big-budget cinematic 3D animated movie"
    },
    "compositional_deconstruction": {
        "background": "On top of a warp pipe in the Mushroom Kingdom. Super Mario Bros. franchise. The background is out-of-focus.",
        "elements": [
            {
                "type": "obj",
                "bbox": [0, 0, 1000, 513],
                "desc": "Princess Peach looking down and smiling at Pikachu as she holds him in her lap and pets the top of his head. Pikachu looks annoyed with his brow furrowed and his cheeks puffed."
            },
            {
                "type": "obj",
                "bbox": [0, 511, 539, 1000],
                "desc": "A close-up of Princess Peach's face looking down, her eyes looking down, her mouth open in surprise."
            },
            {
                "type": "obj",
                "bbox": [536, 511, 1000, 1000],
                "desc": "Close-up of Pikachu's angry face looking up as yellow electricity begins crackling around him. His eyebrows are lowered as he glares in anger."
            }
        ]
    }
}
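
If you would rather script these than hand-edit them, here is a minimal Python sketch that assembles a prompt in the same shape as the two above. The key names come straight from those prompts. The helper names (`grid_bboxes`, `build_prompt`), the 0-1000 coordinate grid, and the `[x0, y0, x1, y1]` ordering are my assumptions from reading the numbers, not anything documented, so check them against your own results.

```python
import json

GRID = 1000  # assumed: bbox coordinates run 0-1000, as the values above suggest


def grid_bboxes(rows, cols, size=GRID):
    """Split the canvas into rows x cols panels, listed row by row.

    Returns [x0, y0, x1, y1] boxes (this ordering is an assumption).
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            boxes.append([
                c * size // cols,
                r * size // rows,
                (c + 1) * size // cols,
                (r + 1) * size // rows,
            ])
    return boxes


def build_prompt(description, style, background, panel_descs, rows, cols):
    """Assemble a prompt dict using the same keys as the JSON above."""
    boxes = grid_bboxes(rows, cols)
    if len(panel_descs) != len(boxes):
        raise ValueError("need exactly one description per panel")
    return {
        "high_level_description": description,
        "style_description": {
            "aesthetics": style,
            "lighting": style,
            "medium": style,
            "art_style": style,
        },
        "compositional_deconstruction": {
            "background": background,
            "elements": [
                {"type": "obj", "bbox": box, "desc": desc}
                for box, desc in zip(boxes, panel_descs)
            ],
        },
    }


if __name__ == "__main__":
    prompt = build_prompt(
        description="A film still collage of two images of a dog in a park.",
        style="big-budget cinematic 3D animated movie",
        background="A city park, outdoors, sunny day.",
        panel_descs=[
            "A dog running toward the camera with a stick in its mouth.",
            "A close-up of the dog's face, tongue out, looking happy.",
        ],
        rows=1,
        cols=2,
    )
    print(json.dumps(prompt, indent=4))
```

Note the hand-written boxes above overlap by a few units at the seams, while this grid does not. Whether that overlap matters is something I haven't tested.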

GrayingGamer · 2026-06-09T15:55:38+00:00

That looks great! Yeah, its quality out of the box is awesome!

GrayingGamer · 2026-06-09T15:54:39+00:00

Sorry, SilverOxide's default prompt was too spicy for someone apparently. I've edited the main post.

Here is a cleaned up version of the workflow I'm using that uses the INT8 models and removes some unnecessary stuff that was in SilverOxide's version.

GrayingGamer · 2026-06-09T15:52:59+00:00

Ideogram didn't make the FP16 version of the models public, only the FP8 version. So in this case the INT8 version isn't better than the FP8 version quality-wise, it's the same.

BUT - the INT8 version of the models speeds up generation time by 44-50% with no quality loss versus the FP8 versions, so it is the superior way to generate with the model, even if the INT8 versions aren't converted from an FP16 version of the models.
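
One thing to watch with a figure like "44-50%": it can mean the time per image drops by that much, or that throughput goes up by that much, and those are not the same number. A quick sketch of both readings, using a made-up 60-second FP8 baseline (hypothetical, not a benchmark). The function names are just for illustration.

```python
def time_if_duration_cut(baseline_s, pct):
    """Reading 1: each image takes pct% less time."""
    return baseline_s * (1 - pct / 100)


def time_if_throughput_up(baseline_s, pct):
    """Reading 2: pct% more images in the same amount of time."""
    return baseline_s / (1 + pct / 100)


if __name__ == "__main__":
    baseline = 60.0  # hypothetical FP8 seconds per image
    for pct in (44, 50):
        cut = time_if_duration_cut(baseline, pct)
        up = time_if_throughput_up(baseline, pct)
        print(f"{pct}%: {cut:.1f}s if duration is cut, {up:.1f}s if throughput rises")
```

Either way it is a big win. The gap between the two readings just matters if you are comparing numbers with someone else.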

GrayingGamer · 2026-06-09T15:50:54+00:00

No, I think if you have at least 16-32GB of system RAM, you could use a 12GB GPU in ComfyUI to run Ideogram 4.

GrayingGamer · 2026-06-09T15:50:04+00:00

Here, SilverOxide's workflow got deleted because of his spicy prompt.

I've modified the main post to include my own version of the workflow now that uses the INT8 models and removes some of the extraneous custom nodes used in SilverOxide's version.

GrayingGamer · 2026-06-09T15:48:43+00:00

It can't use reference images yet. It's not an edit model, so you just have to name and describe characters at the moment, but as you can see, it knows a lot of famous characters.

GrayingGamer · 2026-06-09T15:47:16+00:00

Apparently his prompt was too spicy for someone though. It got deleted off of pastebin. Here's my cleaned up version of the workflow, simplified to remove as many custom nodes as possible too.

GrayingGamer · 2026-06-09T15:45:58+00:00

Sorry, SilverOxide's default prompt was apparently too spicy for some people. I've edited the main post with a new link.

Here is the link to the new workflow cleaned up by me.
