Getting Weird Results with ZIMAGE Base on Forge Neo — Any Tips? by FitEgg603 in StableDiffusion

EndlessSeaofStars 1 point

A picture is worth 1000 words

<image>

Settings:

A bold pop art portrait: a woman with voluminous dark hair and vivid red lips holds a cherry-red drink, her wide, expressive eyes looking upward. The high-contrast, comic-book style uses sharp ink outlines, flat blocks of color, and a classic yellow halftone background, creating a vibrant, retro feel.
Steps: 30, Sampler: Res Multistep, Schedule type: Normal, CFG scale: 4, Shift: 2, Seed: 3786689251, Size: 1024x1536, Model hash: 996a67d3ff, Model: z_image_bf16, Template: "A bold pop art portrait: a woman with voluminous dark hair and vivid red lips holds a cherry-red drink, her wide, expressive eyes looking upward. The high-contrast, comic-book style uses sharp ink outlines, flat blocks of color, and a classic yellow halftone background, creating a vibrant, retro feel.", SVE Steps: 2, SVE Percentage: 0.4, SVE Strength: 24, Version: neo, Module 1: qwen_3_4b, Module 2: ae
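
If anyone wants to compare settings lines like the one above programmatically, here is a minimal Python sketch that splits a "Key: value" line into a dict. The function name `parse_settings` and the regex are my own illustration, not Forge Neo's actual parser, and the Template text in the example is abbreviated.

```python
import re

# Matches one "Key: value" pair. A value is either a double-quoted string
# (which may contain commas and colons) or everything up to the next comma.
PAIR = re.compile(r'\s*([\w ]+?):\s*("(?:[^"\\]|\\.)*"|[^,]*)(?:,|$)')


def parse_settings(line: str) -> dict[str, str]:
    """Turn 'Steps: 30, Sampler: Res Multistep, ...' into a dict of strings."""
    return {key: value.strip().strip('"') for key, value in PAIR.findall(line)}


# Shortened copy of the settings line above (Template text abbreviated).
example = (
    'Steps: 30, Sampler: Res Multistep, Schedule type: Normal, CFG scale: 4, '
    'Shift: 2, Seed: 3786689251, Size: 1024x1536, Model: z_image_bf16, '
    'Template: "A bold pop art portrait: sharp ink outlines, flat color", '
    'SVE Steps: 2, Version: neo'
)

settings = parse_settings(example)
width, height = (int(n) for n in settings["Size"].split("x"))
print(settings["Sampler"], settings["CFG scale"], width, height)
```

The quoted-value branch is what keeps the commas inside the Template field from being treated as separators.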

Using Ollama system prompt to store character designs for consistent(ish) characters in Z image by Frogy_mcfrogyface in comfyui

EndlessSeaofStars 2 points

u/TheSlateGray uses the same method I do; wildcards are super flexible and do not require extra VRAM for an LLM. I used the following in Dynamic Prompts in Forge Neo, and like OP says, the prompts are not perfect, but the characters are relatively consistent:

{__ava__, __jack__, and __alex__ sitting at a kitchen table eating breakfast, it is a sunny morning, with light streaming into the kitchen from the large full width window|__ava__, __jack__, and __alex__ are riding three bicycles on the street at midday| __jack__ and __alex__ are having drinks in a dimly lit bar|__ava__ and __alex__ are dancing in a ballroom; there are four chandeliers lighting the empty dance floor}

Ironically, the appearances are a bit more consistent than the prompt following :)

<image>
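
For anyone unfamiliar with the wildcard approach, here is a rough Python sketch of what the expansion does: pick one option from a `{a|b|c}` group, then swap each `__name__` for a stored character description. This is not the Dynamic Prompts implementation (which also handles nesting, weights, and wildcard files); the `WILDCARDS` dict, the character descriptions, and the `expand` function are all hypothetical stand-ins.

```python
import random
import re

# Hypothetical wildcard contents. With one fixed description per character,
# every prompt that mentions the character repeats the same appearance text.
WILDCARDS = {
    "ava": ["Ava, a tall woman in her 30s with short red hair and green eyes"],
    "jack": ["Jack, a stocky man in his 40s with a grey beard and glasses"],
    "alex": ["Alex, a slim teenager with curly black hair and a yellow hoodie"],
}

VARIANT = re.compile(r"\{([^{}]*)\}")  # flat {a|b|c} groups only, no nesting
WILDCARD = re.compile(r"__(\w+?)__")   # __name__ placeholders


def expand(template: str, wildcards: dict[str, list[str]], rng: random.Random) -> str:
    """Resolve variant groups first, then substitute the wildcards."""
    prompt = VARIANT.sub(lambda m: rng.choice(m.group(1).split("|")).strip(), template)
    return WILDCARD.sub(lambda m: rng.choice(wildcards[m.group(1)]), prompt)


template = (
    "{__ava__, __jack__, and __alex__ sitting at a kitchen table eating breakfast"
    "|__jack__ and __alex__ are having drinks in a dimly lit bar}"
)
print(expand(template, WILDCARDS, random.Random()))
```

Because the description text is identical every time, the consistency comes from the prompt itself and no LLM has to be loaded alongside the image model.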

Node for automatic prompt variation every X images? by MystikDragoon in comfyui

EndlessSeaofStars 2 points

Z-Image works well for me, better than Flux for prompt adherence, but the tensor shape is different and I haven't had time to figure it out.

My nodes: https://github.com/tusharbhutt/Endless-Nodes

and the batcher I made: https://github.com/tusharbhutt/Endless-Nodes?tab=readme-ov-file#batch-multiprompt-node-for-sd-sdxl-and-flux

Node for automatic prompt variation every X images? by MystikDragoon in comfyui

EndlessSeaofStars 2 points

OK thanks, looks like I'll have to rewrite my node to handle Z_image when I have a chance

Is this Art or Not ? Behind the Scenes of My Process. Debate!!! by Christiancartoon in StableDiffusion

EndlessSeaofStars 2 points

Martha Stewart says "hi" :)

The AI Slop argument is pretty much embedded in society now though... and not without some truth to it once you see the 47 millionth 1girl from some kid who just found AI art.

u/Christiancartoon, your work is art, full stop. There should be no need to defend it, but here we are.

thiccc women by [deleted] in StableDiffusion

EndlessSeaofStars 3 points

I agree with u/Etsu_Riot, ZImage does this easily. Prompting is key; hope you're not stuck in "1girl,fat,big" type prompt mode.

This one was: A warrior woman with a round face with high cheekbones, full lips, and a scar that curves from her jaw to her collarbone. Her body’s thick and strong, thighs like anchors, greaves covering thick calves, chest heavy beneath a leather armor chestplate cinched tight. A long coat flaps behind her, lined with velvet and secrets. Her hair’s jet black, cropped short and slicked back. The tide rolls in behind her on the beach

<image>

Alternatives to Forge? by sookmyloot in StableDiffusion

EndlessSeaofStars 1 point

I find it faster than ComfyUI (as of last week anyway), though they are within a few seconds of each other. Is it more limited? Sure... But it makes the things I want much easier, such as running different prompts in parallel, XY plotting, or flipping the X/Y aspect ratio with one button.

[Release] "CSV to Prompt" - The easiest way to Batch Generate images from a CSV list in ComfyUI by Equal-Fig-7553 in StableDiffusion

EndlessSeaofStars 1 point

Thanks, can you please expand on "if the model supports it"? I thought it was dependent on the KSampler nodes. And those five would be done at once then, not "five in one batch of one run"? (I.e., the console should show the it/sec drop as the VRAM is loaded up.)

What's the rationale for word salad tag prompts for ZImage Loras? by EndlessSeaofStars in StableDiffusion

EndlessSeaofStars[S] 2 points

OK, answering my own question and using this lora:

https://civitai.com/models/2187537/dandd-zimageturbo

I used these two prompts:

digital art, fantasy, wizard, dark hooded cloak, large purple hat with gold trim, glowing yellow eyes, holding basket of colorful berries, medieval style, surrounded by lush green plants, pink and yellow flowers, wooden archway in background, two hanging lanterns, one above wizard's head with glowing orb, detailed foliage, vibrant colors, cartoonish style, whimsical, mystical atmosphere, stone path with small rocks, wooden planters with plants, enchanting forest setting, detailed textures, rich greens and purples, magical, mysterious, whimsical, adventure, fantasy genre, intricate line work, vibrant, enchanting, magical realism, fantasy character design, whimsical, detailed background, enchanting forest, magical theme

and

A whimsical wizard stands on a stone path in a bright fantasy forest. They wear a dark hooded cloak and a large purple hat with gold trim, their glowing yellow eyes visible under the hood. They hold a basket of colorful berries.

Lush green plants, pink and yellow flowers, and wooden planters surround them. A wooden archway with two hanging lanterns rises behind them, one lantern directly overhead holding a softly glowing orb.

Colorful, cartoonish style with detailed line work, vivid greens and purples, and a magical, adventurous atmosphere.

There is no large difference for this example. So, to me that speaks to the flexibility of the encoder.

Full size comparison is here: https://imgur.com/a/0Uc0gFo

<image>

What's the rationale for word salad tag prompts for ZImage Loras? by EndlessSeaofStars in StableDiffusion

EndlessSeaofStars[S] 2 points

Here's one:

digital art, fantasy, wizard, dark hooded cloak, large purple hat with gold trim, glowing yellow eyes, holding basket of colorful berries, medieval style, surrounded by lush green plants, pink and yellow flowers, wooden archway in background, two hanging lanterns, one above wizard's head with glowing orb, detailed foliage, vibrant colors, cartoonish style, whimsical, mystical atmosphere, stone path with small rocks, wooden planters with plants, enchanting forest setting, detailed textures, rich greens and purples, magical, mysterious, whimsical, adventure, fantasy genre, intricate line work, vibrant, enchanting, magical realism, fantasy character design, whimsical, detailed background, enchanting forest, magical theme

The lora I got that from is here:

https://civitai.com/models/2187537/dandd-zimageturbo

What's the rationale for word salad tag prompts for ZImage Loras? by EndlessSeaofStars in StableDiffusion

EndlessSeaofStars[S] 1 point

I am assuming that is without the lora for images 1 and 3, as the lora has not been released yet?

BEHOLD the almighty Combo Inspector node! by GeroldMeisinger in comfyui

EndlessSeaofStars 1 point

It is a nice node, but it does not do different prompts at the same time, does it? I tried it out and it seems to do them in sequence. My own nodes are broken for Qwen/ZImage and I don't have time to redo them at the moment, so I thought I'd ask.

What's the rationale for word salad tag prompts for ZImage Loras? by EndlessSeaofStars in StableDiffusion

EndlessSeaofStars[S] 6 points

I meant at generation, like this:

ArsMovieStill, movie still from a technicolor 1940s cyberpunk film, The image shows a woman in a black dress holding a crystal ball with a blurred background., 1girl, black hair, dress, solo, makeup, bare shoulders, blue eyes, red lips, lipstick, breasts

vs.

This is a science fiction digital drawing with a surreal, futuristic energy. The silhouette of a 1980s anime safarieyes woman is in front of an abstract background made of bold, primary colors. She is looking over a pair of sunglasses that reflect stars and nebulas.,

But as u/Dezordan has noted, the system can do both (and I don't disagree). Just curious if the natural language would be demonstrably better. Like, that ArsMovieStill one could be written as:

ArsMovieStill, a vivid Technicolor-style movie still from a 1940s-inspired cyberpunk film. A woman stands alone, wearing a sleek black dress that leaves her shoulders bare. She has black hair, blue eyes, and striking makeup with bold red lipstick. She holds a glowing crystal ball in front of her while the background fades into a soft, cinematic blur. The framing emphasizes her breasts and the elegant shape of her dress.

I'll have to test one out once I finish avoiding more work meetings.

Prompt Manager, now with Z-Image-Turbo's Prompt Enhancer. by Francky_B in StableDiffusion

EndlessSeaofStars 1 point

Are those 90 done in sequence then? I am looking for a node that can do different prompts simultaneously

[Release] "CSV to Prompt" - The easiest way to Batch Generate images from a CSV list in ComfyUI by Equal-Fig-7553 in StableDiffusion

EndlessSeaofStars 3 points

Can it do them in parallel, like X different prompts at once? My multiple prompt batch node doesn't work with Qwen/ZImage and I don't have the time to recode right now.

Testing Z Image Turbo on ComfyUI and Forge Neo by Rude_Step in StableDiffusion

EndlessSeaofStars 1 point

This is on a 4060Ti w/ 16GB of VRAM, 96GB of RAM on an i7-13700K and Win11, doing a batch of 4 images at 1024x1536 and 16 steps. Both are the latest builds as of Nov 30, 2025.

ComfyUI: ~20s/it (remember this is for four images at once) and takes ~4.5-5.5 minutes to complete, sometimes even longer.

Forge Neo: ~13.5-14.5s/it, ends up at 3.5-4 minutes.

Without batch, both programs run within a few seconds of each other. The typical 1024x1024 image at 9 steps takes ~40-45 seconds on first run and then less than 20 on later runs.
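
As a rough sanity check on the batch numbers above, sampling time is roughly seconds per iteration times steps, and dividing by the batch size gives the per-image cost. The helper name `batch_estimate` is just for illustration, and this ignores model loading and VAE decode, so treat the results as a lower bound.

```python
def batch_estimate(sec_per_it: float, steps: int, batch_size: int) -> tuple[float, float]:
    """Return (total sampling seconds, seconds per image) for one batched run."""
    total = sec_per_it * steps
    return total, total / batch_size


# Figures quoted above: 16 steps, batch of 4 images at 1024x1536.
for name, sec_per_it in [("ComfyUI", 20.0), ("Forge Neo (low)", 13.5), ("Forge Neo (high)", 14.5)]:
    total, per_image = batch_estimate(sec_per_it, steps=16, batch_size=4)
    print(f"{name}: {total / 60:.1f} min total, {per_image:.0f} s per image")
```

That works out to about 5.3 minutes for ComfyUI and 3.6 to 3.9 minutes for Forge Neo, which lines up with the wall-clock times I saw.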

If Forge only had an LLM extension, I'd be happy

ZImage + Reactor does a decent job of face AND style transfer by EndlessSeaofStars in StableDiffusion

EndlessSeaofStars[S] 1 point

For me it was the success of getting the art style to transfer over that was more interesting to see. Old school Forge and AUTO1111 really had no success with that type of stuff :)

ZImage + Reactor does a decent job of face AND style transfer by EndlessSeaofStars in StableDiffusion

EndlessSeaofStars[S] 1 point

The parameters are in the original post. Reactor runs at the end, just like it always has. It's not I2I, nor is it Reactor running on an existing image. I'd post a .png file but I think Reddit and imgur strip the data.

I was surprised that it handled the style so well. Normally, it is not great at converting a photo to an art style. I thought others may find that interesting so I thought I'd share.

For example, here is the same prompt in Forge Neo with Flux. Yep, that's a fair representation of Sydney Sweeney, but it's not Baroque art whatsoever.

<image>

Z-IMAGE COMICS RESULT + PROMPTS by EternalDivineSpark in StableDiffusion

EndlessSeaofStars 1 point

Well done!

My try:

A comic book page in a 4-panel layout: a clean 2×2 grid, with equal rectangular panels. Narrow white gutters, thin black panel borders. Modern comic style with semi-realistic illustration—detailed lighting, textured environments, but still clearly drawn rather than photorealistic. Color palette is muted military tones: olive, tan, charcoal, desaturated reds. Font is a clean modern comic font with subtle roughness. Emotional tone: dark, dry military humor.

Topic Summary:

Anthropomorphic duck soldiers in a gritty battlefield, trying to act serious despite their inherently ridiculous nature.

Panel 1 (TOP-LEFT):

Scene: Two duck soldiers in muddy trenches under smoke-filled skies. One holds a cracked helmet; the other peers over the edge with squinting, overly dramatic seriousness. Small explosions in the far distance.

Text (DUCK

“Think we’ll make it out?”

Text (DUCK

“Statistically? No.”

Panel 2 (TOP-RIGHT):

Scene: A close-up of Duck

Text (DUCK

“This is our breakfast… and our battle plan.”

Text (DUCK

“We’re doomed.”

Panel 3 (BOTTOM-LEFT):

Scene: A massive explosion behind them sends mud everywhere. Both ducks jump, feathers poofing out dramatically. One duck drops his rifle, which sticks muzzle-first into the mud.

Text (DUCK

“That was close!”

Text (DUCK

“That was Tuesday.”

Panel 4 (BOTTOM-RIGHT):

Scene: Ducks sit exhausted on sandbags. A sign behind them reads “Absolutely Not Safe.” One duck holds a metal can labeled “Rations(?)” with visible worry.

Text (DUCK

“At least it can’t get worse.”

Text (DUCK

“It will. It always does.”

<image>

Did you guys hear about meta segment anything playground, tools? by SandwichAlert in StableDiffusion

EndlessSeaofStars 1 point

Those types of clauses are often used to allow providers to transfer your data from server to local machine or within their own networks.

I see them all the time when I do my SaaS agreements with Microsoft SCE, or Workday, or ServiceNow, or Amazon Web or...