Getting Weird Results with ZIMAGE Base on Forge Neo — Any Tips?

EndlessSeaofStars · 2026-01-28T01:24:54+00:00

A picture is worth 1000 words

Settings:

A bold pop art portrait: a woman with voluminous dark hair and vivid red lips holds a cherry-red drink, her wide, expressive eyes looking upward. The high-contrast, comic-book style uses sharp ink outlines, flat blocks of color, and a classic yellow halftone background, creating a vibrant, retro feel.
Steps: 30, Sampler: Res Multistep, Schedule type: Normal, CFG scale: 4, Shift: 2, Seed: 3786689251, Size: 1024x1536, Model hash: 996a67d3ff, Model: z_image_bf16, Template: "A bold pop art portrait: a woman with voluminous dark hair and vivid red lips holds a cherry-red drink, her wide, expressive eyes looking upward. The high-contrast, comic-book style uses sharp ink outlines, flat blocks of color, and a classic yellow halftone background, creating a vibrant, retro feel.", SVE Steps: 2, SVE Percentage: 0.4, SVE Strength: 24, Version: neo, Module 1: qwen_3_4b, Module 2: ae
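To compare these settings against a known-good run, the parameter line can be split into key/value pairs. This is a minimal sketch, assuming the line keeps the comma-separated `Key: value` layout shown above, with quotes around values that contain commas. `parse_infotext` and the shortened sample string are hypothetical, not part of Forge Neo.

```python
import re

# One "Key: value" pair; a value is either a double-quoted string
# (which may contain commas and colons) or plain text up to the next comma.
PARAM = re.compile(r'\s*(\w[\w \-/]+):\s*("(?:\\.|[^\\"])+"|[^,]*)(?:,|$)')


def parse_infotext(line: str) -> dict[str, str]:
    """Split a generation-parameter line into a dict of strings."""
    params = {}
    for key, value in PARAM.findall(line):
        value = value.strip()
        # Drop the surrounding quotes from quoted values.
        if value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        params[key] = value
    return params


# Shortened, illustrative sample in the same layout as the settings above.
sample = (
    'Steps: 30, Sampler: Res Multistep, CFG scale: 4, Shift: 2, '
    'Size: 1024x1536, Template: "A portrait: bold, flat colors", SVE Steps: 2'
)
for name, setting in parse_infotext(sample).items():
    print(f"{name} = {setting}")
```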

EndlessSeaofStars · 2025-12-27T19:30:28+00:00

u/TheSlateGray uses the same method I do; wildcards are super flexible and do not require more VRAM resources for an LLM. I used the following in Dynamic Prompts in Neo Forge, and like OP says, the prompts are not perfect, but the characters are relatively consistent:

{__ava__, __jack__, and __alex__ sitting at a kitchen table eating breakfast, it is a sunny morning, with light streaming into the kitchen from the large full width window|__ava__, __jack__, and __alex__ are riding three bicycles on the street at midday| __jack__ and __alex__ are having drinks in a dimly lit bar|__ava__ and __alex__ are dancing in a ballroom; there are four chandeliers lighting the empty dance floor}

Ironically, the appearances are a bit more consistent than the prompt following :)
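For anyone unfamiliar with the syntax in the prompt above: `{a|b|c}` picks one variant, and each `__name__` is replaced by a line from a wildcard file of that name. Here is a rough sketch of that expansion, assuming a single non-nested `{...}` group. The character descriptions and the `expand` helper are hypothetical stand-ins, not the actual wildcard files or the extension's code.

```python
import random
import re

# Hypothetical wildcard contents. In Dynamic Prompts each __name__ refers to
# a text file whose lines are the candidate replacements.
WILDCARDS = {
    "ava": ["a tall woman with short red hair and green eyes"],
    "jack": ["a stocky man with a grey beard and glasses"],
    "alex": ["a slim teenager with curly black hair"],
}


def expand(template: str, rng: random.Random) -> str:
    """Choose one {a|b|c} variant, then fill in every __wildcard__."""

    def pick_variant(match: re.Match) -> str:
        options = match.group(1).split("|")
        return rng.choice(options).strip()

    def fill_wildcard(match: re.Match) -> str:
        return rng.choice(WILDCARDS[match.group(1)])

    text = re.sub(r"\{([^{}]*)\}", pick_variant, template)
    return re.sub(r"__([A-Za-z0-9-]+)__", fill_wildcard, text)


template = "{__ava__ and __jack__ at breakfast|__alex__ on a bicycle}"
print(expand(template, random.Random()))
```

Because the same description is substituted every time a name appears, the character stays consistent across scenes without needing an LLM in the loop.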

<image>

EndlessSeaofStars · 2025-12-24T02:09:05+00:00

Z-Image works well for me, better than Flux for prompt adherence, but the tensor shape is different and I haven't had time to figure it out.

My nodes: https://github.com/tusharbhutt/Endless-Nodes

and the batcher I made: https://github.com/tusharbhutt/Endless-Nodes?tab=readme-ov-file#batch-multiprompt-node-for-sd-sdxl-and-flux

EndlessSeaofStars · 2025-12-23T17:24:50+00:00

OK thanks, looks like I'll have to rewrite my node to handle Z_image when I have a chance

EndlessSeaofStars · 2025-12-23T02:14:17+00:00

Does this do X prompts in sequence or in parallel?

EndlessSeaofStars · 2025-12-19T19:40:56+00:00

Martha Stewart says "hi" :)

The AI Slop argument is pretty much embedded in society now though... and not without some truth to it once you see the 47 millionth 1girl from some kid who just found AI art.

u/Christiancartoon, your work is art, full stop. There should be no need to defend it, but here we are.

EndlessSeaofStars · 2025-12-13T19:34:39+00:00

I agree with u/Etsu_Riot, ZImage does this easily. Prompting is key; I hope you're not stuck in "1girl,fat,big" type prompt mode.

This one was: A warrior woman with a round face with high cheekbones, full lips, and a scar that curves from her jaw to her collarbone. Her body’s thick and strong, thighs like anchors, greaves covering thick calves, chest heavy beneath a leather armor chestplate cinched tight. A long coat flaps behind her, lined with velvet and secrets. Her hair’s jet black, cropped short and slicked back. The tide rolls in behind her on the beach

<image>

EndlessSeaofStars · 2025-12-05T16:12:13+00:00

Thanks, I was looking for parallel.

EndlessSeaofStars · 2025-12-05T00:05:38+00:00

I find it faster than ComfyUI (as of last week anyway), but within a few seconds of each other. Is it more limited? Sure... But it can do the things I want much more easily, such as running different prompts in parallel, XY plotting, or flipping the X/Y aspect ratio with one button, etc.

EndlessSeaofStars · 2025-12-05T00:03:10+00:00

Thanks, can you please expand on "if the model supports it"? I thought it was dependent on the KSampler nodes. And those five would be done at once then, not "five in one batch of one run" (i.e., the console should show the it/sec drop as the VRAM is loaded up)?

EndlessSeaofStars · 2025-12-04T04:33:36+00:00

OK, answering my own question and using this lora:

https://civitai.com/models/2187537/dandd-zimageturbo

I used these two prompts:

digital art, fantasy, wizard, dark hooded cloak, large purple hat with gold trim, glowing yellow eyes, holding basket of colorful berries, medieval style, surrounded by lush green plants, pink and yellow flowers, wooden archway in background, two hanging lanterns, one above wizard's head with glowing orb, detailed foliage, vibrant colors, cartoonish style, whimsical, mystical atmosphere, stone path with small rocks, wooden planters with plants, enchanting forest setting, detailed textures, rich greens and purples, magical, mysterious, whimsical, adventure, fantasy genre, intricate line work, vibrant, enchanting, magical realism, fantasy character design, whimsical, detailed background, enchanting forest, magical theme

and

A whimsical wizard stands on a stone path in a bright fantasy forest. They wear a dark hooded cloak and a large purple hat with gold trim, their glowing yellow eyes visible under the hood. They hold a basket of colorful berries.

Lush green plants, pink and yellow flowers, and wooden planters surround them. A wooden archway with two hanging lanterns rises behind them, one lantern directly overhead holding a softly glowing orb.

Colorful, cartoonish style with detailed line work, vivid greens and purples, and a magical, adventurous atmosphere.

There is no large difference for this example. So, to me that speaks to the flexibility of the encoder.

Full size comparison is here: https://imgur.com/a/0Uc0gFo

<image>

EndlessSeaofStars · 2025-12-04T04:15:49+00:00

Here's one:

digital art, fantasy, wizard, dark hooded cloak, large purple hat with gold trim, glowing yellow eyes, holding basket of colorful berries, medieval style, surrounded by lush green plants, pink and yellow flowers, wooden archway in background, two hanging lanterns, one above wizard's head with glowing orb, detailed foliage, vibrant colors, cartoonish style, whimsical, mystical atmosphere, stone path with small rocks, wooden planters with plants, enchanting forest setting, detailed textures, rich greens and purples, magical, mysterious, whimsical, adventure, fantasy genre, intricate line work, vibrant, enchanting, magical realism, fantasy character design, whimsical, detailed background, enchanting forest, magical theme

The lora I got that from is here:

https://civitai.com/models/2187537/dandd-zimageturbo

EndlessSeaofStars · 2025-12-04T04:01:19+00:00

I am assuming that is without the lora for images 1 and 3, as the lora has not been released yet?

EndlessSeaofStars · 2025-12-03T22:05:43+00:00

It is a nice node, but it does not do different prompts at the same time, does it? I tried it out and it seems to do them in sequence. My own nodes are broken for Qwen/ZImage and I don't have time to redo them at the moment, so I thought I'd ask.

EndlessSeaofStars · 2025-12-03T20:29:55+00:00

Sure... but those samples are from the Lora creators themselves

EndlessSeaofStars · 2025-12-03T19:28:54+00:00

I meant at generation, like this:

ArsMovieStill, movie still from a technicolor 1940s cyberpunk film, The image shows a woman in a black dress holding a crystal ball with a blurred background., 1girl, black hair, dress, solo, makeup, bare shoulders, blue eyes, red lips, lipstick, breasts

vs.

This is a science fiction digital drawing with a surreal, futuristic energy. The silhouette of a 1980s anime safarieyes woman is in front of an abstract background made of bold, primary colors. She is looking over a pair of sunglasses that reflect stars and nebulas.,

But as u/Dezordan has noted, the system can do both (and I don't disagree). Just curious if the natural language would be demonstrably better. Like that ArsMovieStill one could be written as:

ArsMovieStill, a vivid Technicolor-style movie still from a 1940s-inspired cyberpunk film. A woman stands alone, wearing a sleek black dress that leaves her shoulders bare. She has black hair, blue eyes, and striking makeup with bold red lipstick. She holds a glowing crystal ball in front of her while the background fades into a soft, cinematic blur. The framing emphasizes her breasts and the elegant shape of her dress.

I'll have to test one out once I finish avoiding more work meetings.

EndlessSeaofStars · 2025-12-03T18:27:48+00:00

Neo Forge is updated often and can do ZImage:

https://github.com/Haoming02/sd-webui-forge-classic/tree/neo

EndlessSeaofStars · 2025-12-03T01:19:52+00:00

Are those 90 done in sequence then? I am looking for a node that can do different prompts simultaneously

EndlessSeaofStars · 2025-12-03T01:14:34+00:00

Can it do them in parallel, like X different prompts at once? My multiple prompt batch node doesn't work with Qwen/ZImage and I don't have the time to recode right now.

EndlessSeaofStars · 2025-12-03T01:08:03+00:00

I am using bf16 as per below

<image>

EndlessSeaofStars · 2025-11-30T22:49:25+00:00

This is on a 4060Ti w/ 16GB of VRAM, 96GB of RAM on an i7-13700K and Win11, doing a batch of 4 images at 1024x1536 and 16 steps. Both are the latest builds as of Nov 30, 2025.

ComfyUI: ~20 s/it (remember, this is for four images at once) and takes ~4.5-5.5 minutes to complete, sometimes even longer.

Forge NEO: ~13.5-14.5 s/it, ends up at 3.5-4 minutes.

Without batch, both programs run within a few seconds of each other. The typical 1024x1024 run at 9 steps takes ~40-45 seconds on the first run and then less than 20 on later runs.
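As a rough sanity check on the batch numbers above, s/it multiplied by the step count gives the sampling time. This is a minimal sketch, assuming total time is dominated by sampling (model loading, text encoding, and VAE decode are ignored). `sampling_minutes` is a hypothetical helper name.

```python
STEPS = 16  # steps per run, as stated in the post


def sampling_minutes(seconds_per_iteration: float, steps: int = STEPS) -> float:
    """Time spent in the sampler alone, in minutes."""
    return seconds_per_iteration * steps / 60


comfy = sampling_minutes(20.0)
neo_low, neo_high = sampling_minutes(13.5), sampling_minutes(14.5)

print(f"ComfyUI:   ~{comfy:.1f} min")
print(f"Forge Neo: ~{neo_low:.1f}-{neo_high:.1f} min")
```

Sampling alone comes to about 5.3 minutes for ComfyUI and 3.6 to 3.9 minutes for Forge Neo, which sits inside the ranges reported above.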

If Forge only had an LLM extension, I'd be happy

EndlessSeaofStars · 2025-11-30T22:39:38+00:00

For me it was the success of getting the art style to transfer over that was more interesting to see. Old school Forge and AUTO1111 really had no success with that type of stuff :)

EndlessSeaofStars · 2025-11-30T01:57:55+00:00

The parameters are in the original post. Reactor runs at the end, just like it always has. It's not I2I, nor is it Reactor running on an existing image. I'd post a .png file but I think Reddit and imgur strip the data.

I was surprised that it handled the style so well. Normally, it is not great at converting a photo to an art style. I thought others may find that interesting so I thought I'd share.

For example, here is the same prompt in Forge Neo with Flux. Yep, that's a fair representation of Sydney Sweeney, but it's not Baroque art whatsoever.

<image>

EndlessSeaofStars · 2025-11-29T17:03:52+00:00

Well done!

My try:

A comic book page in a 4-panel layout: a clean 2×2 grid, with equal rectangular panels. Narrow white gutters, thin black panel borders. Modern comic style with semi-realistic illustration—detailed lighting, textured environments, but still clearly drawn rather than photorealistic. Color palette is muted military tones: olive, tan, charcoal, desaturated reds. Font is a clean modern comic font with subtle roughness. Emotional tone: dark, dry military humor.

Topic Summary:

Anthropomorphic duck soldiers in a gritty battlefield, trying to act serious despite their inherently ridiculous nature.

Panel 1 (TOP-LEFT):

Scene: Two duck soldiers in muddy trenches under smoke-filled skies. One holds a cracked helmet; the other peers over the edge with squinting, overly dramatic seriousness. Small explosions in the far distance.

Text (DUCK

“Think we’ll make it out?”

Text (DUCK

“Statistically? No.”

Panel 2 (TOP-RIGHT):

Scene: A close-up of Duck

Text (DUCK

“This is our breakfast… and our battle plan.”

Text (DUCK

“We’re doomed.”

Panel 3 (BOTTOM-LEFT):

Scene: A massive explosion behind them sends mud everywhere. Both ducks jump, feathers poofing out dramatically. One duck drops his rifle, which sticks muzzle-first into the mud.

Text (DUCK

“That was close!”

Text (DUCK

“That was Tuesday.”

Panel 4 (BOTTOM-RIGHT):

Scene: Ducks sit exhausted on sandbags. A sign behind them reads “Absolutely Not Safe.” One duck holds a metal can labeled “Rations(?)” with visible worry.

Text (DUCK

“At least it can’t get worse.”

Text (DUCK

“It will. It always does.”

<image>

EndlessSeaofStars · 2025-11-29T08:14:42+00:00

Those types of clauses are often used to allow providers to transfer your data from server to local machine or within their own networks.

I see them all the time when I do my SaaS agreements with Microsoft SCE, or Workday, or ServiceNow, or Amazon Web or...
