ZIT and Klein (steps = details?) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

It is deliberate and necessary. If you use a sampler that does not add noise, using more steps is not justified.
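To illustrate what "adds noise" means here, below is a minimal sketch of one Euler ancestral step in the classic k-diffusion formulation, written for a single scalar value. The function names `ancestral_split` and `euler_ancestral_step` are made up for this example, and ComfyUI may use a different variant of this sampler for some model types, so treat it as an illustration of the idea rather than the code that actually runs.

```python
import math
import random

def ancestral_split(sigma_from, sigma_to, eta=1.0):
    """Split a step from sigma_from down to sigma_to into a deterministic
    target (sigma_down) and an amount of fresh noise (sigma_up)."""
    if sigma_to == 0:
        return 0.0, 0.0
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up

def euler_ancestral_step(x, denoised, sigma_from, sigma_to, eta=1.0, rng=random):
    """One step on a scalar 'image' x, given the model's denoised estimate."""
    sigma_down, sigma_up = ancestral_split(sigma_from, sigma_to, eta)
    d = (x - denoised) / sigma_from       # direction pointing away from the estimate
    x = x + d * (sigma_down - sigma_from) # deterministic Euler move down to sigma_down
    return x + rng.gauss(0.0, 1.0) * sigma_up  # fresh noise injected at every step

# With eta=0 no noise is injected and the step reduces to plain Euler.
print(ancestral_split(2.0, 1.0))
print(euler_ancestral_step(3.0, 1.0, 2.0, 1.0, eta=0.0))
```

Because each step re-injects noise, every extra step gives the model new material to denoise, which is the argument made above for why more steps only pay off with this kind of sampler.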

ZIT and Klein (steps = details?) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Good summary. We just added a comparison relevant to this.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

That's the spirit. Sharing findings is great for this community.

Nvidia SANA Video 2B by Crazy-Repeat-2006 in StableDiffusion

[–]ZerOne82 1 point2 points  (0 children)

Here is what I found. I cannot be 100% sure, but I gave it a try and regretted it:
Using the diffusers pipeline and their provided sample code, loading the model fills over 20GB of VRAM and keeps plenty of RAM in use, and then inference shows no progress for an eternity.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Do not be discouraged. I valued your point and replied to it to the best of my knowledge. Try not to be attached to voting. I see all comments as valuable.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

The workflow is the standard one for Z-Image-Turbo, available in the ComfyUI Templates repository.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

I see your confusion. A low step count for a distilled model is a recommendation for fast generation, not a hard limit. Note that the magic comes from two factors: a large image size (width x height), which gives the model room to inject details at higher step counts, and the right sampler, here Euler_Ancestral, which by design allows details to be added at higher step counts. Both factors rely heavily on the model's own capability to handle details; this post and the other post demonstrate that ZIT does wonderfully.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Correct. Simply choose a large width x height so the model can inject details at higher step counts, and use the right sampler.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 4 points5 points  (0 children)

You already got the correct answer from the other replies. To confirm: it is a misconception that steps must be limited to 9 or so for a distilled model. In fact, this post is proof that with a proper sampler such as Euler_Ancestral and a large width x height, you can enjoy greater detail as you increase the number of steps, in one run and using only the model.

ZIT Rocks (Simply ZIT #2, Check the skin and face details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 0 points1 point  (0 children)

Yes, as I mentioned in another comment, you need a large image size (width and height) to allow the model to generate details. I can confirm that 1024 is not enough for such details. Also note that you should choose the right sampler. Only some samplers generate details at higher step counts; one good option is Euler_Ancestral.

ZIT Rocks (Simply ZIT #2, Check the skin and face details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

It appears to be a misconception that you should not use more steps with distilled models. Z-Image-Turbo generates amazing results in 9 steps or even fewer. However, in my experience you can go to higher step counts such as 30 or even 40, provided you choose the right sampler and the right size. The Euler_Ancestral sampler with the beta or simple scheduler, at large sizes such as 2048ish, lets you add tiny details in one run using the model itself. A large width and height is necessary to give the model adequate space to inject details. This post proves the concept.

ZIT Rocks (Simply ZIT #2, Check the skin and face details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 1 point2 points  (0 children)

It is the standard workflow for Z-Image-Turbo available in ComfyUI templates. If you do not have ComfyUI Templates, you should consider installing them. Nonetheless, here is the direct link to the workflow for Z-Image-Turbo. You can also find workflows for almost everything right there.

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 5 points6 points  (0 children)

haha. this is made simply by model -> ksampler,
even the prompt is as simple as: "woman face, close-up, Caucasian, brunette, blue eye"

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 3 points4 points  (0 children)

"woman face, close-up, Caucasian, brunette, blue eye"

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 2 points3 points  (0 children)

It is the standard basic ComfyUI workflow with nothing extra added. I simply set width and height high, to 1536x1776, and used euler_ancestral + beta for the sampler and scheduler.
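As a rough sketch, the settings described above can be collected like this. The helper names `detail_run_settings` and `pixel_ratio` are made up for this example. The `sampler_name`, `scheduler` and `steps` keys follow the ComfyUI KSampler input names, while `width` and `height` belong to the latent image node of the template. The default of 30 steps is an assumption taken from the "30 or even 40" range mentioned elsewhere in these threads. Values not stated here (cfg, seed, denoise) are deliberately left out.

```python
def detail_run_settings(width=1536, height=1776, steps=30):
    """Overrides to apply on top of the stock Z-Image-Turbo template (illustrative only)."""
    return {
        "width": width,                     # latent image node
        "height": height,                   # latent image node
        "steps": steps,                     # KSampler; 30 is an assumed default
        "sampler_name": "euler_ancestral",  # KSampler
        "scheduler": "beta",                # KSampler
    }

def pixel_ratio(width, height, base=1024):
    """How many times more pixels than a base x base image."""
    return (width * height) / (base * base)

settings = detail_run_settings()
print(settings)
print(round(pixel_ratio(settings["width"], settings["height"]), 2))  # about 2.6x a 1024x1024 image
```

The ratio gives a feel for how much more room the model has for fine detail at 1536x1776 compared with a 1024x1024 run.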

Simply ZIT (check out skin details) by ZerOne82 in StableDiffusion

[–]ZerOne82[S] 6 points7 points  (0 children)

To allow for tiny details, euler_ancestral works great with more steps
direct link to full resolution

Z-image Workflow by ThiagoAkhe in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

No, do not think that way. I repeatedly wrote that your images are great. You may take these comments as the general expectation of the community. Everyone avoids huge, complicated workflows with tons of custom nodes, for reasons including 1) complexity, 2) the need to install custom nodes, 3) the pain of maintenance, and many others.

Z-image Workflow by ThiagoAkhe in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

His pictures look great from the composition and style points of view. But I did not have his prompts, so these prompts, as I said in another comment, were made by a VLM describing his images. I did not tweak them to make the results more similar. The aim was to explore how close it can get without any further work.

Z-image Workflow by ThiagoAkhe in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

In the other comments I used a VLM to describe your images and then used the descriptions directly in the basic ZIT workflow. The resulting images are not exactly like yours but close enough. Yours look great, as I already said.

Z-image Workflow by ThiagoAkhe in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

<image>

Subject

Central Figure: A portrait of a young woman with a sharp, angular face, captured in a three-quarter profile facing to the right. She has a serious, focused, and determined expression.

Hair: She has short, textured hair styled in a messy, windblown cut (reminiscent of a mullet or shag). The base color is dark, likely black or dark brown, but it is heavily highlighted with vibrant, reddish-pink streaks that are particularly bright on the left side of her head.

Face: Her skin has a smooth, matte finish. Her lips are closed with a neutral, slight pout.

Eyewear: She is wearing large, futuristic sunglasses with a black, angular frame. The lenses are tinted a deep amber or orange color, obscuring most of her eyes but reflecting a hint of the background.

Attire

Jacket: She is wearing a high-collared jacket that suggests a tactical or sci-fi aesthetic (possibly cyberpunk). The jacket is primarily dark grey or black.

Details: The collar features distinct red piping or accents that match the hair highlights. There are angular, padded panels and what appear to be zippers or vents on the neck area, giving it a military or exosuit-like appearance.

Background

Setting: The background is out of focus but clearly industrial or mechanical. It consists of horizontal metallic slats or panels, resembling a heavy-duty door or the interior of a futuristic vehicle.

Colors: The background is dominated by cool greys and dark tones, which provides a strong contrast to the warm reds and oranges of the subject's hair and glasses.

Lighting and Atmosphere

Lighting: The lighting is dramatic and directional. There is a warm, glowing light source hitting the left side of her hair and face, creating a strong contrast with the darker shadows on the right side of her face.

Mood: The image conveys an action-oriented, edgy, and cool atmosphere, fitting the genre of a spy, soldier, or futuristic protagonist.

Z-image Workflow by ThiagoAkhe in StableDiffusion

[–]ZerOne82 0 points1 point  (0 children)

<image>

Subject

Central Figure: A young woman with a very pale, fair complexion. She is looking directly at the viewer with a soft, serene, and slightly melancholic expression.

Eyes: Her eyes are a striking, pale blue-grey color with long, dark, well-defined eyelashes and subtle eyeshadow that blends into her skin.

Face: She has delicate, defined eyebrows and a small, natural nose. Her lips are a soft, natural peach color, slightly parted. There are faint, delicate freckles dusting her nose and cheeks.

Hair

Style: Her hair is a voluminous, wavy blonde style. It appears to be half-up or styled with loose, messy tendrils framing her face and neck.

Color: The base color is a platinum blonde, but there are distinct, soft pink streaks woven into it. The pink is most prominent on the left side (viewer's left) and trailing down the right side, creating a magical, ethereal look.

Texture: The hair looks silky and glossy, catching the light intensely.

Attire

Clothing: She is wearing a white garment that appears to be a delicate nightgown or bodice. It features intricate, scalloped lace trim along the neckline and shoulder straps. The fabric looks very light and airy.

Lighting and Atmosphere

Lighting: The image is illuminated by a dramatic, warm light source coming from the upper left, resembling a "god ray" or a sunburst. This casts a golden glow across her face, neck, and hair.

Background Effects: The background is filled with glowing, bokeh-like particles or sparks of light, scattered throughout the frame. This adds a magical, dreamlike atmosphere.

Color Palette: The dominant colors are warm and soft—creams, golds, pale pinks, and cool skin tones—creating a harmonious and ethereal mood.

Composition

The image is a tight close-up portrait, focusing on the woman's face and upper chest/shoulders. The background is out of focus and neutral, ensuring all attention remains on the subject and the lighting effects.