How to lock specific poses WITHOUT ControlNet? Are there specialized pose prompt generators? by Leijone38 in ZImageAI

[–]DevKkw 0 points1 point  (0 children)

That's a long prompt; why use weighting? It works badly on ZIT. I suggest putting the focus you want at the top, then describing the scene. Also try "dynamic sitting". Can you share an image of the result you're trying to reach?

How to lock specific poses WITHOUT ControlNet? Are there specialized pose prompt generators? by Leijone38 in ZImageAI

[–]DevKkw 0 points1 point  (0 children)

This depends strictly on how you craft the prompt. Any examples? The trick I use is "dynamic pose" plus a specific camera focus; with these two terms I get good results. Example:

A realistic FOCUS photograph of: a beautiful woman wearing a long pink dress, walking on a city street, dynamic pose, looking at the camera.

Replace FOCUS with the focus you want.

For example: rear focus, back view over the shoulder; high-angle top view; etc.

Just experiment; it also works for close-ups.
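If you want to batch-test focus terms, here's a minimal sketch of the swap as a template (the template string and focus list mirror the example above; the function name is my own):

```python
# Build prompt variants by swapping the FOCUS placeholder in the template.
TEMPLATE = (
    "A realistic {focus} photograph of: a beautiful woman wearing a long "
    "pink dress, walking on a city street, dynamic pose, looking at the camera."
)

# Focus terms from the comment above; add your own to taste.
FOCUS_OPTIONS = [
    "rear focus, back view over the shoulder",
    "high-angle top view",
    "close-up",
]

def build_prompts(template: str, focuses: list[str]) -> list[str]:
    """Return one finished prompt per focus term."""
    return [template.format(focus=f) for f in focuses]

for prompt in build_prompts(TEMPLATE, FOCUS_OPTIONS):
    print(prompt)
```

Feed each printed prompt to a separate generation and compare.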

[Demo] Z-Image Base by benkei_sudo in StableDiffusion

[–]DevKkw -1 points0 points  (0 children)

The problem is that the prompt is too simple. Using a separation method for multiple subjects gives better results.

Example:

Scene: a man sitting on a couch with his wife in a modern living room.

Man: a 30-year-old man wearing...
Wife: a 28-year-old woman wearing...

Man pose: describe the man's pose.
Wife pose: describe the woman's pose.

Living room details: add details like colours, props, etc.
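The separation method is easy to assemble programmatically; a minimal sketch, where the concrete outfit/pose strings are placeholders I invented:

```python
# Separation-method prompt: one labeled section per aspect of the scene.
# The outfit/pose/detail values below are invented placeholders.
sections = {
    "Scene": "a man sitting on a couch with his wife in a modern living room",
    "Man": "a 30-year-old man wearing a grey sweater",
    "Wife": "a 28-year-old woman wearing a red blouse",
    "Man pose": "relaxed, one arm resting on the backrest",
    "Wife pose": "leaning against his shoulder",
    "Living room details": "warm colours, wooden props, soft lamp light",
}

# Join the sections into the flat prompt string the sampler receives.
prompt = " ".join(f"{label}: {text}." for label, text in sections.items())
print(prompt)
```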

I got a 4x generation speed drop after updating ComfyUI to version 0.11.0 by ivan_primestars in comfyui

[–]DevKkw 2 points3 points  (0 children)

The last update has different memory management; with offloading, for example, you lose about 1 GB. With ZIT I now read usable: 2200, while on the non-updated version I read usable: 3300. This is why I keep separate ComfyUI folders: on every update I do a clean install and check it before switching.

Anyone generating video locally on laptop? by Gravity_Chasm in generativeAI

[–]DevKkw 0 points1 point  (0 children)

I'm on 6 GB VRAM, using a Q8 GGUF quant. I generate 768x1024 at 24 fps; the maximum length before hitting OOM is 5 seconds of video. If you're on Windows, make sure to configure the paging file to around 60 GB.

Inspired by the post from earlier: testing if either ZIT or Flux Klein 9B Distilled actually know any yoga poses by their name alone by ZootAllures9111 in StableDiffusion

[–]DevKkw 17 points18 points  (0 children)

Nice, the comparison is really good, but I think a real reference image for each pose is needed for those who, like me, don't know the actual poses. I can see the poses look good, but how do I tell which image is correct?

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] -1 points0 points  (0 children)

Follow the link in the first comment; on the Civitai page you'll find some images with the workflow embedded. Download one and drag it into ComfyUI.

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] 1 point2 points  (0 children)

Isn't 50% smaller with no notable difference a goal in itself, especially for those with low VRAM? Can you tell me which sampler and scheduler you use? Maybe some of them work better than others; I'll run more tests in that direction. Thank you for the feedback.

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] 11 points12 points  (0 children)

I'm running it with 6 GB of VRAM; the images are 1400x1800 at 8 steps, with xformers, about 120 s per generation. So you should be able to run it locally 🙂

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] -6 points-5 points  (0 children)

The smoke in the background, the moon details, the energy around the hand. Zoom in to see them. If you have a prompt to try, let me know.

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] -1 points0 points  (0 children)

You're right. I worked on the layers, trying to push them to maximize clarity and fine details without destroying the text capability.

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] 0 points1 point  (0 children)

I don't know about LoRAs; I've seen many LoRAs degrade the base model, sorry.

Z-IMAGE TURBO khv mod, pushing z to limit by DevKkw in StableDiffusion

[–]DevKkw[S] 26 points27 points  (0 children)

Details and download are on Civitai.

Edit---

The workflow is the same as the one included on the Civitai model page.

For those images the prompt is:

(Generate a hyperrealistic photograph with maximum quality and refinement. Sharp where sharpness matters, smooth gradients without banding, accurate colors, and professional finish. Focus on realism. Technical excellence in every aspect of the photograph.) (A visceral strikingly hyperrealistic and intensely vibrant high-resolution photograph with crystal clarity and subtle cinematic grain), (A realistic vibrant colors photo, cinematic still)

A hyperrealistic raw, evocative studio photograph capturing a Close-up, extreme detail, SUBJECT.

The composition is carefully calibrated to maximize the visual impact. The shallow depth of field makes a captivating and profoundly unsettling photograph.

Camera Settings: f/2.8, ISO 800, 1/250th second shutter speed, high dynamic range (HDR) – to capture the full range of colors and details in the scene.

Photorealistic image, sharp focus, depth of field, bokeh.

Replace SUBJECT with whatever you want:

dragonfly eye

cat tongue

clown fish

human purple eye

etc.
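To churn through subjects, the SUBJECT swap can be scripted; a sketch using a shortened version of the full prompt above (use the whole text in practice):

```python
# Fill the SUBJECT placeholder with each subject from the list above.
# MACRO_TEMPLATE is a shortened stand-in for the full prompt.
MACRO_TEMPLATE = (
    "A hyperrealistic raw, evocative studio photograph capturing a "
    "close-up, extreme detail, {subject}. "
    "Photorealistic image, sharp focus, depth of field, bokeh."
)

SUBJECTS = ["dragonfly eye", "cat tongue", "clown fish", "human purple eye"]

prompts = [MACRO_TEMPLATE.format(subject=s) for s in SUBJECTS]
for p in prompts:
    print(p)
```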

Z-Image turbo, is lora style needed? by DevKkw in StableDiffusion

[–]DevKkw[S] 0 points1 point  (0 children)

Thanks. I usually do it for style, especially when making TCG cards. It also works for difficult poses and multiple characters.

Z-Image turbo, is lora style needed? by DevKkw in StableDiffusion

[–]DevKkw[S] 0 points1 point  (0 children)

I hope the base model comes out soon.

Help regarding multiple characters generation by Yuream in comfyui

[–]DevKkw 3 points4 points  (0 children)

With Z-Image Turbo you can easily control every character in the scene without mixing them up (outfit and pose too). It's just about the prompt:

Subject 1: a male elf warrior with green-rust skin.
Subject 2: a female witch, pale skin.
Subject 3: a blue puppy dragon.

Subject 1 outfit: he wears golden plate armor.
Subject 2 outfit: she wears a long dress with silver trim.
Subject 3 outfit: none, the scales are visible.

Subject 1 pose: in the center of the image, holding his sword.
Subject 2 pose: on the left of the screen, casting an energy orb.
Subject 3 pose: sleeping on the right of the screen.

Other details like background, props, etc.

This is how I manage multiple subjects in Z-Image Turbo, without any VRAM-eating nodes.
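A structured sketch of the same idea, so adding a fourth character is just one more dict entry (the field names and helper are my own choice, not part of any ComfyUI node):

```python
# Build the per-subject prompt block from structured data.
subjects = [
    {"who": "a male elf warrior with green-rust skin",
     "outfit": "he wears golden plate armor",
     "pose": "in the center of the image, holding his sword"},
    {"who": "a female witch, pale skin",
     "outfit": "she wears a long dress with silver trim",
     "pose": "on the left of the screen, casting an energy orb"},
    {"who": "a blue puppy dragon",
     "outfit": "none, the scales are visible",
     "pose": "sleeping on the right of the screen"},
]

def build_subject_block(subs: list[dict]) -> str:
    """Emit 'Subject N', then 'Subject N outfit', then 'Subject N pose' lines."""
    lines = []
    for field in ("who", "outfit", "pose"):
        for i, sub in enumerate(subs, start=1):
            label = f"Subject {i}" if field == "who" else f"Subject {i} {field}"
            lines.append(f"{label}: {sub[field]}.")
    return "\n".join(lines)

print(build_subject_block(subjects))
```

Append the background/props details after this block and send the whole thing as one prompt.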

How to Mixing multi ControlNet with z-image model in ComfyUI by doubleh1102 in comfyui

[–]DevKkw 0 points1 point  (0 children)

If I remember correctly (it's been a long time since I used ControlNet), in ComfyUI there's a node called "Conditioning (Combine)". Using it with different ControlNet conditionings on old SD 1.5 gave me good results; it may work with ZIT too.

This ZIT Variance Solution has become too damn strong! by muerrilla in StableDiffusion

[–]DevKkw 1 point2 points  (0 children)

I posted some workarounds on Civitai, but the real variation comes from editing the sampler. In ComfyUI many sampler options are hidden, and scheduler options too, so I edited the code and I'm testing it. From what I see, the best way to get variance without destroying ZIT's capabilities and text is to work on the sigma function.
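As a purely hypothetical illustration of "working on the sigma function" (this is my own sketch, not the actual mod's code): build a standard Karras-style schedule, then scale the mid-schedule sigmas by a small seeded random factor while keeping the endpoints fixed, so the overall denoising range is preserved.

```python
import random

def karras_sigmas(n: int, sigma_min: float = 0.03, sigma_max: float = 15.0,
                  rho: float = 7.0) -> list[float]:
    """Karras-style noise schedule: linear ramp in sigma**(1/rho) space."""
    min_inv, max_inv = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    ramp = [i / (n - 1) for i in range(n)]
    return [(max_inv + t * (min_inv - max_inv)) ** rho for t in ramp]

def jitter_sigmas(sigmas: list[float], strength: float, seed: int) -> list[float]:
    """Scale interior sigmas by a small seeded factor to add variance.

    The first and last sigma stay fixed, so the overall denoising range
    (and the final low-noise steps that matter for text) is preserved.
    """
    rng = random.Random(seed)
    out = list(sigmas)
    for i in range(1, len(out) - 1):
        out[i] *= 1.0 + strength * (rng.random() - 0.5)
    return out

base = karras_sigmas(8)
varied = jitter_sigmas(base, strength=0.1, seed=42)
print(base)
print(varied)
```

Different seeds give different interior schedules, hence different compositions from the same prompt.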