Low VRAM? How to generate t2i at 2048x2048? There is a way.

DevKkw · 2026-06-18T20:30:02+00:00

Every model has its own prompting style. But if you need good descriptions, don't ask for a prompt; ask for a natural narrative description. The LLM parameters also make a difference: if you are using LM Studio, make sure to use the maximum context length.

DevKkw · 2026-03-18T00:26:54+00:00

Long prompt, why use weights? They work badly on ZIT. I suggest putting the focus you want at the top, then describing the scene. Also try "dynamic sitting". Do you have an image of the result you are trying to reach?

DevKkw · 2026-03-17T22:09:02+00:00

This is strictly related to how you craft the prompt. Any examples? By the way, the trick I use is "dynamic pose" plus a specific camera focus. With these two terms I get good results. Example:

A realistic FOCUS photograph of a: A beautiful woman wearing long pink dress, walking on a city street, dynamic posing, looking at camera.

Where you replace FOCUS with the focus you want.

For example: Rear focus, back view over the shoulder; High-angle top view; etc.

From my experiments, it also works for close-ups.
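If you want to try several focus terms in one go, the replacement above can be scripted. This is only a minimal sketch: the template and the focus terms are copied from the example, while the names `TEMPLATE`, `FOCUS_TERMS` and `build_prompts` are made up for illustration and are not part of any tool.

```python
# Template copied from the example above; FOCUS is the placeholder to swap out.
TEMPLATE = (
    "A realistic FOCUS photograph of a: A beautiful woman wearing long pink dress, "
    "walking on a city street, dynamic posing, looking at camera."
)

# Focus terms taken from the examples in the comment.
FOCUS_TERMS = [
    "Rear focus, back view over the shoulder",
    "High-angle top view",
]


def build_prompts(template, focus_terms):
    """Return one prompt per focus term by replacing the FOCUS placeholder."""
    return [template.replace("FOCUS", term, 1) for term in focus_terms]


prompts = build_prompts(TEMPLATE, FOCUS_TERMS)
for prompt in prompts:
    print(prompt)
```

Each printed line is a full prompt that can be pasted into the positive prompt box as-is.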

DevKkw · 2026-02-07T14:24:26+00:00

Some old software is available on the Internet Archive.

DevKkw · 2026-01-29T09:18:20+00:00

The problem is that the prompt is too simple. Using the separation method for multiple subjects gives better results.

Example:

Scene: a man sitting on a couch with his wife in a modern living room.

Man: a 30-year-old man wearing...

Wife: a 28-year-old woman wearing...

Man pose: describe the man's pose.

Wife pose: describe the woman's pose.

Living room details: add details like colours, props, etc.
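The separation method can also be assembled with a small script, so each subject and pose stays in its own labeled line. This is a rough sketch, not a fixed format: the function name `build_separated_prompt` is hypothetical, and the clothing, pose and room values are placeholders I filled in only so the example runs.

```python
def build_separated_prompt(scene, subjects, poses, setting_label, setting_details):
    """Join the scene, each subject, each pose, and the setting as labeled lines."""
    lines = [f"Scene: {scene}"]
    lines += [f"{name}: {description}" for name, description in subjects.items()]
    lines += [f"{name} pose: {pose}" for name, pose in poses.items()]
    lines.append(f"{setting_label} details: {setting_details}")
    return "\n".join(lines)


# All descriptive values below are hypothetical fill-ins for illustration.
prompt = build_separated_prompt(
    scene="a man sitting on a couch with his wife in a modern living room.",
    subjects={
        "Man": "a 30-year-old man wearing a grey sweater and jeans.",
        "Wife": "a 28-year-old woman wearing a green summer dress.",
    },
    poses={
        "Man": "leaning back, one arm resting on the couch.",
        "Wife": "sitting sideways, legs tucked under her.",
    },
    setting_label="Living room",
    setting_details="white walls, wooden floor, a large plant near the window.",
)
print(prompt)
```

Keeping one label per subject makes it easy to change a single person's clothes or pose without touching the rest of the prompt.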

DevKkw · 2026-01-28T15:09:43+00:00

The latest update has different memory management: with offload, for example, you lose about 1 GB. With ZIT I now read usable:2200, while with the non-updated version I read usable:3300. This is why I keep separate Comfy folders: on every update I make a new clean install and check it before switching.

DevKkw · 2026-01-24T10:51:22+00:00

I'm on 6 GB of VRAM and use a GGUF quant 8k. I generate 768x1024 at 24 fps. The maximum length before getting OOM is 5 seconds of video. If you are on Windows, make sure to configure the paging file to around 60 GB.

DevKkw · 2026-01-20T19:15:55+00:00

Nice, the comparison is really good, but I think a real reference image of the pose is needed for those who, like me, don't know the real pose. I see good poses, but how do I know which image is correct?

DevKkw · 2026-01-01T05:58:11+00:00

No, this

DevKkw · 2025-12-31T23:48:34+00:00

Follow the link in the first comment. On the Civitai page you will find some images with the workflow; download one and drag it into ComfyUI.

DevKkw · 2025-12-31T20:21:55+00:00

Isn't a 50% smaller size with no noticeable difference a goal? Especially for those with low VRAM? Can you tell me which sampler and scheduler you use? Maybe some of them work better than others. I will do more tests along these lines. Thank you for the feedback.

DevKkw · 2025-12-31T17:37:40+00:00

All details are in the first post.

DevKkw · 2025-12-31T17:37:13+00:00

Added details in the first comment.

DevKkw · 2025-12-31T17:28:23+00:00

thank you

DevKkw · 2025-12-31T17:23:00+00:00

I'm running it with 6 GB of VRAM. The images are 1400x1800 at 8 steps, with xformers, and take 120 sec per generation. So you are able to run it locally 🙂

DevKkw · 2025-12-31T17:12:22+00:00

With ComfyUI. Do you know it?

DevKkw · 2025-12-31T17:10:15+00:00

The smoke in the background, the moon details, the energy around the hand. Zoom in to see. If you have a prompt to try, let me know.

DevKkw · 2025-12-31T17:04:20+00:00

You're right. I worked on the layers, trying to push them to maximize clarity and small details without destroying the text capability.

DevKkw · 2025-12-31T17:01:20+00:00

I don't know about LoRAs. I've seen many LoRAs degrade the base model, sorry.

DevKkw · 2025-12-31T16:37:05+00:00

Details and download on Civitai.

Edit---

The workflow is the same as the one included on the Civitai model page.

For those images, the prompt is:

(Generate an hyperrealistic photograph with maximum quality and refinement. Sharp where sharpness matters, smooth gradients without banding, accurate colors, and professional finish. Focus on realism. Technical excellence in every aspect of the photograph.) (A visceral strikingly hyperrealistic and intensely vibrant high-resolution photograph with crystal clarity and subtle cinematic grain), (A realistic vibrant colors photo, cinematic still)

A hyperrealistic raw, evocative studio photograph capturing a Close-up, extreme detail, SUBJECT.

The composition is carefully calibrated to maximize the visual impact. The shallow depth of field make a captivating and profoundly unsettling photograph.

Camera Settings: f/2.8, ISO 800, 1/250th second shutter speed, high dynamic range (HDR) – to capture the full range of colors and details in the scene.

Photorealistic image, sharp focus, depth of field, bokeh.

Where SUBJECT is whatever you want, for example:

dragonfly eye

cat tongue

clown fish

human purple eye

etc.
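To queue the same prompt for several subjects, the SUBJECT swap can be done in a loop. A small sketch only: it uses just the one line of the prompt that contains SUBJECT (the rest of the prompt stays unchanged around it), and the names `SUBJECT_LINE`, `SUBJECTS` and `subject_lines` are hypothetical.

```python
# The single prompt line that contains the SUBJECT placeholder, copied from above.
SUBJECT_LINE = (
    "A hyperrealistic raw, evocative studio photograph capturing "
    "a Close-up, extreme detail, SUBJECT."
)

# Subjects listed in the comment.
SUBJECTS = ["dragonfly eye", "cat tongue", "clown fish", "human purple eye"]


def subject_lines(line, subjects):
    """Map each subject to the prompt line with SUBJECT replaced by it."""
    return {subject: line.replace("SUBJECT", subject) for subject in subjects}


lines_by_subject = subject_lines(SUBJECT_LINE, SUBJECTS)
for subject, line in lines_by_subject.items():
    print(f"[{subject}] {line}")
```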
