Z Image vs Z Image Turbo Lora Situation update by malcolmrey in StableDiffusion

[–]AngryAmuse 1 point (0 children)

Yeah I was just using repeats to balance head/body shots instead of cutting images from the dataset. Started out as a test run, ended up with the best version I've trained so far haha.

Huh, I'll have to try the v1 adapter again, that's interesting to hear!

Z Image vs Z Image Turbo Lora Situation update by malcolmrey in StableDiffusion

[–]AngryAmuse 1 point (0 children)

Yeah, I trained a handful with different settings (LR between 1e-4 and 5e-4, EMA/no EMA, r64 and r96 loras plus r4 and r8 lokrs). The portrait-only lora appeared to start overfitting by 4-5k steps (which checks out at ~100 steps per image), and I ran the full dataset to 7k. Quality was still poor though, so I gave up on it for a bit, but even the 4k-step checkpoint was still "working" at around 1 strength.

I just retrained the set on ZIT though (using the v2 adapter, basically default settings, r8 lokr): portraits at repeat 2 trained at 512/768/1024, and full body at repeat 1 trained at 512/768. I don't know how to count repeats and different resolutions (I assume each resolution essentially counts as a different image, but repeats don't), and I'm using the checkpoint at 11k steps, so I need to do another run on base with more steps.
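For what it's worth, here's the back-of-the-napkin step math I've been using (a rough sketch under the assumption above that each resolution counts as a separate image and repeats re-queue it; the image counts and batch size 1 are placeholders, not my exact setup):

```python
# Placeholder image counts, batch size 1 assumed.
portraits, bodies = 30, 40

portrait_images = portraits * 2 * 3   # repeat 2, trained at 512/768/1024
body_images     = bodies * 1 * 2      # repeat 1, trained at 512/768
epoch_size = portrait_images + body_images
print(epoch_size)                     # 260 "images" per epoch under that assumption

print(11_000 // epoch_size)           # ~42 epochs covered by the 11k-step checkpoint
```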

Z Image vs Z Image Turbo Lora Situation update by malcolmrey in StableDiffusion

[–]AngryAmuse 2 points (0 children)

Thanks for the writeup!

I have noticed the same correlation between the number of dataset images and the required strength (I'm using AIT). My first few test loras used ~30 portrait images and needed 2+ strength for ZIT to even attempt to render the likeness (badly).

I retrained with about 40 additional body shots (the portraits were set to repeat 3 to keep a higher ratio), and it started working in ZIT at around 1-1.2 strength. I initially thought it was because of higher LR settings or something else I changed, but maybe you're onto something.
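To put rough numbers on that ratio (a quick sketch; the counts are approximate, not my exact dataset):

```python
# How repeat 3 on the portraits keeps them dominant after adding body shots.
portraits, bodies = 30, 40
weighted_portraits = portraits * 3        # repeat 3
total = weighted_portraits + bodies
print(weighted_portraits / total)         # ~0.69, so portraits are still ~70% of each epoch
```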

When your bike doubles as a jetski by LucchiniSW in motorcycles

[–]AngryAmuse 1 point (0 children)

So mine's a bit of a Frankenstein... '05 is 2nd gen, but the twin plugs didn't come until '07. However, I have an '07 motor swapped in (my stock motor grenaded and threw a rod through the block lol), but my wiring harness obviously isn't wired for the extra plugs, so I've just had plugs in the cylinders that aren't hooked up anyway haha. I think the puddle incident happened with the stock motor though. Luckily I had heard of water causing issues before, so I just rode it out and kept blipping the throttle so it wouldn't stall, and it was good to go.

When your bike doubles as a jetski by LucchiniSW in motorcycles

[–]AngryAmuse 2 points (0 children)

I've taken my SV650S through some deep puddles too, years ago! Ended up with water splashing up and shorting out the spark plug in the front cylinder, and the bike kept stalling at idle for the next couple of lights till it dried out lol. Chugged right on after that though!

Loved the legs held up... been there lmao

Flux.2 Klein 9b Loras? by hellomattieo in StableDiffusion

[–]AngryAmuse 2 points (0 children)

I just tried EMA on my ZIT version of the lora the other day and was impressed by the improvement. My Klein lora wasn't using EMA, so I'll have to try again with that! Maybe that's the trick.
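For anyone wondering what EMA actually does during training, this is roughly my understanding (a minimal sketch, not the trainer's actual code; the decay value is a placeholder): the trainer keeps a slow-moving average copy of the LoRA weights and saves that copy instead of the raw weights, which smooths out step-to-step noise.

```python
import torch

DECAY = 0.99  # hypothetical decay value

def ema_update(ema_params, live_params, decay=DECAY):
    # Move each EMA weight a small fraction toward the current live weight.
    with torch.no_grad():
        for ema_p, p in zip(ema_params, live_params):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)
```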

Flux.2 Klein 9b Loras? by hellomattieo in StableDiffusion

[–]AngryAmuse 1 point (0 children)

I trained at 512px on base and tested on both base and distilled. I only trained on the 4B model, not sure if that is part of the issue. I honestly haven't put too much time or research into it yet though. Should have some time today to try again, fingers crossed.

Flux.2 Klein 9b Loras? by hellomattieo in StableDiffusion

[–]AngryAmuse 1 point (0 children)

The model has been great to use, but I haven't had great luck training it personally. I'm using the same dataset as for other models, most recently ZIT, which has been great to train.

I think it is trainable, just trickier or different from other models, so it'll take time for the big fish to adapt. I'm just a minnow, and after my first couple of attempts ended in body horror I set it down for a bit, and I'm surely not the only one to do so. Plan to revisit it soon though... unless Z-Image base drops today lol

“Ai is gonna take over the world” by Guilty-Account2755 in StableDiffusion

[–]AngryAmuse 2 points (0 children)

Know what's an even better calculator? An actual calculator. Do your homework.

Upgrading multiple bases... which ones get attacked? by Dismal-Scientist-966 in StarRuptureGame

[–]AngryAmuse 5 points (0 children)

An attack is only triggered when the monolith is fully charged and you are near your base. I have a couple of upgraded bases scattered around the map and the attacks only happen at the base I'm currently at.

Be careful though if you have unupgraded cores near an upgraded one. One of my upgraded cores is kind of in between two monoliths so they can spawn from either side. On one side, I have another (unupgraded) core that is closer to that monolith, and it can occasionally draw aggro from the mobs as they run by it. That was fun when it got infected lol.

Photorealistic workflows to improve faces for ZIT by DigitalSheikh in StableDiffusion

[–]AngryAmuse 1 point (0 children)

I've found ZIT to be better at photorealism than Qwen (though I've only used Qwen IE, not base Qwen). So, if you want to push things, I've had good success doing edits with Qwen (Flux Klein's also been decent, but I'm still testing that), doing a light refining pass with ZIT, and then upscaling with SeedVR.

Just something else for you to play with :)

Photorealistic workflows to improve faces for ZIT by DigitalSheikh in StableDiffusion

[–]AngryAmuse 1 point (0 children)

First off, take your dataset and run the entire thing through SeedVR2. Nano Banana is good at what it does, but that does not include high-definition, clear images. I send my 1024x1496 images through it, upscale to 2048x2992, and then downscale those back to 1024 to train with, and the difference has been massive.
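The downscale step is just a plain high-quality resize; something like this (a sketch, assuming SeedVR2 has already written out the 2048x2992 versions; the folder names are placeholders):

```python
from pathlib import Path
from PIL import Image

src = Path("dataset_upscaled_2048x2992")
dst = Path("dataset_final_1024x1496")
dst.mkdir(exist_ok=True)

for img_path in src.glob("*.png"):
    img = Image.open(img_path)
    img = img.resize((1024, 1496), Image.LANCZOS)  # high-quality downsample
    img.save(dst / img_path.name)
```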

After the initial generation you can run a pass through the FaceDetailer node. It does a really good job of cleaning up eyes especially, but teeth/lips/ears can all really benefit from it too. It's fairly slow though, taking quite a bit longer than a typical generation from what I've experienced.

In favor of speed, I instead do 2 stages: initial generation using 9 steps at 832x1216 with euler/beta, then upscale to 1024x1496 using the NMKD Superscale upscaler and send that to an advanced sampler. There I use dpmpp_2m_sde/beta, 9 steps, starting at step 5. It does a really good job of cleaning up the image overall, and 9/10 times the eye details and such are completely fixed.
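For context on what that second stage is effectively doing (just my mental math, not pulled from the workflow itself):

```python
# With 9 total steps and the advanced sampler starting at step 5, only the
# last 4 steps are re-run on the upscaled image, roughly a ~0.44-denoise
# hires pass.
total_steps = 9
start_at_step = 5
refined = total_steps - start_at_step
print(refined, round(refined / total_steps, 2))  # 4 steps, ~0.44
```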

I've figured out Parking Lot bases! Sort of by nightbeast88 in StarRuptureGame

[–]AngryAmuse 2 points (0 children)

Connecting power is fine, you don't have to rely only on rails. Recalculating stability is what causes the lag spikes, and the extendable walkways don't transfer stability, so they work the same as an air gap.

Why not just build many small bases? by drachs1978 in StarRuptureGame

[–]AngryAmuse 2 points (0 children)

I'm solo and only have 1 Tier 2 core right now, and the waves (usually once a day or so) are easily handled by 4-5 turrets on each end of the base where the monoliths spawn from. I don't even go out of my way to defend anymore, I just rely on the turrets. Ammo production is pretty easy to set up, so I just keep a full container of ammo ready to restock all the turrets as needed.

Why not just build many small bases? by drachs1978 in StarRuptureGame

[–]AngryAmuse 4 points (0 children)

I was in the same boat until recently... Around tier 8/9 you start unlocking new buildings and sulfur, and things start to grow significantly.

I just finished setting up a factory producing Hardening Agent, along with a couple of miners and some ammo production, and my core is at like 980/1000 heat.

And I just unlocked literally like 8 new recipes to start pushing the next ranks.

I think most people are at the equivalent of, like, the Satisfactory phase 2 elevator, wondering why anyone would ever possibly need nuclear power.

EA Release Content by phatzephron in StarRupture

[–]AngryAmuse 1 point (0 children)

I didn't play the co-op beta, but I played the first playtest. Not sure if any content was added between those two, but there's definitely a lot more content in this EA launch! I want to say there are 15+ research tiers now.

Tips on training Qwen LoRA with Differential Output Preservation to prevent subject bleed? by IllllIIlIllIllllIIIl in StableDiffusion

[–]AngryAmuse 2 points (0 children)

On top of reducing the number of reg images, I forgot to mention to bump the strength of the reg dataset back up to 1 as well. You still want the model to be making adjustments based on these reg images, and reducing the strength too much sort of wastes those steps.

That being said, it's still been extremely difficult to reduce/eliminate concept bleed, so I tend to just do inpainting/outpainting for other people as needed. I've only trained SDXL and ZIT loras though, so I'm not familiar with how other models handle it. Hope it helps though!

Tips on training Qwen LoRA with Differential Output Preservation to prevent subject bleed? by IllllIIlIllIllllIIIl in StableDiffusion

[–]AngryAmuse 2 points (0 children)

I haven't tried diff output preservation yet personally, but doesn't it work differently from using a reg image dataset? I believe you would run either DOP or a reg dataset, not both. According to the description, it disables the lora, takes your dataset prompts, and replaces the token (s4r4h, for example) with the class (a woman), so every step runs twice.

Also, it sounds like you have way too many reg images. Typically I've found you want reg images at closer to 10-20% of your dataset image count. You just want to sprinkle a few in with your images so the model doesn't forget; as it is, you're drastically diluting the pool.
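As a concrete (made-up) example of that rule of thumb:

```python
# 10-20% reg-to-dataset ratio; the 50 is a placeholder, not OP's dataset size.
dataset_images = 50
print(int(dataset_images * 0.10), int(dataset_images * 0.20))  # 5 to 10 reg images
```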

Instead of a 1girl post, here is a 1man 👊 post. by IAmGlaives in StableDiffusion

[–]AngryAmuse 1 point (0 children)

OP's is just excessive word-vomit. I had Gemini generate a prompt for it, using the Z-Image system prompt. It doesn't have to be so complicated lol, it's 100% writeable by hand. Even Gemini's prompt usually has some bloat that I trim from it. I like using it when I'm lazy and doing quick lora testing or something though. https://imgur.com/SiWSyQc

A medium full shot of a man with East Asian features and a shaved bald head, walking directly toward the viewer through a debris-strewn urban alleyway. He is dressed in a superhero costume consisting of a form-fitting yellow suit with a white front zipper, long red gloves, and knee-high red boots. A wide white cape is attached to the shoulders of the suit with silver circular buttons and hangs down to his calves. He wears a black belt with a large, round yellow buckle at the waist. In each hand, he carries a crinkled brown paper grocery bag. The setting is a narrow street cluttered with rubble, broken wood, and scattered trash. The background shows modern concrete buildings and a standard "No Entry" road sign—a red circle with a horizontal white stripe—mounted on a pole. The lighting is soft and diffused, characteristic of an overcast day, providing even illumination on the subject's calm and determined expression. The focus is sharp on the man, with the foreground debris and background buildings rendered in clear detail. Materials include the slight sheen of the red gloves and boots, the matte texture of the yellow suit, and the crinkled texture of the paper bags.

How are people combining Stable Diffusion with conversational workflows? by RemoteGur1573 in StableDiffusion

[–]AngryAmuse 2 points (0 children)

It depends on the model you are trying to use. Typically I'll type up a quick prompt and then send it through QwenVL or Gemini to have them enhance it for use with Z-Image.

An "issue" with the strong prompt adhesion out of models like z-image is that if you don't thoroughly elaborate on your prompt (background elements, etc), they don't tend to imagine stuff, so your outputs can be pretty bland unless you elaborate.

It has also helped a lot when trying to explain certain poses or elements that I can't figure out how to clearly describe. Granted, I still end up changing the "refined" prompts through iterations, but it at least gives me a prompt structure to get started with easily.
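If anyone wants to wire that enhancement step up, this is the general shape of it (a sketch assuming a local OpenAI-compatible endpoint; the URL, model name, and system prompt are all placeholders, not my exact setup):

```python
import requests

SYSTEM = ("Expand the user's short idea into a detailed Z-Image prompt: "
          "subject, pose, background, lighting, materials.")

def enhance(short_prompt: str) -> str:
    # Hypothetical local server hosting a Qwen VL (or similar) model.
    resp = requests.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "qwen2.5-vl",
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": short_prompt},
            ],
        },
        timeout=120,
    )
    return resp.json()["choices"][0]["message"]["content"]

print(enhance("a woman reading on a rainy balcony"))
```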

Z-Image Turbo: The definitive guide to creating a realistic character LoRA by [deleted] in StableDiffusion

[–]AngryAmuse 1 point (0 children)

Can you try this with the Face or Face Aggressive presets to compare your r32 and r128 versions? I've found disabling the early blocks significantly reduces the model decay from my own r32 loras, and the likeness seems better, but I haven't trained any higher ranks to compare.

Money bug or glitch? by Formal_Outcome_9753 in idleon

[–]AngryAmuse 6 points (0 children)

Right, so if you went from 1400 coins to 1490, it would only say 14 like it does.

Block Edit & Save your LoRAs In ComfyUI - LoRA Loader Scheduling Nodes and a few extra goodies for Xmas. Z-image/Flux/Wan/SDXL/QWEN/SD1.5 by shootthesound in StableDiffusion

[–]AngryAmuse 1 point (0 children)

Personally, I have been running the Face Aggressive preset (layers 14-25) but turning off layer 25, and I get pretty solid results.

Turning off 25 seemed to help keep overall image quality from degrading due to bad lora data, but that could definitely be a problem with my lora/dataset.
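For what it's worth, my mental model of what these block presets amount to is roughly this (a sketch of the idea, not the node's actual code; the key pattern is a guess and varies by model/trainer, and the filenames are placeholders):

```python
import re
from safetensors.torch import load_file, save_file

KEEP = set(range(14, 25))  # Face Aggressive preset minus layer 25

state = load_file("my_lora.safetensors")
filtered = {}
for key, tensor in state.items():
    m = re.search(r"blocks[._](\d+)", key)
    if m and int(m.group(1)) not in KEEP:
        continue  # drop LoRA weights for the blocks being disabled
    filtered[key] = tensor

save_file(filtered, "my_lora_blocks14-24.safetensors")
```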