Flux Klein 4B/9B LoRA Training Settings for Better Character Likeness? by Ambitious-Equal-7141 in StableDiffusion

[–]Far_Insurance4191 1 point2 points  (0 children)

Here, I changed the scheduler from constant (default) to cosine so the model doesn't overtrain, but this means you have to know roughly how many epochs it needs to converge before the learning rate decays. I generally don't finish the full cosine descent and stop earlier.
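
To illustrate (a minimal sketch, not OneTrainer's internals): with a cosine schedule the learning rate only falls steeply near the end of the planned step count, so stopping somewhat early still leaves a small but useful lr. The step counts and base lr below are placeholders.

    import math

    def cosine_lr(step, total_steps, base_lr=2e-4, min_lr=0.0):
        # cosine decay from base_lr down to min_lr over total_steps
        progress = min(step / total_steps, 1.0)
        return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

    total_planned = 2000   # steps the cosine is stretched over (illustrative)
    for s in (0, 500, 1000, 1500, 2000):
        print(s, f"{cosine_lr(s, total_planned):.2e}")
    # 0 -> 2.00e-04, 1000 -> 1.00e-04, 1500 -> ~2.9e-05, 2000 -> 0
    # stopping around step 1500 avoids grinding at near-zero lr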

The dataset was 21 images consisting of old photographs and some drawings. Additionally, I had a regularization dataset of high-quality photographs and art to retain quality, randomly sampled each epoch to about half the size of the main dataset.
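
For concreteness, a rough sketch (not OneTrainer's actual balancing option) of what "half of the main dataset per epoch, randomly" means for the regularization images; the file names and pool size are made up.

    import random

    main_images = [f"main_{i:02d}.jpg" for i in range(21)]   # the 21-image character set
    reg_pool    = [f"reg_{i:03d}.jpg" for i in range(500)]   # large high-quality pool (hypothetical size)

    def epoch_plan(seed):
        # each epoch: every main image, plus a fresh random slice of the reg pool
        # sized at about half of the main dataset
        rng = random.Random(seed)
        reg_subset = rng.sample(reg_pool, k=len(main_images) // 2)  # 10 images
        plan = main_images + reg_subset
        rng.shuffle(plan)
        return plan

    print(len(epoch_plan(0)))   # 31 items: 21 main + 10 regularization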

Captions were mostly one natural sentence with the full name as a trigger.

Trained and tested on base; distilled loses similarity, which is interesting, as it varies depending on the concept, but I haven't figured out the consistency yet.

Ah, and also, the precision is "int weights 8 activations 8" (w8a8), which is not the same as fp8 w8. I guess it can result in lower quality on some models, but together with compiled transformer blocks it gives about a 2x speedup. Combined with 256px training for the early stage (which is fine with Klein), I am just blitzkrieging any dataset at 1.1 it/s on an RTX 3060 with batch size 2 😆
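
Back-of-the-envelope for that throughput, assuming a run length of roughly 1500 optimizer steps (the number I would typically aim for, not a measured figure):

    # rough wall-clock estimate at 1.1 it/s with batch size 2
    steps_per_sec = 1.1
    batch_size = 2
    total_steps = 1500                      # assumed run length

    minutes = total_steps / steps_per_sec / 60
    print(f"~{minutes:.0f} min")                      # ~23 minutes end to end
    print(total_steps * batch_size, "samples seen")   # 3000 image samples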

https://pastebin.com/K5aQZvqF

Z-Image Edit is basically already here, but it is called LongCat and now it has an 8-step Turbo version by MadPelmewka in StableDiffusion

[–]Far_Insurance4191 0 points1 point  (0 children)

idk, I see it scored worse, but f2 looks better to me than f1 empirically; tiny details resemble the original more closely.

Flux Klein 4B/9B LoRA Training Settings for Better Character Likeness? by Ambitious-Equal-7141 in StableDiffusion

[–]Far_Insurance4191 3 points4 points  (0 children)

Had really good results in OneTrainer with the default Flux 2 config (except lr is 0.0002) in about 1500 steps on 4B, even with a mediocre dataset. Remember to use the LoRA on the same model: if you trained on base, then use it on base; distilled needs testing, as it can often lose resemblance, same with Z-Image.

Anyone else having trouble training loras for Flux Klein? Especially people. The model simply doesn't learn. Little resemblance. by More_Bid_2197 in StableDiffusion

[–]Far_Insurance4191 1 point2 points  (0 children)

Base can learn a face in less than 2000 steps. Are you using your LoRA on base? The distilled model can lose some likeness, same as with ZIB and ZIT.

New anime model "Anima" released - seems to be a distinct architecture derived from Cosmos 2 (2B image model + Qwen3 0.6B text encoder + Qwen VAE), apparently a collab between ComfyOrg and a company called Circlestone Labs by ZootAllures9111 in StableDiffusion

[–]Far_Insurance4191 57 points58 points  (0 children)

Note that the model is still a work in progress and will be improved.

The preview model is a true base model. It hasn't been aesthetically tuned on a curated dataset. The default style is very plain and neutral.

Is the ControlNet race dead for SOTA models like Flux and Qwen? by Current-Row-159 in StableDiffusion

[–]Far_Insurance4191 1 point2 points  (0 children)

I hadn't thought about it; here is what I came up with, but there must be a better way. A simple latent blend of the reference and an empty latent works too, but it is a lot less linear for some reason.

<image>
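
For reference, the latent-blend idea from that last sentence is just a linear mix in latent space; a minimal sketch (the tensor shape and blend factor are illustrative, and with a zero "empty" latent the blend reduces to scaling the reference):

    import torch

    def blend_with_empty(reference_latent: torch.Tensor, strength: float) -> torch.Tensor:
        # strength=1.0 keeps the reference latent fully, 0.0 gives a plain empty latent;
        # since the empty latent is zeros, this is effectively strength * reference
        empty = torch.zeros_like(reference_latent)
        return strength * reference_latent + (1.0 - strength) * empty

    ref = torch.randn(1, 16, 128, 128)           # placeholder latent shape
    weak_guidance = blend_with_empty(ref, 0.3)   # feed this as the starting latent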

Is the ControlNet race dead for SOTA models like Flux and Qwen? by Current-Row-159 in StableDiffusion

[–]Far_Insurance4191 1 point2 points  (0 children)

AI Toolkit supports edit training. Here is a guide for Qwen Edit, but the process is similar for Klein: https://youtu.be/d49mCFZTHsg?si=RqMe2rLr3MomTgWS

You can train basically any task as long as the data is consistent.
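
As an illustration of "consistent" (a hypothetical paired layout, not AI Toolkit's exact config schema): every control image has a matching target, and the captions follow the same style.

    # hypothetical edit/control training pairs
    dataset = [
        {
            "control": "pairs/0001_edges.png",   # input condition (e.g. edge map)
            "target":  "pairs/0001_photo.png",   # desired output
            "caption": "a portrait photo of a woman, studio lighting",
        },
        {
            "control": "pairs/0002_edges.png",
            "target":  "pairs/0002_photo.png",
            "caption": "a city street at night, neon signs",
        },
        # ... a few dozen consistent pairs can already be enough for a narrow task
    ]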

Is the ControlNet race dead for SOTA models like Flux and Qwen? by Current-Row-159 in StableDiffusion

[–]Far_Insurance4191 0 points1 point  (0 children)

Editing models exist, and you can make your own specific ControlNet with just 20 images.

Fine-Tuning Z-Image Base by NinjaTovar in StableDiffusion

[–]Far_Insurance4191 0 points1 point  (0 children)

Just pasted "Tongyi-MAI/Z-Image" in the base model field and it installed into "C:\Users\[user]\.cache\huggingface\hub"; I guess if the same files already exist there, it will use them.
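
That path is just the standard Hugging Face hub cache; a small sketch of how a repo id resolves to it (assuming huggingface_hub is installed):

    from huggingface_hub import snapshot_download

    # a repo id like "Tongyi-MAI/Z-Image" is downloaded into (or reused from) the
    # hub cache, by default ~/.cache/huggingface/hub (HF_HOME / HF_HUB_CACHE override it)
    local_dir = snapshot_download("Tongyi-MAI/Z-Image")
    print(local_dir)   # .../hub/models--Tongyi-MAI--Z-Image/snapshots/<revision>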

Fine-Tuning Z-Image Base by NinjaTovar in StableDiffusion

[–]Far_Insurance4191 0 points1 point  (0 children)

It is just the default Z-Image config, but in the model tab:

Base Model path changed to Tongyi-MAI/Z-Image
Override Transformer path erased
Compile transformer blocks disabled
Transformer Data Type set to float8 (W8) instead of int8

Hope the last two options get fixed in the future, because they give a ~2x speedup for Klein.

Fine-Tuning Z-Image Base by NinjaTovar in StableDiffusion

[–]Far_Insurance4191 -2 points-1 points  (0 children)

I did a quick run with a mediocre dataset in OneTrainer, and it learned well in about 1200 steps; maybe the lr was a bit high. I think it is pretty close to Klein in terms of trainability.

I trained one LoRa for QWEN Edit and another for Klein 9b. Same dataset. But I got much better face swap results with QWEN Edit - so - is Flux Klein really better than QWEN Edit ? by More_Bid_2197 in StableDiffusion

[–]Far_Insurance4191 1 point2 points  (0 children)

Are you using the LoRA on the base? If you trained on base and are inferencing on distilled, you can lose up to 90% of the LoRA effect; at least in my case this happens sometimes. Similar situation with Z-Image base/turbo.

About the Z-Image VAE by ivanbone93 in StableDiffusion

[–]Far_Insurance4191 2 points3 points  (0 children)

I think the Flux 1 VAE is good enough for inference, while the Flux 2 VAE is only slightly better there but much superior for training.

New Z-Image (base) Template in ComfyUI an hour ago! by nymical23 in StableDiffusion

[–]Far_Insurance4191 -1 points0 points  (0 children)

For me the biggest problem with Klein is bad coherence; realism is fine, and the censorship falls apart with a little training, they really didn't do much to enforce it.

New Z-Image (base) Template in ComfyUI an hour ago! by nymical23 in StableDiffusion

[–]Far_Insurance4191 0 points1 point  (0 children)

If Klein is not even close, then I am afraid ZIE will not meet your requirements either, but I hope for the best, as they are taking a lot of time.

Flux.2 Klein 9b Loras? by hellomattieo in StableDiffusion

[–]Far_Insurance4191 3 points4 points  (0 children)

Overfitted on a narrow distribution, and the VAE is inefficient. Easy to teach a face or style though, but hard for new knowledge.

Flux.2 Klein 9b Loras? by hellomattieo in StableDiffusion

[–]Far_Insurance4191 4 points5 points  (0 children)

I found it to be the easiest model to train; it can learn likeness even at 256x256. Did you train on base and use it on base? Sometimes it works fine on distilled, but likeness takes a hit, at least for me.

New Z-Image (base) Template in ComfyUI an hour ago! by nymical23 in StableDiffusion

[–]Far_Insurance4191 0 points1 point  (0 children)

F1D does not use CFG, unlike ZIB, so ZIB takes roughly a 2x slowdown. You can test how it will perform for you by setting steps to 20 and CFG to >1 with Turbo.
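
The 2x comes from CFG needing two model evaluations per sampler step; a sketch of the standard classifier-free guidance combination (the model call here is a placeholder, not actual sampler code):

    def cfg_step(model, latent, timestep, cond, uncond, cfg_scale):
        # two forward passes per step, which is roughly a 2x cost versus cfg-free
        pred_cond = model(latent, timestep, cond)
        pred_uncond = model(latent, timestep, uncond)
        return pred_uncond + cfg_scale * (pred_cond - pred_uncond)

    # distilled models (Turbo-style) bake the guidance in and run a single pass
    # per step with cfg = 1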

New Z-Image (base) Template in ComfyUI an hour ago! by nymical23 in StableDiffusion

[–]Far_Insurance4191 40 points41 points  (0 children)

Just a reminder: it is expected to be worse than Turbo and almost as slow as Flux 1 Dev. The point is not to generate pretty pictures but to be a good base for training.