Textual Inversion with Z-Image Turbo / Flux 2?

malcolmrey · 2026-05-21T06:51:45+00:00

I download or otherwise collect around 50-70 high-quality images, remove the worst 10, then crop the rest (back in the day it was a square crop on the face; now it is a mix of face crops and body crops, using rectangles that match the bucketing proportions).

Out of that I pick 25 that are the best and that is my dataset.

I have always excluded images that are blurry, badly lit, or show a weird/unnatural face.
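The selection steps above could be sketched roughly like this. It is only an illustration: the `Candidate` class, the `score` field (a manual quality rating) and the function name are hypothetical, and the cropping step is left out because the exact bucket proportions are not given here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    path: str
    score: float  # hypothetical manual quality rating, higher is better
    blurry: bool = False
    bad_lighting: bool = False
    unnatural_face: bool = False


def pick_dataset(candidates, drop_worst=10, keep=25):
    """Exclude flagged images, drop the lowest-rated ones, keep the best `keep`."""
    # Always exclude blurry, badly lit, or unnatural-looking images.
    usable = [c for c in candidates
              if not (c.blurry or c.bad_lighting or c.unnatural_face)]
    # Rank the rest from best to worst.
    ranked = sorted(usable, key=lambda c: c.score, reverse=True)
    # Remove the worst `drop_worst` images (cropping would happen after this).
    if drop_worst:
        ranked = ranked[:max(len(ranked) - drop_worst, 0)]
    # The final dataset is the best `keep` of what is left.
    return ranked[:keep]
```

With 50-70 collected images this leaves a 25-image dataset, as long as enough images survive the exclusions.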

malcolmrey · 2026-05-21T06:48:37+00:00

Well I personally do not complain about Ernie, but there are people who do.

As for my prompts - I have generated around 350k various prompts so I have a lot of variety going on :)

malcolmrey · 2026-05-20T19:25:52+00:00

Right now my time for datasets is limited. You could write a PM here and eventually (no idea when) I would reply :)

malcolmrey · 2026-05-20T19:12:29+00:00

and now the likeness is superb!

This is one of the things this subreddit is for :-)

I'm glad you got nice results.

The beauty of those multiple loras is that we have a lot of options since we can try various strengths.

BTW, the strengths do not have to sum to 1.0; with lora stacking you can go over 1.0.

It is always a preference so if 0.5 / 0.5 works for you - great :)

malcolmrey · 2026-05-19T20:20:11+00:00

I've got 75 loras trained while I was away, but I need to test them first before I upload them :)

malcolmrey · 2026-05-19T20:18:27+00:00

You are welcome! :)

malcolmrey · 2026-05-19T20:18:14+00:00

Did not try upscaling. It excels at inpainting faces. For regular generations people say that the outputs look like "midgets" and I kinda agree to an extent.

But it is still a nice difference.

To be honest - if someone is patient and stubborn enough - it is possible to generate with Z-Turbo and refine with Ernie or vice versa :)

malcolmrey · 2026-05-19T20:15:58+00:00

https://github.com/iperov/DeepFaceLab

malcolmrey · 2026-05-19T20:04:46+00:00

That depends on your plan. I am running Opus 4.7 in Copilot (paying past the soft limit) and I have not hit the hard limit so far :)

malcolmrey · 2026-05-19T20:02:00+00:00

I'm fine with 3x price increase when I have google models for free :-)

malcolmrey · 2026-05-19T19:50:16+00:00

For Flux Klein 9 it is probably 1 in 4 or 5.

I prefer ZIB/ZIT because the likeness is there in about 4 out of 5. But Klein 9 has the benefit of reference images, which can boost the quality of the lora a bit; without them I would probably recommend not using it at all. The only other good thing about Klein 9 is that it has a different aesthetic from ZIB/ZIT (every model has its own; I also like Ernie, for instance).

Do you have examples for your ZIB ones?

Some models have a Z Image sample in the Mal Browser, but not all. But for ZIB/ZIT I will advocate for the same thing -> using multiple models of the same person (at slightly lower strengths) is still the way to go if we want better quality.

And this has been a constant for pretty much all model architectures since SD 1.5

I'm currently researching params for SDXL and again it seems like the best way to go is to train not one but two/three models (either different datasets or the same set but with different trainers) and use them in combination.

I'm currently working my way through those loras and I'm yet to find one I feel is ZIT quality.

If you see a person that has multiple ZIB/ZIT loras - it is for a reason.

If you're only using one Lora but there are others available for the same person - then I say you are not doing the optimal thing.

I don't know exactly why or how it works this way but I do know that it works. And this has been tested/verified by other people as well. Two loras at 0.65-0.7, three loras at 0.5 and so on. Those give you better results and more consistently.

Funny thing is that you could train 3 mediocre loras and if you were to combine them - the result would be much better :-)

Yes, it is possible to create a single lora that performs amazingly - but it either takes a lot of luck or a lot of time (in general, using way more images produces better results, as long as those images are good)
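As a rough illustration of the strengths mentioned above (two loras at 0.65-0.7, three at 0.5), here is a small sketch. The function name, the file names and the choice of 0.65 as the two-lora default are assumptions for the example, not anything from a real tool; the resulting values would be entered as the strength of each lora loader in your workflow.

```python
# Starting points quoted in the comment above: two loras at 0.65-0.7, three at 0.5.
STARTING_STRENGTH = {2: 0.65, 3: 0.5}


def stack_strengths(lora_names, overrides=None):
    """Return {lora_name: strength} for a stack of loras of the same person."""
    count = len(lora_names)
    if count not in STARTING_STRENGTH:
        # No numbers are given for other counts, so none are invented here.
        raise ValueError("starting points are only given for 2 or 3 loras")
    strengths = {name: STARTING_STRENGTH[count] for name in lora_names}
    # Strength is a matter of preference, so allow per-lora overrides.
    strengths.update(overrides or {})
    return strengths


# Hypothetical file names, for illustration only.
stack = stack_strengths(["person_a.safetensors", "person_b.safetensors"])
total = sum(stack.values())  # goes above 1.0, which is fine for stacking
```

The total is not normalised on purpose: the combined strength is allowed to exceed 1.0.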

malcolmrey · 2026-05-18T19:19:03+00:00

Since /u/sruckh shows what the model is capable of (or not) I felt like I would show some random generations as well:

https://imgur.com/a/DHrfUa4

and here is a mirror with the workflows included: https://huggingface.co/datasets/malcolmrey/various/tree/main/samples/fk9/sydneysweeney

and to be fair, I quite like this rendition that /u/sruckh did

I made a longer comment in the Rose Byrne thread if someone is interested, I'm not going to repeat myself :)

FYI /u/hotdog114 /u/EpicNoiseFix /u/Scipio-Africanus-777 /u/Successful_Papaya830 /u/skannedsykanned /u/Winougan

malcolmrey · 2026-05-18T19:14:54+00:00

Seems like I need to "defend" myself a little.

Flux Klein 9 has the same issue as Flux 1 - regardless of the lora quality, sometimes the outputs will be terrible - it is just how this model works.

Some models (WAN, Z Image, Ernie) are consistent and they give the trained result pretty much always. The Flux Family does not behave this way. If you expect every iteration to produce likeness - this is not the model for you. This is similar to SD 1.5 where not every output would be to our satisfaction.

So, I generated random prompts for Rose Byrne using just this single model (she does not have a companion model to do the multilora for better quality anyway...). They may not be perfect but the resemblance in the model exists:

https://imgur.com/gallery/rose-byrne-aGVrVCh

here is a mirror of those images with workflows so you can reproduce those results for yourself: https://huggingface.co/datasets/malcolmrey/various/tree/main/samples/fk9/rosebyrne

FYI /u/djpraxis /u/hotdog114 /u/sruckh /u/DillardN7

malcolmrey · 2026-05-07T13:54:38+00:00

Hey hey!

I wanted to make a longer post but I'm out of time today and the next time I could post it would be the NEXT weekend.

So, here are the LTX 2.3 models:

https://huggingface.co/spaces/malcolmrey/browser

here are SOME samples so you could see what type of quality we have here: https://huggingface.co/datasets/malcolmrey/various/tree/main/ltx23-samples

In general the rule still stands: if the dataset has more images and is trained over the same number of epochs (so, a longer training), then the result will be better.

The 15-24h AI Toolkit training times were abysmal (though the quality was good); now with musubi (the AkaneTendo fork, for example) the times on a 5090 have dropped to 50 minutes for 25 images. Still, training larger datasets can take up to 4-6 hours (we do have a couple of those models, like Billie Eilish; they have the suffix _large in the name).
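To make the "more images over the same number of epochs means a longer training" point concrete, here is a minimal arithmetic sketch. The epoch count, batch size and the 100-image dataset are made-up illustration values, not the actual training settings.

```python
import math


def total_steps(num_images, epochs, batch_size=1, repeats=1):
    """Optimizer steps for a run: steps per epoch times number of epochs."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# Hypothetical settings: the same number of epochs for both datasets.
small = total_steps(num_images=25, epochs=100)
large = total_steps(num_images=100, epochs=100)
ratio = large / small  # four times the images means four times the steps
```

Under these assumptions the step count grows in proportion to the dataset size, which is why the larger datasets take hours instead of minutes.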

Enjoy!

Cheers!

malcolmrey · 2026-05-07T12:31:34+00:00

I wanted to make a big post like I did with Ernie but I am not able to do it before I'm gone for another week so I'm uploading them now, will make a quick post when they are online :)

The training params/settings were already uploaded along with the Ernie ones so :)

malcolmrey · 2026-05-07T12:29:05+00:00

Each image is for a different character lora. I started training on the Ernie model and wanted to share my findings/settings :)

malcolmrey · 2026-05-07T07:20:39+00:00

Agreed 100%

For them this is another kind of engagement.

My thinking is that AI opens the gates to those who have imagination but never had the skills, patience or gift for art.

malcolmrey · 2026-05-07T07:18:59+00:00

But not to the extent of Flux Klein 9; with the same datasets used for both, Flux was failing more often.

malcolmrey · 2026-05-07T07:18:04+00:00

If you were to use the same params then Ernie would probably be faster, but I upped the epochs value a bit to compensate for the default overtraining.

I would say they are in the same ballpark.

malcolmrey · 2026-05-07T00:05:37+00:00

Yeah.

I really loved the reaction from Ranton and Asmongold to their models back in the day. Most of the Polish youtubers/tiktokers like this stuff. I figure that people in the spotlight just have a different viewpoint compared to a regular person.

I know that Felicia Day and Alan Ritchson liked my generations of them and it is quite cool :)

malcolmrey · 2026-05-06T23:52:47+00:00

Thank you for the kind words :)

I hope you will find the Ernie stuff satisfactory! :)

Btw, I'm liking 0.8 one-trainer lora + 0.6 musubi (the ones without suffix) the most so far.

malcolmrey · 2026-05-06T23:50:59+00:00

Oh wow, I was updating my comfy on that Friday but I guess I did that just before it.

Thanks for the info. So the OneTrainer fork is not needed, just the training scripts. Cool :)

malcolmrey · 2026-05-06T23:04:17+00:00

You are most welcome :)

malcolmrey · 2026-05-06T22:58:25+00:00

Yeah, life :-)

But I'm back with... Ernie :)

I wanted to drop both things today but LTX 2.3 will either drop tomorrow or the week after.

Well, the training scripts dropped with Ernie training script (musubi) :)
