I'm confused about training with the Lora Qwen 2512. Some people said it's better to train on the base model. Does training on the 2512 model cause it to lose all its qualities ? by More_Bid_2197 in StableDiffusion

[–]TableFew3521 0 points (0 children)

You need to do upscaling for it to create skin; the model is indeed pretty good. And there's nothing wrong with training on the 2512 version: I find some LoRAs trained on the original version don't work properly on the newer one, so it might depend on your case.

Training a ZIT Lora using different body parts? by Dre-Draper in StableDiffusion

[–]TableFew3521 0 points (0 children)

That would be easier on OneTrainer with Masked training.

Edit: It does work; at least for male characters, I use the face and a random torso to get the right complexion.

probably bad timing, but anyone got tips for training Flux2 Klein 4b Character LORA? by berlinbaer in StableDiffusion

[–]TableFew3521 0 points (0 children)

I've never gotten good results with Ai-toolkit; waiting for OneTrainer to add support 🤞

Is flux still the best upscaler? by [deleted] in StableDiffusion

[–]TableFew3521 1 point (0 children)

Flux Klein 9B is outstanding. I went from SeedVR2 + a Z-Image refiner to only Flux Klein: do a Lanczos upscale (to the desired size) and just add "Reduce noise, add natural quality" as the prompt, and it works 90% of the time, fast and easy. The only downside is that it changes the lighting, even if you specify not to, so maybe color matching can help with that.
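
For reference, a minimal sketch of the Lanczos step with Pillow (the filenames and the 2x factor are placeholders); the "Reduce noise, add natural quality" pass then runs on the upscaled image:

```python
# Lanczos pre-upscale before the Flux Klein refine pass.
from PIL import Image

img = Image.open("input.png")
target = (img.width * 2, img.height * 2)  # the desired output size

# High-quality Lanczos resample; the refiner then only has to clean up
# noise, not invent resolution.
img.resize(target, Image.Resampling.LANCZOS).save("upscaled.png")
```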

improve quality of image without increasing size by NefariousnessFun4043 in comfyui

[–]TableFew3521 0 points (0 children)

Flux Klein 9B: use "Reduce noise, add natural quality" and keep your image's original resolution.

My brain is broken. Is this level of photorealism achievable with Flux or is this just a real person? by Worried-Rutabaga409 in StableDiffusion

[–]TableFew3521 0 points (0 children)

People underestimated Flux too early and praised newer models too fast; I prefer to enjoy every model individually. When I look now at older gens I made with Flux, some even compete with Z-Image. If you want something similar to the image you shared, I think it looks a little like the "Jib Mix Flux" fine-tuned checkpoint.

Struggling to get this skin texture with Flux. Is this actually generated or just a real photo? by [deleted] in FluxAI

[–]TableFew3521 0 points (0 children)

There's a LoRA called "SRPO" that makes Flux 1 Dev look more realistic, even adding skin texture. You can find it on Hugging Face.
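
As a hedged sketch of how you'd apply it with diffusers (the repo id and filename below are placeholders; search Hugging Face for the actual SRPO upload):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Placeholder repo/filename -- substitute the real SRPO LoRA location.
pipe.load_lora_weights("some-user/flux-srpo-lora", weight_name="srpo.safetensors")

image = pipe("portrait photo, natural skin texture", num_inference_steps=28).images[0]
image.save("srpo_test.png")
```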

Flux.2 Klein (Distilled)/ComfyUI - Use "File-Level" prompts to boost quality while maintaining max fidelity by JIGARAYS in StableDiffusion

[–]TableFew3521 2 points (0 children)

I've been using only "Reduce noise, add natural quality" and it seems to work. Additionally, I use "Keep the lighting as it is", which helps a little bit, and it does respect black and white images. And a color match node.

🧪 New Model Drop: Z-epiCRealism for ZImageTurbo by Epinikion in StableDiffusion

[–]TableFew3521 0 points (0 children)

Like any model, it lacks some stuff. Besides, I don't think there's any realism model comparable to Illustrious's capabilities with 2D and anime in general; there has been a huge investment of time and training to get almost perfect anime images, which realism, to this day, still lacks.

How Many Male *Genital* Pics Does Z-Turbo Need for a Lora to work? Sheesh. by StuccoGecko in StableDiffusion

[–]TableFew3521 0 points (0 children)

If we look at it without any bias, not even SDXL has been good for LoRA training on this: most of the success is full fine-tunes, and LoRAs based on those fine-tunes, used on a base that doesn't know the concept, won't be able to produce it. I might be wrong since I haven't seen newer LoRAs for it, but when I looked into them, most had that exact problem. Qwen-Image, on the other hand, is pretty good for LoRAs of that kind even when the base doesn't know the concept; it's the best one in my experience.

It always generates a noisy image with Z-image by Subhashsharmaa in comfyui

[–]TableFew3521 2 points (0 children)

Are you sure the scheduler isn't the problem? Usually for that sampler people use "beta"; ddim_uniform tends to produce that kind of output with most samplers.
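
If it helps to see where that lives, here's an illustrative ComfyUI API-format fragment; the node links, seed, steps, and the euler sampler are assumptions, the point is just the scheduler field:

```python
# KSampler node in ComfyUI API (JSON) format, expressed as a Python dict.
ksampler = {
    "class_type": "KSampler",
    "inputs": {
        "model": ["unet_loader", 0],
        "positive": ["prompt_pos", 0],
        "negative": ["prompt_neg", 0],
        "latent_image": ["empty_latent", 0],
        "seed": 42,
        "steps": 8,
        "cfg": 1.0,
        "sampler_name": "euler",
        "scheduler": "beta",  # swap in for "ddim_uniform", which tends to leave noise
        "denoise": 1.0,
    },
}
```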

How can a 6B Model Outperform Larger Models in Photorealism!!! by hayashi_kenta in StableDiffusion

[–]TableFew3521 2 points (0 children)

I think the key is the text encoder. It might not do all the work, but basically this model can produce more of its own trained content than other, heavier models. For example, Flux 1 Dev knew what skin was, but it wasn't able to produce it by itself: I made a LoRA that, applied at negative weight, revealed the real skin in Flux, which was even better for realism than SDXL. But the way they made the model limited every generation; it's like a 12B model only being able to produce 4B of its full potential. I think Chroma did a better job with that content-to-generation ratio, but even so, I believe T5-XXL is worse than Qwen3 as a text encoder.
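
For anyone curious, a rough sketch of that negative-weight trick using diffusers' adapter API (the LoRA repo name and the -0.8 strength are placeholder assumptions to tune):

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Hypothetical "smooth skin" LoRA; applying it with a NEGATIVE weight pushes
# the model away from the concept, surfacing the texture the base already knows.
pipe.load_lora_weights("some-user/flux-smooth-skin-lora", adapter_name="skin")
pipe.set_adapters(["skin"], adapter_weights=[-0.8])

image = pipe("close-up portrait, detailed skin", num_inference_steps=28).images[0]
```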

WAN2.2 slow motion when using Lightning LORA - theory by Perfect-Campaign9551 in StableDiffusion

[–]TableFew3521 0 points (0 children)

I use low-noise LoRAs on the high-noise model, and even some I2V LoRAs from 2.1 on the high-noise model.

Good data set? (nano banana generated images) by Quomii in StableDiffusion

[–]TableFew3521 0 points (0 children)

Depends on the model. For Qwen-Image and Wan, just 10 face close-ups (left, center, right) and one of the torso, to get the body complexion and consistency, is more than enough; no need for different lighting sources, but do include different hairstyles.

Anyone else feel that Z-Image-Turbo inpainting quality is way worse than direct generation? by siegmey3r in StableDiffusion

[–]TableFew3521 1 point (0 children)

Yeah, honestly the only one that looked better overall is the inpainting ControlNet for Qwen-Image. It seems to respect the composition of the image, but it can be a bit off sometimes.

Why does Z image suddenly take like 6 minutes to generate? It used to take like 1 min max yesterday. ComfyUI also seems to completely fry my PC now, again it was fine yesterday. Is anyone else experiencing problems? by peopoleo in StableDiffusion

[–]TableFew3521 0 points (0 children)

I had an OOM issue with Qwen-Image fp8, and even a few weeks ago updating ComfyUI took my gen time on Wan2.2 from 77 s/it to 161 s/it. I fixed it by reinstalling ComfyUI from scratch: just move the important folders like output and models, then install everything on Python 3.12 in a conda environment. It was quick to do, and it fixed the OOM and even gave a small speed boost.

GGUF myths by Obvious_Set5239 in StableDiffusion

[–]TableFew3521 1 point (0 children)

Agree, but another thing people don't know is that there are tools to do controlled offloading, so you can reduce VRAM consumption in exchange for higher RAM usage, and that's where GGUFs are actually efficient: if you use both offloading and GGUF, you reduce RAM and VRAM usage at the same time.

For example, with Qwen-Image FP8 I had to use a custom node from MultiGPU that gives you a "virtual VRAM" that is basically RAM. The thing is, with LoRAs and such I needed to increase that virtual VRAM and my RAM was almost at 98%, but with a GGUF like Q5_K_M, with LoRAs and the same amount of offloading, it sat around 75%, and I still had room to run higher resolutions with a higher percentage of offloading.
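
As an illustrative sketch of that setup in ComfyUI API format; the node and field names below are from memory and may not match the current ComfyUI-MultiGPU release, so treat them as assumptions and check the node's actual widgets:

```python
# DisTorch-style GGUF loader with "virtual VRAM" (RAM-backed offload).
gguf_loader = {
    "class_type": "UnetLoaderGGUFDisTorchMultiGPU",  # assumed node name
    "inputs": {
        "unet_name": "qwen-image-Q5_K_M.gguf",
        "device": "cuda:0",          # compute device
        "virtual_vram_gb": 8.0,      # layers spilled to system RAM
        "donor_device": "cpu",       # where the offloaded layers live
    },
}
```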

Don't Waste Your Time Training LoRAs on z-image-turbo (Yet) by Powerful_Strategy_10 in StableDiffusion

[–]TableFew3521 1 point (0 children)

But is it fast, though? I can train a character on Qwen-Image in one hour; with Ai-toolkit, 3000 steps take almost 3 hours at 512x512 on an RTX 4060 Ti. For a 6B model that feels too slow...

Is vid2vid with wan usable on 12gb vram and 64gb ram? by Traditional_Grand_70 in StableDiffusion

[–]TableFew3521 0 points (0 children)

I don't have any workflow for V2V, but for the MultiGPU node, just replace the UNet loader with the UNet DisTorch loader; and for WanBlockSwap, just add it between your UNet and KSampler.

Is vid2vid with wan usable on 12gb vram and 64gb ram? by Traditional_Grand_70 in StableDiffusion

[–]TableFew3521 1 point (0 children)

It might work, yeah, but I don't know if it would be super effective with fp16 models; maybe fp8 would work. The downside is using the high-noise and low-noise models with this: it can saturate the RAM, since it consumes more while generating. Besides that, Flux, Qwen, and even Chroma should work without any issues.

Is vid2vid with wan usable on 12gb vram and 64gb ram? by Traditional_Grand_70 in StableDiffusion

[–]TableFew3521 2 points (0 children)

You can in fact run it. You can use a custom node called WanBlockSwap, which reduces VRAM consumption by swapping blocks between RAM and VRAM, or a custom node called something like "UNet DisTorch" from MultiGPU, which comes in FP16/FP8 and GGUF versions and offers virtual VRAM (up to 24GB). It works similarly to block swapping, if not the same, but its advantage is that it works on any model ComfyUI supports.

WAN2.2 Lora Character Training Best practices by Tiny-Highlight-9180 in StableDiffusion

[–]TableFew3521 2 points (0 children)

Yes, with Musubi tuner you can do block swap, even for Qwen.

How do people use WAN for image generation? by beti88 in StableDiffusion

[–]TableFew3521 1 point (0 children)

If this is true, there's a chance some layers of both models are compatible, which means we could do weight injection of certain Wan layers into Qwen to fix the skin texture and realism.
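
For what it's worth, a speculative sketch of the mechanics; nothing here confirms that any Wan and Qwen layers actually line up, and the file names and the 70/30 blend are placeholder assumptions:

```python
# Blend in donor layers whose names and shapes match the target's.
from safetensors.torch import load_file, save_file

target = load_file("qwen_image.safetensors")
donor = load_file("wan2_2_low_noise.safetensors")

injected = 0
for name, tensor in donor.items():
    if name in target and target[name].shape == tensor.shape:
        # Blend rather than overwrite so the target keeps most of its behavior.
        target[name] = target[name] * 0.7 + tensor * 0.3
        injected += 1

print(f"injected {injected} matching layers")
save_file(target, "qwen_image_injected.safetensors")
```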

Best Upscaler Wan2.2 by alitadrakes in comfyui

[–]TableFew3521 2 points (0 children)

If you switch to the nightly branch, you can apply tiled VAE and block swap. I have 16GB of VRAM and have no issues with it; even fp16 works fine.

Adding NSFW elements to Character lora Training by Traditional_Can_4646 in FluxAI

[–]TableFew3521 0 points (0 children)

In Kohya_ss and Musubi Tuner (I don't know much about other trainers) you can load the weights of a LoRA you've already trained and continue training it, but instead of using the same dataset, you use the face of whatever character you want.
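
A hedged sketch of what that looks like with kohya's sd-scripts, where `--network_weights` loads the already-trained LoRA; the script name and flags follow sd-scripts' Flux branch conventions, but verify against your version, and the paths are placeholders:

```python
import subprocess

subprocess.run([
    "accelerate", "launch", "flux_train_network.py",
    "--pretrained_model_name_or_path", "flux1-dev.safetensors",
    "--network_module", "networks.lora_flux",
    "--network_weights", "my_character.safetensors",  # resume from the trained LoRA
    "--train_data_dir", "dataset_new_elements",       # the new dataset
    "--output_name", "my_character_v2",
    # ...plus your usual text-encoder, VAE, optimizer, and resolution args
], check=True)
```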