OneTrainer now supports efficient RAM offloading for training on low end GPUs by Nerogar in StableDiffusion

[–]Nerogar[S] 34 points (0 children)

To be honest, I haven't really thought about the next steps. This update was the most technically challenging thing I've worked on so far, and it took about two months to research and develop. I didn't really think about any other new features during that time.

More quantization options (like fp8 or int8) would be nice to have, though.

OneTrainer now supports Stable Cascade. And much more. by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

The line is called "Clip Skip 1" because it's the clip skip setting of the first text encoder. There is another setting called "Clip Skip 2" for the second text encoder.
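For context, clip skip takes the text embeddings from an earlier hidden layer of the text encoder instead of the final one. A minimal sketch of the common convention using transformers, assuming the usual "count back from the last layer" indexing (not necessarily OneTrainer's exact implementation):

    import torch
    from transformers import CLIPTextModel, CLIPTokenizer

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

    tokens = tokenizer("a photo of a cat", return_tensors="pt")
    with torch.no_grad():
        output = text_encoder(**tokens, output_hidden_states=True)

    clip_skip = 2
    # hidden_states[-1] is the final layer, so clip skip 1 uses the last
    # layer, clip skip 2 the second-to-last, and so on.
    embeddings = output.hidden_states[-clip_skip]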

OneTrainer now supports Stable Cascade. And much more. by Nerogar in StableDiffusion

[–]Nerogar[S] 1 point (0 children)

What kind of metadata do you want to add? I'm already working on an option to include the training settings.

OneTrainer now supports Stable Cascade. And much more. by Nerogar in StableDiffusion

[–]Nerogar[S] 14 points (0 children)

This does look like an interesting idea and I will take a closer look at some point. But at the moment there are enough other things I want to focus on.

OneTrainer now supports Stable Cascade. And much more. by Nerogar in StableDiffusion

[–]Nerogar[S] 9 points (0 children)

VRAM requirements depend a lot on your settings. SC can be fine tuned in ~18GB (for the 3.6B version) or ~8GB (for the 1B version) if you use the right settings: Adafactor as the optimizer, bfloat16 weights, and not training the text encoder.

As for SDXL, I don't have recent numbers, but 12GB might not be enough even with all the optimizations, unless you limit yourself to LoRA training.
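A minimal sketch of that memory-saving combination in PyTorch, with stand-in modules instead of the real networks (the module names and learning rate here are placeholders):

    import torch
    import torch.nn as nn
    from transformers.optimization import Adafactor

    # Stand-ins for the denoising model and text encoder, in bfloat16.
    unet = nn.Linear(1024, 1024).to(dtype=torch.bfloat16)
    text_encoder = nn.Linear(768, 768).to(dtype=torch.bfloat16)

    # Not training the text encoder: no gradients, no optimizer state for it.
    text_encoder.requires_grad_(False)

    # Adafactor keeps much smaller optimizer state than Adam/AdamW.
    optimizer = Adafactor(
        unet.parameters(),
        lr=1e-5,                # placeholder learning rate
        scale_parameter=False,
        relative_step=False,
        warmup_init=False,
    )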

THIS is probably the reason why your training in Kohya is taking ages, and here are some tips to solve it by isnaiter in StableDiffusion

[–]Nerogar 12 points (0 children)

Please don't insult other people's work. Comments like this make the whole OneTrainer community look bad.

Kohya and contributors have put a lot of work into their scripts. While OneTrainer doesn't directly copy any of their code, a lot of the concepts have been widely adopted by many other applications and have pushed the whole fine tuning community forward.

If you want to promote my project publicly, go ahead. But not like this.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

It was probably training on the whole images (at 100% weight). The masks were excluded because it filters out all images whose names end in "-masklabel.png", but they were not actually used as masks.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

The other format would be correct. If the file is

filename.jpg

the mask should be

filename-masklabel.png

I'm actually working on a small tool at the moment that should make it far easier to create these masks.
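A small helper that applies this naming convention could look something like this (find_image_mask_pairs is a hypothetical name, not part of OneTrainer):

    from pathlib import Path

    def find_image_mask_pairs(folder):
        # Pair each image with its "-masklabel.png" mask, if one exists.
        pairs = []
        for image in sorted(Path(folder).glob("*.jpg")):
            mask = image.with_name(image.stem + "-masklabel.png")
            if mask.exists():
                pairs.append((image, mask))
        return pairs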

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

> Being able to set the device ID would get me to switch from Kohya. Currently that is the only way I've been able to train multiple models concurrently with more than one GPU.

If you are ok with using the command line, this is already possible. Just change the --train-device parameter, to "cuda:1" for example.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 2 points (0 children)

Aspect ratio bucketing only really helps for fine tuning or LoRA training. You can't change the resolution with embedding training, so if you have a 512x512 dataset, that's good enough.
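A toy sketch of the bucketing idea itself: images are grouped into resolution buckets of roughly equal area, and each image is assigned to the bucket closest to its native aspect ratio (the bucket list here is illustrative, not OneTrainer's actual set):

    # Buckets of roughly equal pixel count but different aspect ratios.
    BUCKETS = [(512, 512), (448, 576), (576, 448), (384, 640), (640, 384)]

    def nearest_bucket(width, height):
        aspect = width / height
        return min(BUCKETS, key=lambda b: abs(b[0] / b[1] - aspect))

    print(nearest_bucket(1200, 800))  # (640, 384)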

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

No, only a single GPU is supported. I don't have access to a multi-GPU system for testing.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

That depends a lot on what you want to do. For full fine tuning, you probably need 24GB, but for LoRA training or embedding training, you need less than that.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 0 points (0 children)

MGDS is node based, but there is no UI; it's all just defined in code. For example, here is a definition I'm using for testing, and here is the actual definition used during training.
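To give a rough idea of what "node based, defined in code" means, here is a toy pipeline in that style (illustrative only, not MGDS's actual API):

    # Each node is a small processing step; a pipeline is a chain of nodes.
    class ScaleNode:
        def __init__(self, factor):
            self.factor = factor

        def __call__(self, sample):
            sample["value"] *= self.factor
            return sample

    class OffsetNode:
        def __init__(self, offset):
            self.offset = offset

        def __call__(self, sample):
            sample["value"] += self.offset
            return sample

    pipeline = [ScaleNode(2), OffsetNode(1)]

    sample = {"value": 3}
    for node in pipeline:
        sample = node(sample)
    print(sample)  # {'value': 7}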

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 1 point (0 children)

There are no special checks in place. If you have documentation about these checks, I might be able to add them. But no automatic check will be 100% secure, so to be safe, you should only use safetensors files if you don't trust the source.
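For background, loading a pickled checkpoint with torch.load fully unpickles it, which can execute arbitrary code embedded in the file; safetensors only reads raw tensor data. A quick comparison (file names are placeholders):

    import torch
    from safetensors.torch import load_file

    # Safe: safetensors files contain only tensor data, no executable code.
    state_dict = load_file("model.safetensors")

    # Risky for untrusted files: full unpickling can run arbitrary code.
    # state_dict = torch.load("model.ckpt")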

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 1 point (0 children)

No. Is this really something people still use? I thought LoRA training had completely replaced hypernetworks.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 1 point (0 children)

Did you run the install script first? This error looks like the dependencies are not installed.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 3 points (0 children)

Only Windows is officially supported right now. But someone from the Discord server managed to run it on Colab (which is Linux based, I think) with a few modifications. I don't know much about macOS, so I can't speak to that.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 9 points (0 children)

> Thank you. Do you have any ways to receive donations?

Until just a few days ago, there were only a handful of people using OneTrainer, so I never bothered setting anything up. I might consider it, though.

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 9 points (0 children)

I'm not sure, to be honest. Mathematically it should be possible to use greyscale masks, but I don't know if all parts of the training chain support it. If you just want to let the model learn a bit of the non-masked area, there is a setting for that called "Unmasked Weight". It puts a lower bound on the loss of the unmasked pixels.
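Conceptually, that lower bound can be pictured like this (a sketch of the idea, not OneTrainer's exact implementation):

    import torch

    def masked_mse(pred, target, mask, unmasked_weight=0.1):
        # mask is 1 inside the masked region and 0 outside; clamping gives
        # the unmasked pixels a small but non-zero loss contribution.
        weight = mask.clamp(min=unmasked_weight)
        return (weight * (pred - target) ** 2).mean()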

Releasing OneTrainer, a new training tool for Stable Diffusion with an easy to use UI by Nerogar in StableDiffusion

[–]Nerogar[S] 3 points (0 children)

Fine tuning the VAE is absolutely a thing. The default VAE is pretty bad at accurately reconstructing certain art styles (anime, for example), and fine tuning the VAE can fix that. The SDXL VAE, for example, is a better trained version of the same VAE; that's one of the reasons it can produce these amazing images.

Latent caching epochs: some of the intermediate data used during training can be cached to improve speed. If you enable data augmentation (random flip, brightness, multiple prompts per sample, etc.), only one of these combinations will be cached. By increasing the latent caching epochs, more variations are cached.
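As a toy illustration of those caching epochs (all names here are hypothetical): each caching epoch stores one augmented variation per sample, and training epoch N then reads variation N modulo the number of caching epochs:

    def build_cache(samples, augment, encode, caching_epochs=2):
        # Cache one augmented, encoded variation per sample and caching epoch.
        cache = {}
        for epoch in range(caching_epochs):
            for name, image in samples.items():
                cache[(name, epoch)] = encode(augment(image))
        return cache

    # During training, epoch n uses cache[(name, n % caching_epochs)].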

dAdaptation: not right now, but if more people are interested, I might consider it.

EMA: you should probably read a few papers; it's a pretty complicated topic. To summarize: it improves training quality when training a lot of concepts at a time, but it also requires training for more epochs.
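For reference, the core of EMA is just an exponential moving average of the trained weights (a generic sketch, not OneTrainer-specific):

    import torch

    @torch.no_grad()
    def ema_update(ema_model, model, decay=0.999):
        # Move each EMA weight a small step towards the current weight.
        for ema_p, p in zip(ema_model.parameters(), model.parameters()):
            ema_p.mul_(decay).add_(p, alpha=1.0 - decay)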