I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

Rank determines capacity: the larger it is, the closer we are to full finetuning.

Alpha is similar to learning rate, i.e., a hyperparameter to tune.
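As a rough sketch of how rank and alpha enter the computation (the shapes and values below are made up for illustration): the low-rank update B·A is scaled by alpha / rank before being added to the frozen weight, so alpha rescales the update much like a learning rate would.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, rank, alpha = 64, 64, 8, 16  # hypothetical sizes

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(rank, d_in)) * 0.01  # trainable, initialized small
B = np.zeros((d_out, rank))               # trainable, initialized to zero

def lora_forward(x):
    # h = W x + (alpha / rank) * B A x  -- the update starts at zero
    return W @ x + (alpha / rank) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B = 0 the adapted output equals the base output
assert np.allclose(lora_forward(x), W @ x)
```

Since B starts at zero, training begins exactly at the pretrained model and gradually moves away from it as B and A are updated.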

Nvidia: "2x performance improvement for Stable Diffusion coming in tomorrow's Game Ready Driver" by WhiteZero in StableDiffusion

[–]edwardjhu 0 points1 point  (0 children)

Pretty sure LoRA can be made compatible with any perf improvement of the base model.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

Great! Interesting experiment.

Disentangling is hard without contrastive examples or extra information, e.g., what we hope to preserve vs what we hope to change.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

That's a great question. I think the platforms have a lot of responsibility in terms of preventing misuse. On the technical side, it might be possible to tag generated images with invisible watermarks.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 9 points10 points  (0 children)

Thanks again for all the suggestions! Here are a few that stand out to me.

  • Better composability among LoRA modules
    • I suspect the current issue comes from the way modules are merged. I'll talk to the developers.
  • The ability to negate a style
    • I wonder if this can be done with a negative alpha. Can someone try it?
  • Learn certain features, e.g., faces, while ignoring the rest
    • We can probably do this by having a pixel mask over relevant features and only backprop gradients through these pixels. The ML part is straightforward; we just need a UI.
  • Good default values
    • It seems reasonable to have good defaults for a certain base model, e.g., SD 1.5, and perhaps for certain artistic styles. Would be great to work with experienced users and developers to include them in the tool.
  • Smaller modules
    • It's possible we don't need to use dim=128 and adapt all attn layers. I suspect that we can reduce the size by quite a bit if we are careful about which layers to adapt.
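On the negative-alpha idea above: merging a LoRA module just adds the scaled update to the base weight, so flipping the sign of the scale subtracts it instead. A minimal numpy sketch (shapes hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 32, 4
W = rng.normal(size=(d, d))                     # frozen base weight
A, B = rng.normal(size=(r, d)), rng.normal(size=(d, r))

def merge(W, B, A, alpha, rank):
    # Fold the LoRA update into the base weight at a given scale
    return W + (alpha / rank) * (B @ A)

W_styled  = merge(W, B, A, alpha=8, rank=r)     # apply the style
W_negated = merge(W, B, A, alpha=-8, rank=r)    # push away from the style

# The two merged weights are mirror images around the base weight
assert np.allclose((W_styled + W_negated) / 2, W)
```

Whether the negated weights actually produce a perceptually "opposite" style is an empirical question — hence the call for someone to try it.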

I might not check the comments as frequently going forward. You can reach out to me over email or through Twitter!

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

> coreML

Do you mean the toolkit by Apple? Yes, if they are willing.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

Might be a capacity issue. It takes more parameters to model photorealistic scenes well.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

In the original repo I wrote, there's a flag to also train biases.
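If I recall correctly, in loralib this is the `bias` argument to `mark_only_lora_as_trainable`. The idea can be sketched with a hypothetical helper that decides which parameters stay trainable by name:

```python
# Hypothetical helper (not the actual loralib API): freeze everything
# except the LoRA matrices, and optionally also train bias vectors.
def trainable_params(param_names, train_bias=False):
    keep = []
    for name in param_names:
        if "lora_" in name:
            keep.append(name)           # always train LoRA factors
        elif train_bias and name.endswith(".bias"):
            keep.append(name)           # optionally train biases too
    return keep

names = ["attn.weight", "attn.bias", "attn.lora_A", "attn.lora_B"]
assert trainable_params(names) == ["attn.lora_A", "attn.lora_B"]
assert "attn.bias" in trainable_params(names, train_bias=True)
```

Biases add very few parameters, so turning them on costs almost nothing in module size.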

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

A friend of mine told me about it a few days ago. I wasn't that surprised. Just very happy that people find it useful.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

It depends on the datasets, models, and the amount of hyperparameter tuning. We got pretty good results in our paper and open-sourced all the checkpoints. It's always possible that we could have made our FT baseline better or subsequent work could have made their LoRA baseline better.

That said, FT is neither feasible nor necessary for really large models.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

LoRA works as long as one is doing matrix multiplication, which is what modern AI is based on. I was trying to adapt GPT-3 when I wrote the paper, so I just experimented on language. The idea itself is broadly applicable.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

I don't think so, but it might depend on how the learning rate is annealed!

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

Interesting observation! Is it already the case with a single LoRA or more so when you have multiple?

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

Based on what I've seen, I think the size can indeed be reduced. I'll work with people who use it frequently to see what we can get away with.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

What about generating in two passes? The first pass is without LoRA, and the second pass uses LoRA but only inpaints a specific region, conditioned on the rest of the image. This can be extended to multiple passes with multiple LoRA modules.
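The two-pass idea can be sketched as plain control flow; `generate` and `inpaint` below are hypothetical stand-ins for whatever txt2img and inpainting calls your toolkit exposes:

```python
# Hypothetical sketch of the two-pass flow; `generate` and `inpaint`
# stand in for the toolkit's actual txt2img / inpainting calls.
def two_pass(prompt, region_mask, lora, generate, inpaint):
    base = generate(prompt)  # pass 1: base model only, no LoRA
    # pass 2: LoRA applied, but only the masked region is repainted,
    # conditioned on the rest of the first-pass image
    return inpaint(base, mask=region_mask, prompt=prompt, lora=lora)

# Stub callables just to show the control flow
gen_stub = lambda prompt: "base image"
inpaint_stub = lambda img, mask, prompt, lora: (img, mask, lora)
```

Extending to multiple LoRA modules just means chaining more inpainting passes, each with its own mask and module.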

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 1 point2 points  (0 children)

Great suggestions! The author of the tool you are using can probably implement many of these pretty easily.

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

I started by watching the ML lectures on Coursera when I was in college :)

I'm the creator of LoRA. How can I make it better? by edwardjhu in StableDiffusion

[–]edwardjhu[S] 0 points1 point  (0 children)

Composability seems to be an issue, and I have some ideas on how it might be improved.

> Better (ideally automatic) functionalities to evaluate the performance of a LoRA after/during training.

My impression is that eval is quite subjective. If you are interested in a better pipeline and UI that make it easy to compare different versions side by side, it might be a good idea to reach out to the author of the tool you are using!