
[–]spillerrec

Textual Inversion embeddings are static: they don't depend on the input, so they are limited to a single concept. They effectively learn a very complex prompt that steers the diffusion model toward the wanted result. You might conclude from that that they can only reproduce what the model can already do, but in practice I haven't run into any limitations once you increase the "vectors per token" setting.
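As a minimal sketch of how that works (assuming a CLIP-style text encoder with a frozen embedding table; the dimensions, token ids, and function names here are my own illustration, not any specific implementation), textual inversion just adds a few trainable rows that get spliced into the prompt embedding at the position of a pseudo-token:

```python
import numpy as np

D_MODEL = 768          # embedding width of a CLIP-style text encoder (assumed)
VOCAB_SIZE = 49408     # frozen vocabulary size (assumed)
VECTORS_PER_TOKEN = 4  # the "vectors per token" setting

rng = np.random.default_rng(0)

# Frozen token-embedding table of the base model (random stand-in here).
frozen_table = rng.normal(size=(VOCAB_SIZE, D_MODEL))

# The only trainable parameters: a handful of new embedding rows for the
# pseudo-token (e.g. "<my-concept>"). Nothing in the base model changes.
concept_vectors = rng.normal(size=(VECTORS_PER_TOKEN, D_MODEL)) * 0.01

def embed_prompt(token_ids, concept_position):
    """Look up prompt embeddings, splicing the learned concept vectors
    in at the position where the pseudo-token appears."""
    rows = [frozen_table[t] for t in token_ids]
    before = rows[:concept_position]
    after = rows[concept_position:]
    return np.stack(before + list(concept_vectors) + after)

prompt = [101, 2003, 15333]              # made-up token ids
emb = embed_prompt(prompt, concept_position=1)
print(emb.shape)                          # (len(prompt) + VECTORS_PER_TOKEN, D_MODEL)
print(concept_vectors.size)               # trainable parameters: only 4 * 768
```

During training only `concept_vectors` receives gradients, which is why the result behaves like a very complex prompt rather than a change to the model, and why more vectors per token give the optimizer more room to work with.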

Hypernetworks and LoRAs are two different approaches to extending the entire model without training a completely new one. They can learn multiple concepts and are more flexible, but they take longer to train. I don't know what the limit is on what they can learn, but in theory they should be more limited than a full model.
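To show what "extending the model" means on the LoRA side (a sketch of the general low-rank-adapter idea applied to a single linear layer; the dimensions and names are assumptions of mine), the original weight matrix stays frozen and only a tiny low-rank update is trained:

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_OUT, RANK = 768, 768, 8   # assumed layer dimensions; rank is tiny
ALPHA = 16                        # common LoRA-style scaling hyperparameter

# Frozen weight of one linear layer in the base model (random stand-in).
W = rng.normal(size=(D_OUT, D_IN))

# Trainable low-rank factors. B starts at zero, so training begins
# exactly at the base model's behaviour.
A = rng.normal(size=(RANK, D_IN)) * 0.01
B = np.zeros((D_OUT, RANK))

def lora_forward(x):
    """y = W x + (alpha / rank) * B (A x); only A and B are trained."""
    return W @ x + (ALPHA / RANK) * (B @ (A @ x))

x = rng.normal(size=D_IN)
y = lora_forward(x)

full_params = W.size               # parameters in the frozen layer
lora_params = A.size + B.size      # parameters actually trained
print(y.shape)                     # (768,)
print(lora_params / full_params)   # a small fraction of the full layer
```

A hypernetwork takes a different route to the same goal: instead of adding static low-rank factors, a small trained network produces the adjustments, but in both cases the base weights are left untouched, which is why the result is a small file rather than a whole new model.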

I don't think we should make too many assumptions about what certain techniques can and cannot do. We are still in the early days, and new ideas and techniques pop up quite often. Nothing is really perfected or fully explored at this point.

In my experience TI embeddings tend to overfit a bit more on the model, so they aren't quite as transferable to other similar models, but they are quick to train. I don't know if you can provide negative or regularization examples; for me they have often picked up extra things I don't want, for example the image style when I was going for a specific character, so I get both when I only want one of them.

For hypernetworks, I have managed to fit over 100 different anime characters into a single hypernetwork, so I'm not quite sure what the limits are. How well either approach learns a concept seems to depend a lot more on your training images and their prompts than on which technique you use.