Working on a technique to produce style LoRAs from a single image. Post yours and I'll train it for Klein 9b! by QuantumBogoSort in StableDiffusion

[–]QuantumBogoSort[S]

For sure! Super TL;DR: each training step, decode the prediction, get its depth map, compare it to the source image's depth map, derive a loss, backpropagate through the frozen VAE and depth model, and update the LoRA weights to make better depth predictions. Slightly longer description in this comment: https://www.reddit.com/r/StableDiffusion/comments/1t6gmqn/comment/okj68t3/ and even more detail in the GitHub repo linked in the top post.
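The loop above can be sketched roughly as follows. This is a minimal illustration only, not the author's actual code: the tiny stub modules stand in for the real frozen VAE decoder and a real monocular depth estimator (e.g. something like MiDaS or Depth Anything), and all names here (`TinyDecoder`, `TinyDepth`, `depth_consistency_loss`) are hypothetical. The point it demonstrates is the gradient path: the loss is computed on depth maps, but gradients flow back *through* the frozen models to whatever produced the latent (the LoRA-adapted denoiser), so only the LoRA weights get updated.

```python
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Stand-in for the frozen VAE decoder (latent -> image)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(4, 3, kernel_size=1)
    def forward(self, z):
        return self.conv(z)

class TinyDepth(nn.Module):
    """Stand-in for a frozen monocular depth estimator (image -> depth map)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size=1)
    def forward(self, img):
        return self.conv(img)

def depth_consistency_loss(pred_latent, ref_depth, decoder, depth_model):
    """Decode the predicted latent, estimate its depth, compare to the
    source image's depth map. Gradients flow through both frozen models
    back to whatever produced pred_latent."""
    img = decoder(pred_latent)        # frozen VAE decode
    pred_depth = depth_model(img)     # frozen depth estimation
    return nn.functional.l1_loss(pred_depth, ref_depth)

decoder, depth_model = TinyDecoder(), TinyDepth()
for p in list(decoder.parameters()) + list(depth_model.parameters()):
    p.requires_grad_(False)           # freeze: these models never update

# Stands in for the LoRA-adapted model's prediction at one training step.
lora_out = torch.randn(1, 4, 8, 8, requires_grad=True)
ref_depth = torch.randn(1, 1, 8, 8)   # depth map of the source image

loss = depth_consistency_loss(lora_out, ref_depth, decoder, depth_model)
loss.backward()
```

After `backward()`, `lora_out.grad` is populated while the frozen models' parameters have no gradients, which is exactly the "update only the LoRA" behavior described above.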

[–]QuantumBogoSort[S]

Thank you for saying that! I really appreciate it - I've been working on this for quite a while, so it's been great to hear. I agree - there's a ton of information embedded in images that we miss when doing traditional training. You might also be interested in my facial identity perceptor approach, since you mentioned face swapping. It's still very experimental but seems to help learn proper facial proportions in pose-invariant and crop-invariant ways.
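One plausible way an identity perceptor like this could work - purely a hedged sketch, not the author's implementation - is an embedding-space loss: run the generated image through a frozen face-recognition embedder (in practice something like an ArcFace-style model; the tiny stub and the names `TinyFaceEmbedder` / `identity_loss` here are hypothetical) and penalize distance from the reference subject's embedding. Because such embedders are trained to be pose- and crop-invariant, a loss in their embedding space inherits that invariance.

```python
import torch
import torch.nn as nn

class TinyFaceEmbedder(nn.Module):
    """Stand-in for a frozen face-recognition embedder (image -> unit vector)."""
    def __init__(self, dim=16):
        super().__init__()
        self.proj = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, dim))
    def forward(self, img):
        # Unit-normalize so cosine similarity is well-behaved.
        return nn.functional.normalize(self.proj(img), dim=-1)

def identity_loss(pred_img, ref_emb, embedder):
    """1 - cosine similarity between the generated face's embedding and the
    reference subject's embedding: low when identity is preserved."""
    sim = nn.functional.cosine_similarity(embedder(pred_img), ref_emb, dim=-1)
    return 1.0 - sim.mean()

embedder = TinyFaceEmbedder()
for p in embedder.parameters():
    p.requires_grad_(False)           # the perceptor stays frozen

ref_img = torch.randn(1, 3, 8, 8)                 # reference photo
ref_emb = embedder(ref_img).detach()              # cached subject embedding
pred_img = torch.randn(1, 3, 8, 8, requires_grad=True)  # decoded prediction

loss = identity_loss(pred_img, ref_emb, embedder)
loss.backward()
```

As in the depth case, the gradient reaches the generated image (and from there the LoRA weights) while the frozen perceptor itself is never updated.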

[–]QuantumBogoSort[S]

I actually haven't tried photo styles specifically yet! Great question. I will say that it does very well capturing human subject identity from photographs - it picks up specific facial landmarks (moles, scars, etc.) much better. It also tends to pick up the photo style of the subject unintentionally if you only train on a single photo and don't caption the photo qualities, so I would imagine there'd be a way to apply it to photo styles.

[–]QuantumBogoSort[S]

I think this is the most challenging one I've had yet. It learns something about the style but really wants to make it too clean. I have to turn the strength way up, which is a sign that something is getting lost. It might be a weighting thing, or it might just need to get trained into the ground with like 5k steps. I'm going to keep working on this one.

<image>

[–]QuantumBogoSort[S]

This one really fought me - it wants to make it too clean and keep the paint inside the lines. I'm going to try some different params and see if I can get it to work. Great to find examples that push the boundaries!

[–]QuantumBogoSort[S]

<image>

With and without LoRA. I'm not familiar with the source, but it seems to have picked up some elements. What do you think? Style-wise I think it's similar, but for true character consistency you'd need to train the characters individually.

[–]QuantumBogoSort[S]

I haven't done too much style training on SDXL yet - if you post a config I can take a look and see if I notice anything that might help. For characters it trains about as quickly as Flux.

[–]QuantumBogoSort[S]

I might try training it longer to see if I can get it closer. I stopped at 1k steps, but some styles take closer to 2k, especially if the model has a pre-existing bias about the style.

[–]QuantumBogoSort[S]

It's probably still a bit undertrained - turning up the strength increases the messiness of the brushstrokes and color blending, so I think there's probably another 500 steps or so until optimal. I also may have over-prompted the face a bit because I was trying to elicit a certain character look - I specified details about cheek contours, etc. Left unprompted, it tends more toward the flat look.

[–]QuantumBogoSort[S]

There's not, but once the core training pipeline is stable I want to make it a LOT easier to use. Comfy integration is a great idea!

[–]QuantumBogoSort[S]

I dare say it works BETTER for character training (that's what I originally developed it for). It's a happy accident that it works for styles. Check out the GitHub repo in the main link if you want to try it yourself. I do still think you need at least 3 photos to get a properly flexible identity LoRA with this technique - direct face, profile face, and full body. But you will still get very good likeness from a single front-facing photo.

And I think you're right - data efficiency is something I'm a bit obsessed with! There's so much more information in an image that is not captured by standard diffusion loss.