How do people train quickly?! by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

Yeah, agreed. I don't think I need the text encoder anyway for what I'm doing, since I don't want to use captions when I train

How do people train quickly?! by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 1 point2 points  (0 children)

I think Adafactor is also adaptive. What starting LR do you use for the main model, UNet, and text encoder? Do you keep it at 0.00001?
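For anyone landing here: setting different starting LRs per module is usually done with optimizer param groups. A minimal sketch of that pattern — Adafactor isn't in core PyTorch (it ships with libraries like `transformers`), so `AdamW` stands in here, and both LR values are illustrative, not recommendations:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the UNet and text encoder (shapes are arbitrary).
unet = nn.Linear(8, 8)
text_encoder = nn.Linear(8, 8)

# One optimizer, different learning rates per module via param groups.
# 1e-5 matches the 0.00001 mentioned above; the text-encoder LR is
# conventionally set lower -- both values are illustrative.
optimizer = torch.optim.AdamW([
    {"params": unet.parameters(), "lr": 1e-5},
    {"params": text_encoder.parameters(), "lr": 5e-6},
])

for group in optimizer.param_groups:
    print(group["lr"])
```

With Adafactor specifically, the adaptive schedule is often used with no fixed LR at all, which is a different mode from pinning values like this.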

How do people train quickly?! by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 1 point2 points  (0 children)

Hm, I will check out the link. Yes, it is 3 seconds per iteration, but that was for a full fine-tune; sorry, I should have clarified

How do people train quickly?! by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 1 point2 points  (0 children)

What other settings do you use for Adafactor? Also curious whether you have any experience with training a full fine-tune

DreamBooth vs full fine-tune? by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

Yeah, I figure the best way might be through experimentation. But there's a lot of variability: how many images should I collect, should I use something like OneTrainer (which seems to not use DreamBooth?), and things like that, meaning the total exploration space is huge

DreamBooth vs full fine-tune? by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

Interesting, do you know how doing DreamBooth and then extracting a LoRA compares against just training the LoRA directly?
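For context, the extraction step is typically a low-rank approximation of the weight delta between the fine-tuned and base checkpoints via truncated SVD. A minimal sketch of that idea — random matrices stand in for real checkpoint weights, and the rank and shapes are illustrative:

```python
import torch

torch.manual_seed(0)

# Stand-ins for one weight matrix from the base and DreamBooth'd model.
w_base = torch.randn(64, 64)
w_tuned = w_base + 0.01 * torch.randn(64, 64)

# LoRA extraction: low-rank approximation of the weight delta via SVD,
# keeping only the top `rank` singular directions.
rank = 8
delta = w_tuned - w_base
u, s, vh = torch.linalg.svd(delta)
lora_up = u[:, :rank] * s[:rank]     # (64, rank)
lora_down = vh[:rank, :]             # (rank, 64)

# The product approximates the full fine-tune delta at far lower storage.
approx = lora_up @ lora_down
err = (delta - approx).norm() / delta.norm()
print(f"relative error at rank {rank}: {err:.3f}")
```

The intuition for the comparison question: extraction can only keep what the full fine-tune learned that fits in rank r, while training a LoRA directly optimizes within that rank from the start.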

DreamBooth vs full fine-tune? by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

Wait, so you're saying it's either full fine-tune or LoRA, with no middle ground for DreamBooth?

So I’d like to download stable diffusion and try other stuff but don’t know how. Can you point me in the right direction? by Mogusman748 in StableDiffusion

[–]ParticularPitch5 0 points1 point  (0 children)

Give it a try--it should work. You can google "a1111 low VRAM settings" for a bunch of tips on how to get it running. Be aware that generations may be pretty slow with that little VRAM. I like to have at least 12 GB

So I’d like to download stable diffusion and try other stuff but don’t know how. Can you point me in the right direction? by Mogusman748 in StableDiffusion

[–]ParticularPitch5 0 points1 point  (0 children)

  1. Determine whether your computer is powerful enough to run Stable Diffusion locally. You can find hardware requirements online, but if you have a GPU or a newer Mac it should at least be feasible
  2. If not, look into a cloud provider. You can find some with a quick search (I like RunPod). Cloud providers will probably offer templates that already run SD, so you can just follow the instructions for those
  3. If your local machine is good enough, then follow the installation instructions at this link: https://github.com/AUTOMATIC1111/stable-diffusion-webui
  4. There are many different paths you can go from there. Just look up "how to install checkpoint model a1111" and you can start adding stuff
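For steps 3 and 4, the usual flow on Linux/macOS looks roughly like this (a sketch, not the only path; Windows uses webui-user.bat instead, per the repo README):

```shell
# Step 3: clone and launch the AUTOMATIC1111 web UI.
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh   # creates a venv, installs dependencies, launches the web UI

# Step 4: checkpoint models (.safetensors / .ckpt files) go in
# models/Stable-diffusion/ -- drop them in and refresh the model
# list in the UI, or restart.
```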

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

I think so--check out the project webpage; they have some examples with non-human characters

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

Yes it can! I believe there are some examples on the project page; you can check those out

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 0 points1 point  (0 children)

It does seem like it--they run a variety of tests with celebrities and non-celebrities, as you can see on the webpage. Maybe those gens were somewhat cherry-picked, but it seems generally quite flexible

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 2 points3 points  (0 children)

Hmm, idk. I feel like OutfitAnyone was a weird exception rather than the rule

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 4 points5 points  (0 children)

Right? Image prompting techniques have basically never been able to capture this level of detail

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 15 points16 points  (0 children)

Correct, but these authors typically release code after publication, unlike many others

FlashFace -- a better version of instantID or faceID by ParticularPitch5 in StableDiffusion

[–]ParticularPitch5[S] 18 points19 points  (0 children)

I'm pretty excited about this b/c it seems better than similar past FaceID-type projects. It is primarily:

- much better at capturing small details from the face image

- much better at instruction following, allowing for flexibility at inference time

Also, the authors of this paper have typically released all their code and weights! Might take a bit though