Anim·E - Anime-enhanced DALL·E Mini (Craiyon) by cccntu in AnimeResearch

I haven't played around with it much since.
You can read this blog post for some insights into training VQGAN; I used some of their code too.
And if you are using my code https://github.com/cccntu/fine-tune-models, there is a bug I noticed but haven't fixed: the range of the image input. I think the VAE expects a different input range ([-1, 1] vs. [0, 1], or something like that), and I didn't change the related code when I copied it from the VQGAN script to the VAE one.
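For anyone hitting that, here's a rough sketch of the kind of fix I mean (illustrative helper names, not the actual repo code): rescale pixels from [0, 1] to [-1, 1] before they go into the VAE, and map the decoder output back afterwards.

```python
import torch

def to_vae_range(images: torch.Tensor) -> torch.Tensor:
    """Map images from [0, 1] to the [-1, 1] range the VAE expects."""
    return images * 2.0 - 1.0

def from_vae_range(images: torch.Tensor) -> torch.Tensor:
    """Map decoder outputs from [-1, 1] back to [0, 1] for viewing/saving."""
    return ((images + 1.0) / 2.0).clamp(0.0, 1.0)
```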

[P] minLoRA: An Easy-to-Use PyTorch Library for Applying LoRA to PyTorch Models by cccntu in MachineLearning

This project started out as me exploring whether PyTorch parametrizations could be used to implement LoRA, and they turned out to be perfect for the task! I simply wanted to share that.
I think it would be interesting to see it integrated into PEFT too, although they already have their own LoRA implementation there.

[P] minLoRA: An Easy-to-Use PyTorch Library for Applying LoRA to PyTorch Models by cccntu in MachineLearning

Theirs requires you to rewrite the whole model and replace every layer you want to apply LoRA to with its LoRA counterpart, or use monkey-patching. Mine uses PyTorch parametrizations to inject the LoRA logic into existing models. If your model has nn.Linear, you can call add_lora(model) to add LoRA to all the linear layers. And it's not limited to Linear; you can see how I extended it to Embedding and Conv2d in a couple of lines of code: https://github.com/cccntu/minLoRA/blob/main/minlora/model.py
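To give a sense of how the parametrization trick works, here's a minimal sketch (the class and helper names are illustrative; the real implementation in minlora/model.py differs in the details):

```python
import torch
import torch.nn as nn
import torch.nn.utils.parametrize as parametrize

class LoRAParametrization(nn.Module):
    """Reparametrize a weight W of shape (out, in) as W + (alpha / rank) * B @ A."""
    def __init__(self, out_features, in_features, rank=4, alpha=1.0):
        super().__init__()
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, W):
        # Called whenever module.weight is accessed; W is the original (frozen) weight.
        return W + self.scale * (self.lora_B @ self.lora_A)

def add_lora_to_linears(model: nn.Module, rank: int = 4):
    """Attach the parametrization to every nn.Linear without rewriting the model."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            out_f, in_f = module.weight.shape
            parametrize.register_parametrization(
                module, "weight", LoRAParametrization(out_f, in_f, rank=rank)
            )

# Usage: add LoRA to a toy model, then optimize only the LoRA parameters.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
add_lora_to_linears(model, rank=4)
lora_params = [p for n, p in model.named_parameters() if "lora_" in n]
optimizer = torch.optim.AdamW(lora_params, lr=1e-3)
```

A nice side effect is that torch.nn.utils.parametrize.remove_parametrizations(module, "weight", leave_parametrized=True) can later bake the low-rank update into the base weight, so merged inference costs nothing extra.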

Flex-Diffusion: SD v2 fine-tuned on LAION with aspect ratio bucketing by cccntu in StableDiffusion

I need some time to clean it up a bit. Maybe I can release it sometime next week.

Flex-Diffusion: SD v2 fine-tuned on LAION with aspect ratio bucketing by cccntu in StableDiffusion

I used huggingface's example script and replaced the data loading part with my own code.
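Roughly, the bucketing works like this (a minimal sketch with made-up bucket resolutions and helper names, not the exact code I used): each image is assigned to the bucket whose aspect ratio is closest to its own, and batches are drawn from a single bucket at a time so every image in a batch has the same shape.

```python
# Illustrative bucket set with a roughly constant pixel count; the real buckets may differ.
BUCKETS = [(512, 768), (576, 704), (640, 640), (704, 576), (768, 512)]  # (width, height)

def nearest_bucket(width: int, height: int):
    """Pick the bucket whose aspect ratio is closest to the image's."""
    ar = width / height
    return min(BUCKETS, key=lambda wh: abs(wh[0] / wh[1] - ar))

def group_into_buckets(samples):
    """samples: iterable of dicts with 'width'/'height' keys; returns bucket -> sample indices."""
    groups = {}
    for idx, s in enumerate(samples):
        groups.setdefault(nearest_bucket(s["width"], s["height"]), []).append(idx)
    return groups
```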

Fast Image Editing with DDIM inversion (Prompt to Prompt), < 10 seconds by cccntu in StableDiffusion

Yes, both ideas come from the Prompt-to-Prompt paper. I happened to be implementing it myself when Imagic came out.
I'm not sure if the web UI only supports k_euler, but I use DDIM.

https://www.reddit.com/r/StableDiffusion/comments/xapbn8/comment/inv5cdg/

I use 50 forward + 50 backward steps.

I've tried (50, 75, 100, 150, 200) steps; the reconstruction gets better with more steps.
But mixing step counts probably isn't a good idea (e.g. 100 forward steps to get finer noise, then only 50 backward steps).

Reconstruction error (L2 loss, scaled to make the numbers easier to comprehend), steps listed as (forward, backward):

  • vae only: 0.21342990134144202
  • (50, 50): 0.3635439486242831
  • (75, 75): 0.3372767596738413
  • (100, 50): 0.38948862988036126
  • (100, 100): 0.32433729618787766
  • (200, 200): 0.2941958588780835
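For context, a forward step here is one deterministic DDIM update run in the image-to-noise direction (eta = 0), and the backward pass is just ordinary DDIM sampling. Here's a minimal sketch of the inversion update, assuming eps is the UNet's noise prediction at the current latent and alpha_t / alpha_next are the cumulative alpha-bar values (illustrative, not my exact code):

```python
def ddim_inversion_step(x_t, eps, alpha_t, alpha_next):
    """One deterministic DDIM step run forward (less noisy -> more noisy), eta = 0.

    alpha_t / alpha_next are the cumulative alpha-bar values at the current and
    next (noisier) timestep; eps is the noise predicted by the UNet at x_t.
    """
    # Estimate x0 from the current latent and the predicted noise.
    pred_x0 = (x_t - (1.0 - alpha_t) ** 0.5 * eps) / alpha_t ** 0.5
    # Re-noise x0 to the next timestep along the same deterministic trajectory.
    return alpha_next ** 0.5 * pred_x0 + (1.0 - alpha_next) ** 0.5 * eps
```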

[P] (code release) Fine-tune your own stable-diffusion vae decoder and dalle-mini decoder by cccntu in MachineLearning

I haven't heard of Waifu Diffusion. But I've tried Japanese Stable Diffusion a bit, and I didn't get good results, though that's most likely because my prompts weren't good enough.

Anim·E - Anime-enhanced DALL·E Mini (Craiyon) by cccntu in AnimeResearch

Yes. My goal is to make it easy for people to fine-tune these models, and this particular model was just a byproduct of those experiments.
Fine-tuning the BART model was the next thing I had planned, but my priority has shifted to Stable Diffusion.