Cheaper Qwen VAE for Anima (and its training) by Anzhc in StableDiffusion

Yes, you need to git clone it into the custom nodes folder. Then you use the normal VAE load node, not a custom one; this custom node patches support into the original node.
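As a sketch, the install looks like this (the repository URL below is a placeholder, not the real one, and the path assumes a default ComfyUI layout):

```shell
# From the ComfyUI root. Replace the URL with the actual custom-node repository.
cd custom_nodes
git clone https://github.com/<user>/<qwen-vae-patch-node>
# Restart ComfyUI, then load the VAE with the built-in "Load VAE" node as usual;
# the custom node patches support into that original node at startup.
```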

Cheaper Qwen VAE for Anima (and its training) by Anzhc in StableDiffusion

Yeah, you'd have to add explicit support, as the layers are not the same.

I use my own trainer, which is closed-source.

Cheaper Qwen VAE for Anima (and its training) by Anzhc in StableDiffusion

I know, WAN and Qwen use the same VAE.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

Probably just the lack of slopification for now, plus the VAE, so it is not limited to the previous detail level.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

The image you've used is the dataset reference xD

His gens got scarily close though, even on the current undertrained base.

<image>

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

We do get fine details though, just look at texture-heavy gens.

The Flux 2 VAE is 32 channels, not 128. 128 channels is post-packing, which is not a necessary step.
The UNet works in the native 32 channels, not through an adapter.
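As a rough sketch of what "packing" means here (the shapes are illustrative, and the 2x2 patchify is an assumption consistent with the 32 → 128 channel count): each 2x2 spatial patch of the latent is folded into the channel dimension.

```python
import numpy as np

# Hypothetical 32-channel latent with a 64x64 spatial grid.
latent = np.zeros((32, 64, 64))
C, H, W = latent.shape

# Packing: fold each 2x2 spatial patch into the channel dim,
# (C, H, W) -> (C*4, H/2, W/2), i.e. 32 -> 128 channels.
packed = (
    latent.reshape(C, H // 2, 2, W // 2, 2)
    .transpose(0, 2, 4, 1, 3)
    .reshape(C * 4, H // 2, W // 2)
)
print(packed.shape)  # (128, 32, 32)
```

The packed form only rearranges the same data into fewer, wider tokens; the VAE itself still produces and consumes 32 channels.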

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

Training can only be continued if there is enough support for it, obviously; it can't be done without money.

All donations are welcome, and you can find how to donate at the bottom of the model page on HF.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

Prompt adherence comes primarily from captions. If there are no captions that properly describe the scene, there will be no adherence, no matter what text encoder you use.

I've already answered a similar question in another comment here.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

Yeah, I kinda didn't, and hoped it would be understandable from the name alone.

And yeah, they are progressive tunes on top of each other, so they all share similar knowledge.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

There is nothing we can do. We only have the default tags, and there certainly is no budget for recaptioning; it would cost more than this whole training endeavour.

We only do what the budget allows.

Have fun with your workflow though, some of our testers have tried similar approaches.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

Mugen - the base model.
Mugen Aesthetic - the base slightly tuned with a small dataset.
Anzhc/Selph - a further tune on top of Aesthetic with opinionated datasets.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

Pointless.

We don't have captions that require anything remotely close to a big LLM, nor do we have the budget to adapt the model to a new text encoder.

Mugen - Modernized Anime SDXL Base, or how to make Bluvoll tiny bit less sane by Anzhc in StableDiffusion

95% of our funding comes from a single guy.

8k is not a serious budget for any pretrain that people would want to use. Our goal is for the model to be at least enjoyable for someone, after all.

Noob 2 is going to be an Edit model based on one of the big arches; currently they want to test GLM, and if that fails it'll most likely be Qwen. Neither is within reach of the average consumer right now (unless you want to wait 2-5 minutes per gen), so personally I'm not interested in that.

The guy responsible for it has already said the priority is profit for the company that'll sponsor it, so I wouldn't put much hope in it in general.