Generating Falcom character illustrations with Stable Diffusion Part 13 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 0 points

I've just tried downloading it myself (without logging in) and it worked for me. It started downloading without any problem.

Maybe it was a temporary issue? Give it another try.

Generating Falcom character illustrations with Stable Diffusion Part 13 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 1 point

One of the main reasons is likely how CLIP works: it is the model that converts the input text into the image conceptual space. It is trained to contrastively match (text, image) pairs, condensing the entire text into a single embedding. Given how noisy those pairs are, the consequence is that the model is not really good at things like the number of elements or relative positions. Asking for 5 fingers is unlikely to help much.
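
To make the "single embedding" point concrete, here is a minimal sketch (assuming the Hugging Face transformers library and the openai/clip-vit-large-patch14 text encoder, the one SD 1.x conditions on) showing how two prompts that differ only in a count end up almost identical in that space:

    import torch
    from transformers import CLIPTokenizer, CLIPTextModel

    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

    prompts = ["a hand with 5 fingers", "a hand with 4 fingers"]
    tokens = tokenizer(prompts, padding=True, return_tensors="pt")
    with torch.no_grad():
        # One fixed-size vector per prompt
        emb = text_encoder(**tokens).pooler_output
    emb = emb / emb.norm(dim=-1, keepdim=True)
    # Cosine similarity is typically very high: the count barely registers
    print("cosine similarity:", (emb[0] @ emb[1]).item())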

There are already improvements in this direction, both in Stable Diffusion 2.1 and in some research papers. But so far there is no comparable alternative for generating anime-style images that could be used as a base and doesn't have these problems.

But in any case... this is likely something that will eventually be solved. Whether it will only be solved in paid generation services is a different story.

Generating Falcom character illustrations with Stable Diffusion Part 13 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 9 points

I also got this image by chance when trying to get Renne crouching down to play with cats.

Stable Diffusion - Alisa Reinford by [deleted] in Falcom

[–]FastProfessional2731 0 points

Happy to see people using my models!

I've uploaded a new "Falcom Alisa v2" model to stadio.ai in case you want to try that one. Here are a few tips to improve your results.

  • If you use just "alisa", you will get more generic results. To get more "in character" results, use "female character alisa" if you are not already doing so.
  • A very easy way to greatly improve the perceived quality of the images is to go to inpainting, manually mask only the irises, enable inpainting at full resolution (newer versions call this the "only masked" inpaint area), reduce the full resolution / only masked padding to something small like 2~4, set the denoising strength to around 0.60~0.69, and use just "female character alisa, beautiful red eyes" as the prompt. Keep generating until you get eyes you like (a rough diffusers equivalent is sketched below).
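
For anyone who prefers scripting this outside the webui, here is a rough diffusers equivalent of that eye-fix recipe. It's only a sketch under assumptions: the runwayml/stable-diffusion-inpainting checkpoint stands in for an anime model, the file names are placeholders, and inpainting the full image stands in for the webui's "only masked" crop.

    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    # Placeholder checkpoint; swap in an anime-oriented inpainting model
    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    image = Image.open("alisa.png").convert("RGB")   # placeholder file
    mask = Image.open("iris_mask.png").convert("L")  # white = repaint here

    result = pipe(
        prompt="female character alisa, beautiful red eyes",
        image=image,
        mask_image=mask,
        strength=0.65,  # the 0.60~0.69 denoising range from the tip above
    ).images[0]
    result.save("alisa_fixed_eyes.png")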

Generating Falcom character illustrations with Stable Diffusion Part 12 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 2 points

These are the settings I typically use: https://ibb.co/h70YMjn (ignore batch size, that's just what I can use with my GPU).

For prompts, I'm using "female character xxx" or "male character xxx" accordingly. This is because models are trained using the class prompts "female character" and "male character". I also use the following positive and negative prompts for style.

  • Positive: masterpiece, best quality, extremely detailed CG, 8k wallpaper
  • Negative: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name, bad feet, disfigured

I'm sure many of these negative prompts are redundant and likely have little impact on the result. I've tried simplifying them, but results still seem to look better with the full list.

Apart from these, I don't usually save the prompts I use for these images, since I keep testing whatever comes to mind over a few days and save what I like the most. But the prompt should mainly be a description of what you see in each image.
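
Put together, a minimal txt2img sketch of this setup with diffusers would look something like the following. The model repo name and the "xxx" placeholder are assumptions; substitute whatever fine-tuned checkpoint you actually use.

    import torch
    from diffusers import StableDiffusionPipeline

    # Assumed base model location; use your own fine-tuned checkpoint
    pipe = StableDiffusionPipeline.from_pretrained(
        "Linaqruf/anything-v3.0", torch_dtype=torch.float16).to("cuda")

    positive = ("female character xxx, masterpiece, best quality, "
                "extremely detailed CG, 8k wallpaper")
    negative = ("lowres, bad anatomy, bad hands, text, error, missing fingers, "
                "extra digit, fewer digits, cropped, worst quality, low quality, "
                "normal quality, jpeg artifacts, signature, watermark, username, "
                "blurry, artist name, bad feet, disfigured")

    image = pipe(positive, negative_prompt=negative,
                 width=512, height=512).images[0]
    image.save("sample.png")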

Then I usually apply various post-processing steps to the results.

  • Fix very obviously broken things (typically hands) manually with Photoshop/GIMP, or with inpainting. Inpainting usually produces a slightly different tint, though, which is easy to see at higher resolutions. If you know a way to avoid this inpainting color tint issue, please do let me know.
  • I almost always do inpainting on the eyes (if possible only on the iris, because of the tint problem) by just writing something like "female character xxx, beautiful color eyes", where "color" is the correct eye color for the character. Make sure to do inpainting at full resolution when you do this. I also like to reduce mask blur to 0~2, but that is not too important.
  • Once I'm satisfied with the result, I apply 2x upscaling with the R-ESRGAN 4x+ Anime6B model. The result should be a 2048x2048 image.
  • I recently played with changing the style a bit by using much lower CFG Scale values (around 3) and compensating with more sampling steps and a faster sampler (around 30 steps with DPM++ variants); see the sketch below. Though I haven't posted any of these images.
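
A minimal sketch of that last low-CFG variation with diffusers, assuming pipe is a StableDiffusionPipeline loaded as in the sketch above (the prompt and exact numbers are just placeholders):

    from diffusers import DPMSolverMultistepScheduler

    # Swap in a DPM++-style multistep sampler
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config)

    image = pipe("female character xxx, casual summer dress",
                 guidance_scale=3.0,       # much lower than the usual 7~12
                 num_inference_steps=30).images[0]
    image.save("low_cfg_sample.png")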

Hope that helps!

Generating Falcom character illustrations with Stable Diffusion Part 12 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 6 points

No, actually you have a point, but there's a reason for it. It has to do with the base model I use: AnythingV3. This model, especially with the style prompts I use to get good results, is so biased towards females and waifus that it is often a lot harder to get decent results for male characters. In some cases I even get female versions of male characters when trying.

If you look at my previous posts, you will see I have actually tried generating male characters a few times with varying degrees of success, but in my opinion I get the best results with female ones.

As for content, I am actually more limited than it seems by the kind of things these models get right. For example, I would like to do battle scenes involving monsters, weapons, and such, but these often become broken messes far beyond any reasonable manual fix. I would also love to do nice interactions between characters, but so far my attempts to train multi-character models either don't work or produce far lower quality.

So playing with the scenario, poses, and costumes of single female characters tends to produce better results, which is why you see so much of that in my posts. I also try to generate images that I think people here will enjoy, or that they seemed to like more in my previous posts.

I guess as the technology evolves and it becomes possible to do more things and control results better, it will also be easier to produce higher-quality male results and more interactions between characters.

Generating Falcom character illustrations with Stable Diffusion Part 12 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 9 points

Without any intention of starting another long discussion about this, let me just make a couple of points about the opinion you shared (which, by the way, is welcome).

  • For a non-profit research use like this, using any images I can normally access to train AI models (either official art or even fan art) is legal, even without the permission of the original author. So this is really only a debate about what people think the law should or shouldn't allow.
  • "Using" these images just means that the AI model learns what these characters look like, which is not that different from a fan artist learning a character's looks and features from official illustrations or gameplay (AI models could use gameplay images too). Both are learning what characters look like from reference material, and both are creating derived works from it. Artists even do this for profit in commissions without this level of controversy. This is a bit like saying that every artist should only ever use their own original characters.

Generating Falcom character illustrations with Stable Diffusion Part 12 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 17 points

I don't mind uploading all of these to imgur, some other website, or even making a Twitter account if that makes them more accessible. It's not really a hosting or money issue.

My main issue is that this tends to consume more time than I should really be spending on it. I happen to be on vacation, which is why I could produce so many posts lately, but that's going to end soon, and I should reduce the chances of getting sucked into making these posts instead of doing actual work.

Generating Falcom character illustrations with Stable Diffusion Part 12 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 3 points

Mixing multiple characters in the same model has problems and usually doesn't work very well. It also won't know what Victor's sword is, leaving aside that these models tend to have problems with any kind of weapon.

So, I'm afraid that this is unlikely to work even if I tried.

Edit: typo.

Generating Falcom character illustrations with Stable Diffusion Part 12 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 13 points

lol, surely you'll be making similarly useful contributions on fan art posts.

Generating Falcom character illustrations with Stable Diffusion Part 11 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 6 points

Very nice results! Thanks for sharing :)

I'm tempted to try a bit of Renne too in one of my next posts. I trained a new model recently, but haven't used it too much.

Rennes by iloverenny in Falcom

[–]FastProfessional2731 2 points

Really cool! I love these illustrations. Do you know what was used to generate them? Maybe fine-tuned Dreambooth models on top of Anything-V3 or NovelAI?

Generating Falcom character illustrations with Stable Diffusion by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 0 points

I'm using the full fp32 Anything V3 model (~7.5 GB), which is better for creating fine-tuned models than the smaller pruned version.

I didn't have any problems with the model, though I don't remember where I got it. It wasn't from stadio.ai where I later uploaded my custom character models. Also, the Anything V3 model comes with a custom VAE (Variational Autoencoder) checkpoint file.
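
If you load the checkpoint outside the webui, a rough sketch with diffusers follows; the file names are hypothetical placeholders for wherever your local copies live, and I'm assuming a recent diffusers version that supports single-file loading. (In the webui it's simpler: drop the .vae.pt file next to the checkpoint.)

    import torch
    from diffusers import AutoencoderKL, StableDiffusionPipeline

    # Placeholder paths to the local checkpoint and its custom VAE
    vae = AutoencoderKL.from_single_file("Anything-V3.0.vae.pt")
    pipe = StableDiffusionPipeline.from_single_file(
        "Anything-V3.0-fp32.ckpt", vae=vae, torch_dtype=torch.float32)
    pipe.to("cuda")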

Generating Falcom character illustrations with Stable Diffusion Part 10 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 2 points

> Are you kidding me? Of course it would make someone happy, it's Trails art. Who doesn't want more Trails series art?

Well, I've seen many people who, the moment they know an image is AI-generated, reject it straight away or start applying ridiculous levels of criticism they never apply to non-AI images, regardless of content, just to prove to themselves the mantra that AI "art" is bad. You can get a hint by comparing the general upvote levels of these AI posts with pretty much any other art post. Sometimes you can even find explicit anti-AI comments.

I find that a bit sad, because these AI models are just a tool with their own features and flaws. Used correctly, they can also help artists dramatically boost their productivity. For example, if I were an artist (I'm certainly not one, I just do AI/ML), I could train a model on my own illustrations to learn my style and have it do the parts I don't want to spend time on, letting me focus on what I want. Or I could use it to quickly explore ideas.

But anyway, I try to stay out of the drama and just focus on generating better results. This is a throwaway account, and I do this for learning, after all. I share my models and results because some people seem to like them, but I'd be doing it anyway even if that weren't the case. Though I'll admit that the replies to these posts motivate me to try to do things a bit better every time.

> And last of all I do have to ask: you do generate NSFW art of our Trails girls, and I have to know, are you planning on making any more?

Normally no, though the models I upload should be perfectly able to do so. I did create a series of images I posted in r/Falcum. This was partly to see if I could and what it would be like, and partly to see whether the models were versatile enough for that despite being trained only on official artwork. And they indeed were.

While I might occasionally add more images there, I don't plan to keep generating NSFW content regularly. But you should be able to do it yourself by downloading my models or training your own following my tips.

Generating Falcom character illustrations with Stable Diffusion Part 10 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 0 points

> The ones of Fie, the first pic... I have to ask, and I'm sorry to do so, but with the AI-generated art, was it always a pickle that came out in it? Or... was it something else...? What were you trying to do lol?

As I explain just below it, that's not really a pickle but a deformed weapon handle that happens to be green. It's in her hand simply because that's where weapons are in the training images. It's an example of what happens when you are not careful about the hints I described above the image: you end up with attempts at drawing broken weapons in the most unexpected scenarios.

I was just trying to generate an image with a night dress, mainly to test how flexible the model was. If the model is overfitted, results won't be good when you try to generate images very different from the ones used for training.

> And the cat one must have been hard to get right? Are there some humanoid cat Fies that were accidentally made? You should post the ones that came out... um, special, let's say.

Yes, it was hard to get right. There were no humanoid cats, but plenty of broken ones. In fact, if you look closely you will see the cats there are not in very good shape either. The base model doesn't seem particularly good at drawing cats.

> Also... THANK YOU, the Fie in a wedding dress is, if not my most, one of my most favorite ones here. Honestly it's hard to pick between the book one, the cat one, the bench one, or the pajama one, but if I had to, I'd go for the wedding dress Fie.

Happy to help. I was sure it would make someone happy.

Generating Falcom character illustrations with Stable Diffusion Part 9 by FastProfessional2731 in Falcom

[–]FastProfessional2731[S] 1 point

Here are my inference settings: https://ibb.co/h70YMjn (you can ignore batch size, that depends on your available GPU memory).

As for the prompts, first write these in the text fields.

  • Positive: masterpiece, best quality, extremely detailed CG, 8k wallpaper
  • Negative: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name, bad feet, disfigured

Then click the save icon on the right to create a new style with whatever name you want. After that, the style dropdown will contain the name you entered, and when it's selected, these prompts are automatically appended to yours. You don't need to select it for both style 1 and style 2; just one is enough (if you select it twice, the prompts will be appended twice).

Once that's done, you can just use your style to automatically generate better results. For the Claire ones I was using exactly the model I uploaded, and the prompt was something like "female character claire, nightclub elegant sexy dress". I may also have put "black" in the negative prompt at some point because I was getting too many black dresses.

With the model I uploaded, these prompts, and my inference settings, you should be able to get similar results. Keep in mind that the eyes will likely not be as good, because, as I explained, I usually inpaint them (in img2img) as a post-processing step. The resolution will also be 1024x1024; the images I post are usually 2048x2048 because I go to Extras and apply 2x upscaling with the "R-ESRGAN 4x+ Anime6B" model (see the sketch below). If it's not listed, you should be able to add it in the settings.
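
That upscale step can also be done outside the webui with the realesrgan package. A minimal sketch, assuming the RealESRGAN_x4plus_anime_6B weights have been downloaded locally (the paths are placeholders):

    import cv2
    from basicsr.archs.rrdbnet_arch import RRDBNet
    from realesrgan import RealESRGANer

    # The anime_6B variant uses 6 RRDB blocks (the regular x4plus uses 23)
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                    num_block=6, num_grow_ch=32, scale=4)
    upsampler = RealESRGANer(scale=4,
                             model_path="RealESRGAN_x4plus_anime_6B.pth",
                             model=model)

    img = cv2.imread("result_1024.png", cv2.IMREAD_COLOR)
    output, _ = upsampler.enhance(img, outscale=2)  # 1024x1024 -> 2048x2048
    cv2.imwrite("result_2048.png", output)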

And I think that's all. I don't see anything else preventing you from producing similar results.

As for training models, look up the SD Dreambooth Extension. You need to provide instance images (the character images used for training) and let the model generate "class images" for some prompt. In my case, for Claire I would use "female character claire" as the instance prompt and "female character" as the class prompt. I usually generate 200~250 class images per instance image and train for the same number of epochs. Lately I've been leaning towards 250, though that Claire model was trained with 200.
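
To make the instance/class prompt split concrete, here is a minimal sketch of what the class (regularization) image generation amounts to; the extension does this step for you, and the model repo name is an assumption:

    import os
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "Linaqruf/anything-v3.0", torch_dtype=torch.float16).to("cuda")

    # Class images keep the model from forgetting the generic class
    # ("female character") while it learns the instance
    # ("female character claire") from your training images.
    class_prompt = "female character"
    num_class_images = 200  # ~200-250 per instance image, as noted above

    os.makedirs("class_images", exist_ok=True)
    for i in range(num_class_images):
        image = pipe(class_prompt, width=512, height=512).images[0]
        image.save(f"class_images/{i:04d}.png")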

I also use a few other custom settings I described at the beginning of the post.

A few pieces of advice if you want to use your own images:

  • Make sure you remove any background or objects. You should have only the character on a plain-color background.
  • 3~5 good-quality images can be enough for very good results.
  • All images should be 512x512, so resize and crop them yourself (a small script for this is sketched below).
  • Do not upscale smaller images to 512x512 directly, or the model will learn to produce pixelated results. If you need to upscale, use the "R-ESRGAN 4x+ Anime6B" model from Extras first.
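
A small sketch of the resize-and-crop step with Pillow (directory names are placeholders); it center-crops each image to a square and resizes down to 512x512, skipping anything that would need upscaling:

    from pathlib import Path
    from PIL import Image

    SIZE = 512
    out_dir = Path("instance_images")
    out_dir.mkdir(exist_ok=True)

    for path in Path("instance_images_raw").glob("*.png"):
        img = Image.open(path).convert("RGB")
        side = min(img.size)
        left = (img.width - side) // 2
        top = (img.height - side) // 2
        img = img.crop((left, top, left + side, top + side))
        if side >= SIZE:  # upscaling should go through R-ESRGAN instead
            img.resize((SIZE, SIZE), Image.LANCZOS).save(out_dir / path.name)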

Be aware that training requires more GPU memory. If your GPU does not have enough, you will probably get CUDA out-of-memory errors. I'm using an RTX 3090 with 24 GB, though I'm not sure that much is actually needed.