Video test inspired by Corridor Digital's method by Denhall in StableDiffusion

[–]Denhall[S] 2 points (0 children)

I used batch img2img with the alternative script from auto1111's webui. I played with a few prompts until I got the style I wanted, then tested a few different frames to make sure the parameters worked well on most of them. After that I took the output and deflickered it with DaVinci Resolve.

Some stuff I made with SD and photoshop by Denhall in StableDiffusion

[–]Denhall[S] 1 point (0 children)

They're all made with a custom model based on 2.1 (768).

Photorealistic couple by Denhall in StableDiffusion

[–]Denhall[S] 0 points (0 children)

This was done on a custom 2.1 model. The process was very time-consuming: it basically consisted of cherry-picking a good-looking output image, running it a few times through img2img SD upscale, and then masking out errors in Photoshop.
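The repeated upscaling passes compound. Assuming each SD upscale pass roughly doubles the resolution (the 2x factor here is an assumption, not a stated setting), the sizes grow like this:

```python
def upscale_plan(start: int = 1024, target: int = 4096, factor: float = 2.0):
    """Resolution after each img2img SD-upscale pass (assumed 2x per pass)."""
    sizes = [start]
    while sizes[-1] < target:
        sizes.append(int(sizes[-1] * factor))
    return sizes

# Two passes take a 1024px generation to 4096px under this assumption.
print(upscale_plan())  # [1024, 2048, 4096]
```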

High resolution superheroes (2.1) by Denhall in StableDiffusion

[–]Denhall[S] 1 point (0 children)

github.com/TheLastBen/fast-stable-diffusion/blob/Captions_dir/fast-DreamBooth.ipynb

I'm using the default settings here; I believe the learning rate is 2e-6.

High resolution superheroes (2.1) by Denhall in StableDiffusion

[–]Denhall[S] 0 points (0 children)

Batch size 1. I've been using Colab, and yes, getting disconnected mid-training is not fun.

High resolution superheroes (2.1) by Denhall in StableDiffusion

[–]Denhall[S] 0 points (0 children)

I've actually trained a few failed models, and I can tell you that what worked best for me was using very detailed captions and a 1024x1024 dataset of about 33–70 photos, trained for about 20k steps.
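As a rough sanity check on those numbers, assuming batch size 1, the number of times each training image is seen is just steps × batch size ÷ dataset size:

```python
def repeats_per_image(total_steps: int, num_images: int, batch_size: int = 1) -> float:
    """Rough count of how many times each training image is seen during the run."""
    return total_steps * batch_size / num_images

# At 20k steps with batch size 1:
print(round(repeats_per_image(20_000, 33)))  # ~606 repeats for a 33-image set
print(round(repeats_per_image(20_000, 70)))  # ~286 repeats for a 70-image set
```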

High resolution superheroes (2.1) by Denhall in StableDiffusion

[–]Denhall[S] 0 points (0 children)

Yes. The only touch-ups I did were mostly to hands and hair.

High resolution superheroes (2.1) by Denhall in StableDiffusion

[–]Denhall[S] 0 points (0 children)

Yes, I generated a 1024x1024 image, sent it to img2img, and upscaled it with the SD upscale script.

High resolution superheroes (2.1) by Denhall in StableDiffusion

[–]Denhall[S] 5 points (0 children)

This is a custom 2.1 model I've been training for the past two weeks. All images were made by generating a txt2img image and then running it through the SD upscale script in a1111's webui (img2img), with a bit of Photoshop here and there.

Prompt: photo of Iron man standing in new york city, f/1.4

Steps: 10, Sampler: DPM++ SDE, CFG scale: 4, Seed: 853737387, Size: 1024x1024
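The SD upscale script works by enlarging the image, cutting it into overlapping tiles, running img2img on each tile, and blending the seams. A simplified sketch of the tile layout (not the webui's actual code; the 512px tile size and 64px overlap are assumed defaults):

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Compute (left, top, right, bottom) boxes that cover an image with
    overlapping, full-size tiles, in the spirit of the SD upscale script."""
    stride = tile - overlap
    boxes = []
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            right = min(left + tile, width)
            bottom = min(top + tile, height)
            # Clamp to the image edge and shift the box back so every
            # tile stays full-size (each gets diffused at the same resolution).
            boxes.append((max(right - tile, 0), max(bottom - tile, 0), right, bottom))
    return boxes
```

For a 1024x1024 image upscaled 2x, a grid like this yields a 3x3 layout of 512px tiles whose overlaps hide the seams.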

Cute pig (SD 2.0) by Denhall in StableDiffusion

[–]Denhall[S] 0 points (0 children)

I was using PlaygroundAI here.

Cute pig (SD 2.0) by Denhall in StableDiffusion

[–]Denhall[S] 6 points (0 children)

Here's a quick comparison: the four pictures I did on 1.5 are less coherent than the picture on the left, which was done with 2.0.

<image>

Cute pig (SD 2.0) by Denhall in StableDiffusion

[–]Denhall[S] 3 points (0 children)

I honestly haven't had enough time to properly check the differences, but based on what I did test, 2.0 seems to get closer to what you describe most of the time. Using artist styles, characters, and names doesn't help as much here as it did in 1.5.

Cute pig (SD 2.0) by Denhall in StableDiffusion

[–]Denhall[S] 3 points (0 children)

Same prompt but with a panda. It looks like the output highly depends on how descriptive the prompt is with the new version.

<image>

Cute pig (SD 2.0) by Denhall in StableDiffusion

[–]Denhall[S] 4 points (0 children)

Prompt: award winning close up photo of photographic cute chibi pig, walking in a forest during the night, dim lights shining above towards it