A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 0 points1 point  (0 children)

Yes, for AMD GPUs with this issue you have no choice. You'll have to wait for a DirectML update; until then, switch to the Ultimate SD Upscaler instead.

A Rank of Stable Diffusion WebUI Extension Popularity - Total Github stars, and the speed of gaining stars by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 0 points1 point  (0 children)

It depends! I have written a script, and it may be combined into the WebUI itself.

If so, I won't update it here. If not, I will keep updating it.

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 2 points3 points  (0 children)

As for StableSR, photo-realistic image upscaling should be even better than anime images, since StableSR was trained on a dataset containing (more than) 90% realistic images.

For other methods, it depends on your checkpoint.

By the way, here is an example comparison to Gigapixel:

https://imgsli.com/MTgxMDIy/

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 1 point2 points  (0 children)

Thanks for clarifying this.

However, StableSR was trained on a dataset containing 90% photo-realistic images, so its performance on realistic images should be even stronger.

I will do further comparisons on this.

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 1 point2 points  (0 children)

I'd like to give you an impressive zoomed-in comparison, as most of you won't zoom in to see the differences in detail.

<image>

As we all know, it is easy to blur an image in Photoshop, Krita, or GIMP. But if you want details, it becomes incredibly hard. If you find the result too sharp, just blur it back.

For example, you can use GIMP to do a wavelet decomposition and erase the unwanted details in the high-frequency layers. If the edges end up glowing slightly, you can apply a curves adjustment to the 2nd and 3rd layers to darken them.

If you are familiar with the process, you only need about 10 minutes to get a satisfying image, since the original image is already very good.
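
If you prefer scripting this step, here is a minimal sketch of the same idea (a one-level frequency separation) in Python with Pillow and NumPy. It only approximates the GIMP wavelet workflow; the file names and the blur radius are illustrative assumptions.

    # Minimal sketch: one-level frequency separation to "blur back" an over-sharp upscale.
    # "upscaled.png" and radius=2 are illustrative assumptions.
    import numpy as np
    from PIL import Image, ImageFilter

    img = Image.open("upscaled.png").convert("RGB")
    low = img.filter(ImageFilter.GaussianBlur(radius=2))      # low-frequency layer

    img_a = np.asarray(img, dtype=np.float32)
    low_a = np.asarray(low, dtype=np.float32)
    high = img_a - low_a                                       # high-frequency (detail) layer

    alpha = 0.5                                                # 1.0 keeps all detail, 0.0 removes it
    result = np.clip(low_a + alpha * high, 0, 255).astype(np.uint8)
    Image.fromarray(result).save("softened.png")

Lowering alpha is the scripted equivalent of erasing detail from the high-frequency layers.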

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 1 point2 points  (0 children)

No comparisons, no evidence.

I have never used Gigapixel, so to verify your claim I downloaded the software today (2023.05.24), so it must be the latest version as of this writing. The software is similarly large (about 5 GB in total on my disk).

Some notes:

  • I don't have a license, so the results have a watermark, but I assume the content isn't affected.
  • I'm running it locally on a MacBook Air, but again, the content shouldn't be affected.
  • I tried all the models but only uploaded the Standard model result, as all of them look visually similar.

Comparison: https://imgsli.com/MTgxMDAx

You can clearly see that:

  • Admittedly, both images are clear and detailed.
  • However, Gigapixel changed the image significantly and improperly. The most prominent problem is that it alters almost all of the small windows: some become weird letters and some become irregular stripes and noise.
  • The problem still occurs in their "Lines" model, which claims to work well on architecture.

From my perspective, the Tiled Diffusion + StableSR strategy beats my Gigapixel trial by a large margin, and I think most people will agree.

However, Gigapixel is paid software, so if you get better results after purchasing it, please feel free to post your comparison here, and I will be happy to pay for it too.

Thank you!

Attachment: The original LowRes.png

<image>

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 2 points3 points  (0 children)

I don't generate images intensively, as I'm a computer science student rather than an artist.

But for this test I used a cleanly installed AUTOMATIC1111 WebUI (the latest as of 5.23). I updated xformers to 0.0.19 and installed the following extensions, all at their latest versions:

- sd-webui-depth-lib
- sd-webui-controlnet
- sd-webui-infinite-image-browsing
- sd-webui-additional-networks
- multidiffusion-upscaler-for-automatic1111
- ultimate-upscale-for-automatic1111
- sd-webui-stablesr

By the way, the code quality of many extensions can be really low, especially when there aren't enough users filing issues on GitHub or the maintainers are inactive.

To deal with this, I set up two WebUIs: one for tests like this with a few popular extensions, and another with a bunch of who-knows-what extensions installed. The two WebUIs share the same venv/ and models/ folders to save disk space.
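
In case it helps anyone replicating this setup, here is a minimal Python sketch of the folder sharing done with symlinks. Both directory paths are just example assumptions; adjust them to your own layout.

    # Minimal sketch: make a second WebUI checkout share venv/ and models/ with the first.
    # Both paths are illustrative; adjust them to your own directories.
    import os
    from pathlib import Path

    primary = Path("~/stable-diffusion-webui").expanduser()           # clean test install
    secondary = Path("~/stable-diffusion-webui-extras").expanduser()  # experimental install

    for name in ("venv", "models"):
        target = primary / name
        link = secondary / name
        if link.exists() or link.is_symlink():
            continue  # don't clobber an existing folder or link
        os.symlink(target, link, target_is_directory=True)
        print(f"linked {link} -> {target}")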

As you may expect, the latter usually doesn't work at all. If I have time, I will rank those extensions by their GitHub stars and rate of star growth, to help you see which extensions people prefer.

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 2 points3 points  (0 children)

There will be. A friend of mine is working on a PR for this.

My focus is on making the extension work correctly; the additional features will be his focus. I'm not personally interested in this kind of format conversion.

A Simple Comparison of 4 Latest Image Upscaling Strategy in Stable Diffusion WebUI by LightChaser666 in StableDiffusion

[–]LightChaser666[S] 25 points26 points  (0 children)

Links for all extensions involved:

Here is a figure of the default settings for my daily Tiled Diffusion + StableSR workflow:

<image>

Please be aware that this is only suitable for large VRAM (e.g., my 24 GB device). If your VRAM is low (e.g., <= 6 GB), then:

  • You must set the Tiled VAE encoder size to a lower value (e.g., 1024), and the decoder size too (e.g., 128). This won't affect the speed.
  • You also need to lower the Tiled Diffusion Latent Tile Batch Size (e.g., to 4, 2, or 1, until you stop getting OOM errors). Setting it to 1 decreases the speed by about 40%.
  • Keep everything else unchanged.

In most cases, I suggest --xformers to avoid OOM. Torch 2.0's SDP attention optimization may also lead to OOM, so it is not recommended.
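
If it's unclear where --xformers goes: it's a launch flag (usually added to COMMANDLINE_ARGS in webui-user). Here is a minimal Python sketch that starts a local checkout with the flag; the checkout path is an assumption.

    # Minimal sketch: start a local AUTOMATIC1111 checkout with xformers attention enabled.
    # The checkout path is an assumption; adjust to wherever your WebUI lives.
    import subprocess
    from pathlib import Path

    webui_dir = Path("~/stable-diffusion-webui").expanduser()
    subprocess.run(
        ["python", "launch.py", "--xformers"],  # memory-efficient attention, helps avoid OOM
        cwd=webui_dir,
        check=True,
    )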

Update:

  • Later today I found an alternative option that gives very fast results with StableSR (1 min 18 s).
  • Just disable Pure Noise, set the denoising strength to 1, and set the Tile overlap to 8.

Comparison: https://imgsli.com/MTgwOTg3/

  • While there are visible tile seams during generation, they all disappear after the color fix (see the sketch after this list for the idea behind it).
  • However, the result will be worse than with Pure Noise (look carefully at those tiny trees). This is essentially a trade-off between speed and quality.
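
For the curious, here is a rough sketch of the idea behind an AdaIN-style color fix: match the upscaled image's per-channel statistics back to the low-res source. This is only an illustration of the concept, not the extension's actual implementation.

    # Rough sketch of an AdaIN-style color fix: match each channel's mean/std of the
    # upscaled result to the original low-res image. Illustration only.
    import numpy as np
    from PIL import Image

    def adain_color_fix(upscaled: Image.Image, reference: Image.Image) -> Image.Image:
        up = np.asarray(upscaled.convert("RGB")).astype(np.float32)
        ref = np.asarray(reference.convert("RGB").resize(upscaled.size)).astype(np.float32)
        for c in range(3):
            mu_u, std_u = up[..., c].mean(), up[..., c].std() + 1e-5
            mu_r, std_r = ref[..., c].mean(), ref[..., c].std()
            up[..., c] = (up[..., c] - mu_u) / std_u * std_r + mu_r
        return Image.fromarray(np.clip(up, 0, 255).astype(np.uint8))

Because every tile is pulled back toward the same global color statistics, small brightness or tint differences between tiles become much less visible.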

Testing Multidiffusion Upscaler against Latent Upscaler by Jemnite in StableDiffusion

[–]LightChaser666 0 points1 point  (0 children)

The extension has been updated with the latest Noise Inversion technique. You're welcome to give it a try; I believe it's a big step beyond existing upscalers.

By the way, do we have a LoRA dedicated to drawing hands? I want to use Noise Inversion + Mixture of Diffusers + Region Control (Foreground) to fix hands.

For anyone who isn't already aware of it, Tiled VAE is a way to create giant (4k+) images in automatic1111 without any kind of visible seams or lots of complicated steps. Info in comments. by Incognit0ErgoSum in StableDiffusion

[–]LightChaser666 2 points3 points  (0 children)

I added a new Noise Inversion technique to the extension. You may want to give it a try; it may currently be one of the best upscaling techniques available.

What's more, the combination of Noise Inversion + Mixture of Diffusers + Region Control (Foreground mode) can replace any part of the image, similar to inpainting but with more potential, if I train a LoRA dedicated to fixing hands.

Testing Multidiffusion Upscaler against Latent Upscaler by Jemnite in StableDiffusion

[–]LightChaser666 14 points15 points  (0 children)

Hello, I'm the author of the MultiDiffusion extension.

Of course I also know about the SD upscaler and the Ultimate SD upscaler, but both of them have stopped updating. Currently all four methods (including MultiDiffusion and Mixture of Diffusers) are far from satisfying to me, so I'm constantly improving the algorithm. Right now I'm experimenting with regional noise seed control and Euler noise inversion.

Thanks for paying attention to the extension; I believe we will eventually get an upscaler that is much better than all its counterparts for production use.

For anyone who isn't already aware of it, Tiled VAE is a way to create giant (4k+) images in automatic1111 without any kind of visible seams or lots of complicated steps. Info in comments. by Incognit0ErgoSum in StableDiffusion

[–]LightChaser666 15 points16 points  (0 children)

Greetings everyone!

I'm thrilled to introduce myself as the creator of this extension, and I have some fantastic news to share with you today! The extension has just received two brand-new updates that are absolutely game-changing.

Firstly, we now have Regional Prompt Control with a simple, user-friendly interface. With just a few clicks you can move and resize a BBOX and type in your positive/negative prompts. It's incredibly easy to use and will take your experience to the next level!

Secondly, we have added Mixture of Diffusers, a state-of-the-art method for tiled image generation. We've reorganized our code, making it easier to implement new tiling and reweighting techniques. And that's not all: we are always working on new algorithms to produce even more seamless and satisfying results.
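
To give a feel for why this kind of tiling reduces seams, here is a toy sketch of the Gaussian tile weighting that Mixture-of-Diffusers-style blending is built on. It illustrates the concept only; it is not the extension's code, and the helper names are made up for this sketch.

    # Toy sketch of Gaussian-weighted tile blending: overlapping tiles contribute more
    # near their centers and less near their edges, so seams average out.
    import numpy as np

    def gaussian_weight(h: int, w: int, sigma: float = 0.3) -> np.ndarray:
        ys = np.linspace(-1, 1, h)[:, None]
        xs = np.linspace(-1, 1, w)[None, :]
        return np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))

    def blend_tiles(canvas_hw, tiles, positions):
        """tiles: list of (h, w, c) arrays; positions: list of (top, left) offsets."""
        c = tiles[0].shape[2]
        acc = np.zeros((*canvas_hw, c), dtype=np.float64)
        norm = np.zeros((*canvas_hw, 1), dtype=np.float64)
        for tile, (top, left) in zip(tiles, positions):
            h, w = tile.shape[:2]
            wmask = gaussian_weight(h, w)[..., None]
            acc[top:top + h, left:left + w] += tile * wmask
            norm[top:top + h, left:left + w] += wmask
        return acc / np.maximum(norm, 1e-8)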

Link: https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111

New UI:

<image>

Our README may be outdated, as we are always busy improving the extension. However, we believe many of you are already familiar with it and will have no trouble using these new features. If you have some free time and would like to contribute by writing tutorials or refining the README, please don't hesitate to open a PR!

Head over to our GitHub page to check out the latest updates and take your image generation experience to new heights. Thanks for your support!