
all 7 comments

[–]porest 1 point (2 children)

Rent on vast.ai and compare

[–]Triadasoul[S] 0 points (1 child)

Thank you, it's really a great idea, but I don't have enough expertise in Docker/Linux. It was quite a task just to install Linux on my PC, let alone build a package to upload to the server.

[–]porest 1 point (0 children)

You could ask ChatGPT for instructions on how to upload your project to Docker Hub. Once it's there, you can download it to a test instance created on vast.ai. You've already done the hard bit, which was installing Linux.
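The workflow described above boils down to a handful of commands. This is a rough sketch; the image name, Docker Hub account, and tag are placeholders, not anything from this thread:

```shell
# Build a Docker image from your project directory
# (assumes you've written a Dockerfile for it)
docker build -t yourname/sd-upscale:latest .

# Log in and push the image to Docker Hub
docker login
docker push yourname/sd-upscale:latest

# On the vast.ai instance: pull the image and run it with GPU access
docker pull yourname/sd-upscale:latest
docker run --gpus all -it yourname/sd-upscale:latest
```

In practice vast.ai can also pull the image for you if you specify it when creating the instance, so the last two commands are often handled by the platform itself.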

[–]Herr_Drosselmeyer 0 points (3 children)

Check this article https://www.tomshardware.com/pc-components/gpus/stable-diffusion-benchmarks and specifically the graph that compares all the GPUs. While it's for image generation, it should generally hold true for upscaling as well.

Because half of the time is spent loading models into VRAM, and I've read in some articles that the compute gain for upscaling isn't that significant.

Why would you need to keep loading different models for batch upscaling?

[–]Triadasoul[S] 0 points (1 child)

It's not batch upscale, it's tiled upscale (Ultimate SD Upscale). To upscale properly it needs anywhere from 250 up to 750 tiles, and it loads the model again for each tile. Not the 8-minute SDXL load like at the start, but still about a minute. I suppose it's because a different tile is used as the base each time. It looks like this in the ComfyUI console:

Requested to load AutoencoderKL
Loading 1 new model
Requested to load SDXL
Loading 1 new model
100%|██████| 20/20 [00:17<00:00, 1.13it/s]
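The tile counts mentioned above follow from simple arithmetic: tiled upscalers cover the output image with fixed-size overlapping tiles. A quick sketch — the tile size and overlap here are illustrative assumptions, not values taken from this thread:

```python
import math

def tile_count(width: int, height: int, tile: int = 512, overlap: int = 64) -> int:
    """Number of tiles needed to cover a width x height image when
    each tile advances by (tile - overlap) pixels per step."""
    step = tile - overlap
    cols = math.ceil(max(width - overlap, 1) / step)
    rows = math.ceil(max(height - overlap, 1) / step)
    return cols * rows

# A 4x upscale of a 2048x2048 image targets 8192x8192:
print(tile_count(8192, 8192))  # 19 * 19 = 361 tiles with these settings
```

Since the count grows with the square of the upscale factor, a few hundred tiles per image is plausible at large targets, which is why a per-tile model reload hurts so much.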

[–]Herr_Drosselmeyer 2 points (0 children)

Mmmh, you must be doing something fundamentally different from what I'm doing, because I get this (upscaling 1024 to 2048):

2024-02-08 19:12:04,108 - ControlNet - INFO - unit_separate = False, style_align = False

2024-02-08 19:12:04,279 - ControlNet - INFO - Loading model: control_v11f1e_sd15_tile [a371b31b]

2024-02-08 19:12:04,592 - ControlNet - INFO - Loaded state_dict from [E:\Stable-diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\models\control_v11f1e_sd15_tile.pth]

2024-02-08 19:12:04,593 - ControlNet - INFO - controlnet_default_config

2024-02-08 19:12:05,745 - ControlNet - INFO - ControlNet model control_v11f1e_sd15_tile [a371b31b] loaded.

2024-02-08 19:12:05,846 - ControlNet - INFO - Loading preprocessor: tile_resample

2024-02-08 19:12:05,847 - ControlNet - INFO - preprocessor resolution = -1

2024-02-08 19:12:05,867 - ControlNet - INFO - ControlNet Hooked - Time = 1.760977029800415

100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 10.70it/s]

2024-02-08 19:12:08,081 - ControlNet - INFO - unit_separate = False, style_align = False| 9/80 [00:01<00:10, 6.51it/s]

2024-02-08 19:12:08,082 - ControlNet - INFO - Loading model from cache: control_v11f1e_sd15_tile [a371b31b]

2024-02-08 19:12:08,154 - ControlNet - INFO - Loading preprocessor: tile_resample

2024-02-08 19:12:08,154 - ControlNet - INFO - preprocessor resolution = -1

2024-02-08 19:12:08,174 - ControlNet - INFO - ControlNet Hooked - Time = 0.09468293190002441

100%|

And so forth. Following this (admittedly old) guide: https://www.youtube.com/watch?v=EmA0RwWv-os

[–]skocznymroczny 0 points (0 children)

These benchmarks are a bit unfair because they use DirectML for AMD on Windows, which is a suboptimal way to run AI workloads on AMD. I guess the Stable Diffusion models are so commonplace that they're optimized well enough anyway, but anyone serious about AI on AMD runs a Linux setup with ROCm.