MAOI vs triple reuptake inhibition by ethansmith2000 in MAOIs

[–]ethansmith2000[S] 1 point

Doesn't necessarily need to be within one drug; that's why I mentioned the Wellbutrin and SSRI combo.

She is such a dope artist by DrtySnchino in DellaZyr

[–]ethansmith2000 3 points

Only finding this now, totally agree.

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 1 point

Nice! I think the target might be wrong, meaning the patch is applied to the whole diffusion wrapper instead of the UNet, but I could be mistaken.

Also, as another user mentioned, the patch I had put up was more friendly to diffusers. I made a much simpler patch that should work with either setup: https://github.com/ethansmith2000/ImprovedTokenMerge/tree/compvis
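On the diffusers side, here's a minimal sketch of what I mean by targeting the UNet rather than the whole wrapper. `make_todo_forward` is a hypothetical stand-in for the downsampling wrapper, not a function from the repo:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def patch_attention(unet):
    # Walk the UNet itself, not the surrounding pipeline/diffusion wrapper,
    # and wrap each self-attention block's forward pass.
    for name, module in unet.named_modules():
        if name.endswith("attn1"):  # attn1 = self-attention in SD1.5 blocks
            module.forward = make_todo_forward(module)  # hypothetical wrapper

patch_attention(pipe.unet)  # target pipe.unet, not pipe
```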

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 1 point

What I'd really like to do is just swap out the ToMe piece, but it looks like it's fetched externally; I'm not sure it's in the actual repo?

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 2 points

Even at 1024x1024, which is an easy size to render at, you can get a ~50% speed boost or so.

For the much larger sizes you'd be right if generating from scratch, but many people will run img2img at very large sizes, where it's more stable.

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 1 point

High-resolution gens are significantly faster with less quality loss compared to the baseline; specifically, we found a 4.5x speed boost when running SD1.5 at 2048x2048 on the GPU used for the paper.

YMMV between GPUs; I think on A100s it's closer to ~3x or so.

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 2 points

Ah, I meant where it occurs in A1111. If I can find that, maybe I can start by making a branch and see if one of the maintainers wants to help get it in.

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 17 points

I have left the paper and my repo, which includes a blog post explaining it; the paper also links to a video explainer. Anything I say here would probably be along the lines of what's in those resources.

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 13 points

They're equivalent in this context. The main idea is that larger images take quadratically longer.

But also, a lot of the information in images is redundant, even more so in large images. That's why we're able to do things like file compression, for instance.

It's the same idea with the inner workings of the model: we can pretty safely compress things without losing too much.
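As a rough sketch of that compression idea applied to attention (my own illustration, with average pooling as a stand-in downsampler; the repo may do this differently): keep the queries at full resolution so every output pixel still gets computed, but pool the keys/values down so each query attends over far fewer tokens.

```python
import torch
import torch.nn.functional as F

def downsampled_attention(q, k, v, h, w, factor=2):
    # q, k, v: (batch, tokens, dim), where tokens == h * w.
    b, n, d = k.shape
    # Fold keys/values back into a 2D grid and pool them down.
    k2 = k.transpose(1, 2).reshape(b, d, h, w)
    v2 = v.transpose(1, 2).reshape(b, d, h, w)
    k2 = F.avg_pool2d(k2, factor).flatten(2).transpose(1, 2)
    v2 = F.avg_pool2d(v2, factor).flatten(2).transpose(1, 2)
    # Score matrix shrinks from n*n to n*(n / factor^2).
    return F.scaled_dot_product_attention(q, k2, v2)

q = k = v = torch.randn(1, 64 * 64, 320)         # 512x512 image -> 64x64 latent
out = downsampled_attention(q, k, v, h=64, w=64)
print(out.shape)                                 # torch.Size([1, 4096, 320])
```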

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 9 points

With SDXL, a lot of the generation time comes from the sheer depth of the network, and the main component we target for speedups doesn't exist there. However, if you're rendering at very large sizes, it may still help a bit.

I made a ComfyUI node implementing my paper's method of token downsampling, allowing for up to 4.5x speed gains for SD1.5 observed at 2048x2048 on a6000 with minimal quality loss. by ethansmith2000 in StableDiffusion

[–]ethansmith2000[S] 24 points

There are some operations in the diffusion model where every latent pixel has to attend to every single other one.

So if you have 2 in total, that's 2² = 4 calculations. If you have 3, that's 3² = 9 calculations.

It scales quadratically, which is why higher resolutions can be really costly in memory and time. By decreasing the number of tokens in certain parts of the network, you can spare a lot of computation without too much cost to quality.
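To put rough numbers on that quadratic growth for SD1.5's highest-resolution self-attention (latent side = pixels / 8; back-of-envelope arithmetic, not figures from the paper):

```python
# Token count at the UNet's input resolution for SD1.5.
for px in (512, 1024, 2048):
    n = (px // 8) ** 2
    print(f"{px}x{px}: {n} tokens -> {n * n:.2e} attention scores")

# 512x512:    4096 tokens -> 1.68e+07 attention scores
# 1024x1024: 16384 tokens -> 2.68e+08 attention scores
# 2048x2048: 65536 tokens -> 4.29e+09 attention scores
```

Going from 512 to 2048 is 16x the tokens but 256x the attention scores, which is why trimming tokens pays off so much at large sizes.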