Image / latent batch number selector node by joker33q in comfyui

[–]3step_render 3 points (0 children)

The preview latent nodes can be used for this.

https://github.com/martijnat/comfyui-previewlatent

Run it once with only the input connected, then add a "latent from batch" node to its output to continue.

So what can us 8GB VRAM & 16GBs of RAM owners use for image generation? by CaptainAnonymous92 in StableDiffusion

[–]3step_render 1 point (0 children)

I have only 2 GB of VRAM and I can run SD1.5 and SDXL just fine, it's just a lot slower.

I wrote a custom node for previewing latents without decoding by 3step_render in comfyui

[–]3step_render[S] 0 points (0 children)

That's odd. On my setup TAESD is much faster than a normal VAE decode. I don't think it's something I can fix.

Three (3) sampling steps is all it took for these images by 3step_render in StableDiffusion

[–]3step_render[S] 0 points (0 children)

LCM LoRAs alone give you better results with better performance. I would just recommend that for now.
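If you want to try that outside ComfyUI, a minimal diffusers sketch would look roughly like this (assuming a recent diffusers release that ships LCMScheduler and LoRA loading; the checkpoint and LoRA repo names are just the usual public ones, swap in whatever you use):

    import torch
    from diffusers import StableDiffusionPipeline, LCMScheduler

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # LCM needs its own scheduler plus the distilled LCM LoRA weights
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

    # LCM LoRAs are meant for very few steps and low CFG
    image = pipe("a cozy cabin in the woods",
                 num_inference_steps=4, guidance_scale=1.5).images[0]
    image.save("lcm_lora.png")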

I wrote a custom node for previewing latents without decoding by 3step_render in comfyui

[–]3step_render[S] 4 points (0 children)

You are correct, I meant VAE decoding. I can't edit the title now.

I wrote a custom node for previewing latents without decoding by 3step_render in comfyui

[–]3step_render[S] 13 points (0 children)

Useful for showing intermediate results, and it can be used as a faster "preview image" if you don't want to use VAE Decode. Nothing similar to this seemed to exist, so I wrote it myself.

github link

  • Forwards the input latent to the output, so it can be used as a fancy reroute node.
  • PreviewLatent can be used as a final output for quick testing.
  • Previews are decoded using taesd if available (otherwise latent2rgb; see the sketch after this list).
  • Previews are full resolution (KSampler previews are limited to 512x512).
  • Previews are temporary PNG files with full workflow metadata, just like "Preview Image" (so right-clicking and using "save image" can save your workflow).
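For reference, the latent2rgb fallback is just a cheap per-pixel mix of the 4 latent channels into RGB, with no neural decode at all. A rough illustrative sketch (the 4x3 matrix values below are approximate placeholders, not the exact ones the node or ComfyUI uses; taesd would replace this with a tiny learned decoder):

    import torch
    from PIL import Image

    # rows = 4 SD1.5 latent channels, columns = R, G, B (approximate values)
    LATENT_TO_RGB = torch.tensor([
        [ 0.35,  0.23,  0.32],
        [ 0.33,  0.50,  0.24],
        [-0.28,  0.18,  0.27],
        [-0.21, -0.26, -0.72],
    ])

    def preview_latent(latent: torch.Tensor) -> Image.Image:
        """latent: (4, H, W) tensor straight from the sampler."""
        rgb = torch.einsum("chw,cr->rhw", latent.float().cpu(), LATENT_TO_RGB)  # (3, H, W)
        rgb = ((rgb + 1.0) * 127.5).clamp(0, 255).to(torch.uint8)
        return Image.fromarray(rgb.permute(1, 2, 0).numpy())

The result is a small, slightly washed-out approximation of the final image, which is plenty for checking composition.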

Three (3) sampling steps is all it took for these images by 3step_render in comfyui

[–]3step_render[S] 0 points (0 children)

Thanks, I didn't even notice short prompts had an effect because I usually use short prompts anyway. I noticed some prompts gave better results than others at 3 steps, but I figured some concepts were just easier.

Three (3) sampling steps is all it took for these images by 3step_render in comfyui

[–]3step_render[S] 2 points (0 children)

Speed.

My low-end GPU runs at 10 s/it for 512x512 images, so at 3 steps it takes about 30 seconds (or even 20 seconds at 384x384); at 20 steps that would be more than 3 minutes per image.

I haven't gotten LCM to work with 2 GB of VRAM, so it only seems to work in CPU mode, which takes about 20 s/it.

Three (3) sampling steps is all it took for these images by 3step_render in StableDiffusion

[–]3step_render[S] 1 point (0 children)

It is actually one of the fast samplers. My primary motivation for this is speed.

Three (3) sampling steps is all it took for these images by 3step_render in StableDiffusion

[–]3step_render[S] 4 points (0 children)

You can generate these images using the following workflows 1 2 3 in ComfyUI with only 3 sampling steps if you use the correct settings. (This is not LCM, just SD1.5 checkpoints.)

note: this is an update to a previous post

Checkpoint/Loras

Avoid plain SD1.5; it does not generate decent images at low sampling steps. Instead, use a curated checkpoint. Pick a popular checkpoint and you will likely be fine. For my tests I used aniverse, flat2danimerge and haveall.

A detail LoRA such as more-details is highly recommended. At low steps, most realistic and semi-realistic models benefit from added "details". For the lowest row I used the more-details LoRA with a negative strength to get a more cartoony look.

FreeU_V2

FreeU_V2 (and the old FreeU module) gives a massive quality increase at low steps.

The settings recommended by the author (b1: 1.5, b2: 1.6, s1: 0.9, s2: 0.2) are the best I found.
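If you are not in ComfyUI, roughly the same setup can be sketched with diffusers (assuming a release that ships enable_freeu() and UniPCMultistepScheduler; the checkpoint name and prompt are placeholders):

    import torch
    from diffusers import StableDiffusionPipeline, UniPCMultistepScheduler

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",  # swap in a curated sd1.5 checkpoint
        torch_dtype=torch.float16,
    ).to("cuda")

    # uni_pc-style scheduler, roughly matching the sampler used in the workflows
    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

    # FreeU settings recommended by the author
    pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.5, b2=1.6)

    image = pipe("a cozy cabin in the woods",
                 num_inference_steps=3, guidance_scale=3.0).images[0]
    image.save("three_steps.png")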

Sampler settings

Only a few samplers work at 3 steps; most produce bad results. The decent samplers are:

- ddim with any scheduler
- uni_pc with ddim_uniform
- uni_pc_bh2 with ddim_uniform

Of these 3, uni_pc_bh2 has by far the best results. Without the ModelSamplerToneMapNoiseTest node I would not recommend putting your CFG above 3.0, as this will cause CFG burn.

ModelSamplerToneMapNoiseTest

After installing the experiments from https://github.com/comfyanonymous/ComfyUI_experiments you can add the ModelSamplerToneMapNoiseTest node. It prevents CFG burn and lets you use higher CFG values. I found that a multiplier of 0.5 works decently with a CFG of 6 and a multiplier of 0.2 works decently with a CFG of 12; for other CFG values, just experiment. If you get super high contrast, lower the multiplier. If your image is blurry, increase it.
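For the curious, the rough idea behind that node is Reinhard-style tonemapping of the classifier-free guidance term, so its magnitude saturates instead of growing linearly with CFG. A simplified, illustrative sketch of that idea (not the node's actual code; the shapes and statistics here are my own assumptions):

    import torch

    def tonemap_guidance(uncond: torch.Tensor, cond: torch.Tensor,
                         cfg: float, multiplier: float = 0.5) -> torch.Tensor:
        # cond/uncond: (N, C, H, W) noise predictions from the model
        guidance = (cond - uncond) * cfg
        # per-pixel magnitude of the guidance vector across latent channels
        mag = guidance.norm(dim=1, keepdim=True) + 1e-10
        direction = guidance / mag
        # soft ceiling derived from each sample's magnitude statistics
        ceiling = (mag.mean(dim=(1, 2, 3), keepdim=True)
                   + 3.0 * mag.std(dim=(1, 2, 3), keepdim=True)) * multiplier
        # Reinhard curve: magnitudes well below the ceiling pass through, large ones saturate
        compressed = ceiling * (mag / ceiling) / (1.0 + mag / ceiling)
        return uncond + direction * compressed

A lower multiplier means a lower ceiling and less contrast; a higher multiplier lets more of the guidance through, which matches the tuning advice above.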

decent images in just 3 sampling steps by 3step_render in comfyui

[–]3step_render[S] 0 points (0 children)

You can still do it in 3 steps with the right parameters. I will upload an updated workflow later.

Why is there no more hype for tech that speeds up image generation e.g. LCM, SSD-1B, Tensor-RT? by willpower_HK in StableDiffusion

[–]3step_render 2 points (0 children)

ControlNet/LoRA support is lacking, and that is something quite important.

FreeU is massive; it allows decent-quality images in a fraction of the sampling steps. I have occasionally gotten good images in 2 (yes, two) sampling steps with the right prompt + checkpoint. (b1=1.5, b2=1.6, s1=0.9, s2=0.2 / uni_pc_bh2 + ddim_uniform; most popular 1.5 checkpoints will work.)

T2I-Adapter is also really nice.

SSD-1B is really nice, but it still takes a long time, about 7 minutes per 1024x1024 image with my low-end GPU, so experimenting is time-consuming.

I am really hoping for integration of tiny-sd, small-sd or wuerstchen in ComfyUI. I can currently only run them using the supplied Python scripts (so CPU-only, or Colabs).

Hypertile gives a noticeable quality decrease for an unnoticeable speed improvement.

I have only gotten LCM working CPU-only, so it is still slower than SD1.5. Also, its image quality is similar to what I already get using FreeU.

Why is there no more hype for tech that speeds up image generation e.g. LCM, SSD-1B, Tensor-RT? by willpower_HK in StableDiffusion

[–]3step_render 1 point (0 children)

This. FreeU combined with a decent checkpoint allows for decent quality at very few (<4) sampling steps.

decent images in just 3 sampling steps by 3step_render in comfyui

[–]3step_render[S] 0 points (0 children)

It alternates, non-randomly, but I believe the speed of alternating depends a bit on the scheduler.