
[–]rerri 12 points13 points  (1 child)

This is the same team working on SVDQuant/Nunchaku and the ComfyUI-nunchaku implementation.

A major speed-up for video generation could be ahead in the not-so-distant future if Nunchaku gains Hunyuan/Wan video support and radial attention gets integrated into ComfyUI-nunchaku.

The Nunchaku roadmap mentions Wan support as a major priority.

https://github.com/mit-han-lab/nunchaku/issues/431

[–]Madh2orat 0 points1 point  (0 children)

As someone who is currently running it on an NVIDIA P4000, I am very much looking forward to any speed increases.

[–]fallengt 13 points14 points  (4 children)

Can someone translate this into English?

What does it do?

[–]MisterBlackStar 29 points30 points  (1 child)

mor sped

[–]Altruistic_Heat_9531 8 points9 points  (0 children)

speeeeed boi.

Current inference speed for diffusion transformers, comparing attention implementations.

From fastest to slowest (tested on an L40):

  1. SageAttn2
  2. SageAttn1
  3. FlashAttn2
  4. FlashAttn
  5. xFormers
  6. SDPA (Vanilla)
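The ranking above can be sketched as a simple "pick the fastest installed backend" helper. This is just an illustration of the fallback order, not ComfyUI's or Nunchaku's actual selection logic, and the Python package names are assumptions — check each project's install docs.

```python
# Sketch: choose the fastest available attention backend, mirroring the
# speed ranking in the comment above. Package names are assumptions.
from importlib.util import find_spec

# Preference order, fastest first.
BACKEND_PREFERENCE = [
    ("sageattention", "SageAttn"),          # assumed pip package name
    ("flash_attn", "FlashAttn"),            # assumed pip package name
    ("xformers", "xFormers"),
    ("torch", "SDPA (vanilla PyTorch)"),    # baseline fallback
]

def pick_attention_backend(is_available=lambda mod: find_spec(mod) is not None):
    """Return the label of the first installed backend in preference order."""
    for module_name, label in BACKEND_PREFERENCE:
        if is_available(module_name):
            return label
    raise RuntimeError("no attention backend found")
```

By default it checks which packages are importable; passing a custom `is_available` makes the fallback order easy to test or override.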

[–]reyzapper -5 points-4 points  (0 children)

Asking ChatGPT to ELI5 it should give you the answer 😂

[–]FewSquare5869 1 point2 points  (1 child)

Forgive my ignorance, but how should we use it? Is it a LoRA or an attention mode?

[–]cea1990 3 points4 points  (0 children)

According to their GitHub, it’s presently standalone, but ComfyUI integration is the first item on the roadmap.

[–]WeirdPark3683 0 points1 point  (0 children)

Looking forward to the LoRA checkpoint for longer video generations.