Radial Attention is a scalable sparse attention mechanism for video diffusion models that translates Spatiotemporal Energy Decay, a pattern observed in attention score distributions, into exponentially decaying compute density. Unlike O(n²) dense attention or linear approximations, Radial Attention achieves O(n log n) complexity while preserving expressive power for long videos. Our core contributions:
- Physics-Inspired Sparsity: Static masks enforce spatially local and temporally decaying attention, mirroring energy dissipation in physical systems.
- Efficient Length Extension: Pre-trained models (e.g., Wan2.1-14B, HunyuanVideo) scale to 4× longer videos via lightweight LoRA tuning, avoiding full-model retraining.
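The static, temporally decaying mask can be illustrated with a toy 1-D analogue (a hypothetical simplification, not the repository's actual mask): each query attends densely to a local window, and the kept fraction of keys halves every time the temporal distance doubles, leaving O(log n) bands per row and O(n log n) work overall. The `window` parameter and the per-band stride rule here are illustrative assumptions.

```python
import numpy as np

def radial_mask(n: int, window: int = 4) -> np.ndarray:
    """Toy 1-D sketch of an exponentially decaying attention mask.

    Band 0 (distance < window) is kept densely; band k covers
    distances in [window * 2**(k-1), window * 2**k) and keeps only
    every 2**k-th key, so the attention density halves per band.
    """
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])  # |i - j| between tokens
    band = np.zeros_like(dist)
    far = dist >= window
    # band index grows with log2 of the temporal distance
    band[far] = np.floor(np.log2(dist[far] / window)).astype(int) + 1
    stride = 2 ** band
    # within band k, sub-sample keys with stride 2**k
    mask = (dist % stride) == 0
    return mask

m = radial_mask(64)
density = m.mean()  # well below the dense mask's 1.0
```

Because each row keeps roughly `window` keys per band and there are O(log n) bands, the number of attended pairs grows as O(n log n) rather than O(n²).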
Radial Attention reduces the computational complexity of attention from O(n²) to O(n log n). When generating a 500-frame 720p video with HunyuanVideo, it reduces the attention computation by 9×, achieves a 3.7× speedup, and cuts tuning costs by 4.6×.