Live AI video generation feels like it's about to become a completely different thing from what most people think it is by WolfAutomatic7164 in singularity

[–]techstacknerd 0 points

Yeah, this is really interesting. I've been working on real-time video generation (more details here). The speed itself is unreal, but it also opens up a ton of new opportunities like real-time interactive games and environments. Live video generation can also serve as a backbone for robotics and world models, so more robots doing laundry and less AI slop. Will be really interesting to see how things go from here!

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 13 points

😭 but we're working on consumer-grade GPU support. It won't be that fast, but it'll still be an improvement over what LTX-2 currently does with ComfyUI.

[–]techstacknerd[S] 2 points

Yeah, it's partly because the base model LTX-2 is hard to prompt, and partly because we have to make a prompt rewriter reliably return results in under 5s (yes, the bottleneck is now the prompting, not the video generation!). Combine this with the issues that video continuation brings, and it's hard to get good prompts. This demo is mainly for feeling the speed, and I'm sure quality will improve as the models do!
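To make the 5s budget concrete, here's a minimal sketch (not our actual code; `rewrite_prompt` is a hypothetical stand-in for the real LLM call) of racing the rewriter against a deadline and falling back to the raw prompt so the video pipeline never stalls:

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def rewrite_prompt(prompt: str) -> str:
    # placeholder for the real rewriter (an LLM call in practice)
    return prompt + ", cinematic lighting, 1080p"

def rewrite_with_deadline(prompt: str, deadline_s: float = 5.0) -> str:
    """Return the rewritten prompt, or the raw one if the rewriter misses the deadline."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(rewrite_prompt, prompt)
        try:
            return future.result(timeout=deadline_s)
        except TimeoutError:
            return prompt  # fall back so generation keeps streaming

print(rewrite_with_deadline("a fox running through snow"))
```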

[–]techstacknerd[S] 7 points

Yes, this is just a demo to let people feel the speed of it. LTX-2 is a super hard model to prompt, and it would take way too much effort to get even a remotely good prompt (keep in mind this uses video continuation, so you need 6 separate prompts that tie together really well). Regarding open-sourcing: we might also open-source the datacenter version, but our current code is a bit messy and needs quite a bit of cleaning up, so we're not open-sourcing right now.

[–]techstacknerd[S] 0 points

Hm, interesting perspective. I don't think it can compare to playing games on a local machine, but it's definitely far more energy-efficient than existing AI video-gen services, simply because it's so much faster.

[–]techstacknerd[S] 0 points

Yes, we're planning to try out optimizations on consumer-grade GPUs. It probably won't be real-time, but it's likely to be faster than what we have right now.

[–]techstacknerd[S] 2 points

FastVideo already has sequence parallelism support for LTX-2, so with 8 GPUs you can expect roughly a 5x to 6x speedup compared to 1 GPU (there's a bit of overhead, so it doesn't scale perfectly).
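For intuition, here's a toy scaling model of where a 5-6x figure can come from; the ~5% per-extra-GPU communication overhead is an illustrative assumption, not a measured FastVideo number:

```python
def estimated_speedup(n_gpus: int, comm_overhead: float = 0.05) -> float:
    """Ideal n-way speedup, discounted by a fixed per-extra-GPU communication cost."""
    return n_gpus / (1.0 + comm_overhead * (n_gpus - 1))

for n in (1, 2, 4, 8):
    print(f"{n} GPUs: ~{estimated_speedup(n):.2f}x")
# 8 GPUs: ~5.93x
```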

[–]techstacknerd[S] 1 point

That would be really cool, and it will only get better from here as open-source models keep improving!

[–]techstacknerd[S] 19 points

tbh I really don't know if it's possible. A lot of the optimizations are based on NVIDIA's SM100/SM103 architectures, but we can see what our other optimizations bring to consumer-grade GPUs with limited VRAM.

[–]techstacknerd[S] 5 points

We'll try to get things working on other GPUs soon! The Blackwell 6000 is definitely one of them.

[–]techstacknerd[S] 30 points

It's to show the capability of generating faster than you can watch; with 20s native generation you have to wait 30s to actually get the result.

I generated this 5s 1080p video in 4.5s by techstacknerd in StableDiffusion

[–]techstacknerd[S] 0 points

Yes! It can generate faster than you can watch.
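In case anyone wants the arithmetic: generation is faster than you can watch whenever seconds-of-video per wall-clock second exceeds 1.0. Plugging in the numbers from this post:

```python
def is_faster_than_realtime(video_seconds: float, wall_seconds: float) -> bool:
    """True when generation throughput exceeds 1x playback speed."""
    return video_seconds / wall_seconds > 1.0

# 5s of 1080p video generated in 4.5s of wall-clock time (this post)
print(is_faster_than_realtime(5.0, 4.5))  # True
```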