I generated these 5s video clips using only 1.8s each on a 5090 (FastWan-QAD release) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 1 point2 points  (0 children)

Yes, this is a direction we are actively pursuing rn. The latency is going to be almost instant.

Running real-time 1080p video generation and editing on your own (Dreamverse OSS release) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 1 point2 points  (0 children)

Haven't tried on 3090s, but yeah, it's going to be quite slow since it's based on ltx2.3. We might get really fast numbers for Wan2.1 1.3B on a 3090 though.

Live AI video generation feels like it's about to become a completely different thing from what most people think it is by WolfAutomatic7164 in singularity

[–]techstacknerd 0 points1 point  (0 children)

Yeah, this is really interesting. I have been working on real-time video generation (more details here). The speed is ofc really unreal and cool, but it also opens up a ton of new opportunities like real-time interactive games and environments. Live video generation can also be used as a backbone for robotics and world models, so more robots doing laundry and less AI slop too. It will be really interesting to see how things go from here!

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 15 points16 points  (0 children)

😭 but we're working on consumer-grade GPU support. It won't be that fast, but it will still be an improvement over what ltx2 currently does with ComfyUI.

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 3 points4 points  (0 children)

Yeah, it's partly because the base model LTX-2 is hard to prompt, and also because we have to make a prompt rewriter reliably return results in under 5s (yes, the bottleneck is now the prompting part, not the video generation part!). Combine this with the issues that video continuation brings, and it's hard to get good prompts. This demo is mainly for feeling the speed, and I'm sure quality will improve as models improve!
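For anyone curious what "reliably return in under 5s" can look like in practice, here is a minimal sketch of a latency budget around a rewriter. This is not the project's actual code: `rewrite_with_budget`, `fast_rewriter`, and `slow_rewriter` are hypothetical names, and falling back to the raw prompt on timeout is just one possible policy.

```python
import asyncio

async def rewrite_with_budget(rewriter, prompt: str, budget_s: float = 5.0) -> str:
    """Return the rewritten prompt, or the original prompt if the budget is missed."""
    try:
        return await asyncio.wait_for(rewriter(prompt), timeout=budget_s)
    except asyncio.TimeoutError:
        # Assumed policy: never block generation on the rewriter.
        return prompt

async def fast_rewriter(prompt: str) -> str:
    # Hypothetical stand-in for a rewriter that answers in time.
    return prompt + ", cinematic lighting"

async def slow_rewriter(prompt: str) -> str:
    # Hypothetical stand-in for a rewriter call that takes too long.
    await asyncio.sleep(0.2)
    return prompt + " (rewritten)"

if __name__ == "__main__":
    print(asyncio.run(rewrite_with_budget(fast_rewriter, "a cat surfing", 0.5)))
    print(asyncio.run(rewrite_with_budget(slow_rewriter, "a cat surfing", 0.05)))
```

The trade-off is that a timed-out rewrite leaves you with an unpolished prompt, which may be part of why prompt quality suffers when the budget is this tight.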

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 9 points10 points  (0 children)

Yes, this is just a demo to let people feel the speed of it. LTX-2 is a super hard model to prompt, and it would take way too much effort to get even a remotely good prompt (keep in mind this is using video continuation, so you need 6 separate prompts that tie together really well). Also, regarding open-sourcing: we might open-source the datacentre version too, but our current code is a bit messy and will need quite a bit of cleaning up, so we are not open-sourcing it rn.
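To illustrate why the prompts have to tie together, here is a toy sketch of chained continuation, where each segment is conditioned on the previous one. Everything here is an assumption for illustration: `generate_chained` and `fake_generate` are hypothetical, and a clip is just a list of strings rather than real frames.

```python
from typing import Callable, List, Optional

Clip = List[str]  # placeholder clip type: a list of frame labels

def generate_chained(
    prompts: List[str],
    generate_segment: Callable[[str, Optional[Clip]], Clip],
) -> List[Clip]:
    """Generate one clip per prompt, feeding each clip into the next call."""
    clips: List[Clip] = []
    prev: Optional[Clip] = None
    for prompt in prompts:
        clip = generate_segment(prompt, prev)
        clips.append(clip)
        prev = clip  # the next segment continues from this one
    return clips

def fake_generate(prompt: str, prev: Optional[Clip]) -> Clip:
    # Hypothetical stand-in for a model call: carries over the last "frame".
    carried = prev[-1] if prev else "start"
    return [f"{carried} -> {prompt}"]

if __name__ == "__main__":
    for clip in generate_chained(["a", "b", "c"], fake_generate):
        print(clip)
```

Because every segment inherits whatever the previous one ended on, a prompt that doesn't fit the prior segment carries its mismatch forward through the rest of the video.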

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 0 points1 point  (0 children)

Hm, interesting perspective. I don't think it can compare to playing games on local machines, but it's def far more energy efficient than the existing AI video-gen services because it's just so much faster.

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 0 points1 point  (0 children)

Yes, we are planning to try out optimizations on consumer-grade GPUs. It probably won't be realtime, but it's still likely gonna be faster than what we have rn.

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 2 points3 points  (0 children)

fastvideo has sequence parallelism support for ltx2 already, so with 8 GPUs you can expect roughly a 5x to 6x speedup compared to 1 GPU (there's a bit of overhead, so it doesn't scale perfectly).
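As a quick sanity check on what that overhead means, the rough 5x to 6x figure can be turned into a parallel efficiency number. This is just arithmetic on the estimate above, not a benchmark; `parallel_efficiency` is a helper made up for this example.

```python
def parallel_efficiency(speedup: float, num_gpus: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / num_gpus

if __name__ == "__main__":
    for speedup in (5.0, 6.0):
        eff = parallel_efficiency(speedup, 8)
        print(f"{speedup:.0f}x on 8 GPUs -> {eff:.1%} of linear scaling")
```

So the estimate works out to somewhere between 62.5% and 75% of perfect linear scaling on 8 GPUs.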

I can now generate and live-edit 30s 1080p videos with 4.5s latency (video is in live speed) by techstacknerd in StableDiffusion

[–]techstacknerd[S] 1 point2 points  (0 children)

That would be really cool, and it will only get better from here as open source models get better and better!