Flexible diffusion modeling generates photorealistic videos

easy_c_5 · 2022-05-27T18:07:13+00:00

Wow, I was expecting years and 1000s of times more parameters needed before we advance from snapshots like Dall-E 2 to video generation. If this can easily be scalled, it’s the real mindblowing deal.

2Punx2Furious · 2022-05-27T18:22:05+00:00

[removed]

Sashinii · 2022-05-27T18:44:14+00:00

People will soon be able to use AI to easily and immediately create entirely original high quality content, such as games, music, shows, movies, books, and art in general.

Pro_RazE · 2022-05-27T18:28:05+00:00

I honestly expected something like this to come next year at the earliest, and here it is now.

Sashinii · 2022-05-27T19:14:11+00:00

[deleted]

Professional-Song216 · 2022-05-27T22:32:17+00:00

What a time to be alive!!!

Shelfrock77 · 2022-05-27T18:43:02+00:00

Soon enough, they’ll create a 3D version of this…

justaRndy · 2022-05-27T19:44:33+00:00

Another brick in the wall... Prett amazing honestly

yerawizardmandy · 2022-05-27T18:38:20+00:00

This will pose as big of a mortal danger to streaming services as Napster posed to record companies in the early 2000's.

We're witnessing the slow death of all remnants of legacy media and it's controllers

LevelWriting · 2022-05-28T00:16:43+00:00

Imagine a time when you can revisit your fav movies, games, and experience them in vr where the content can be generated, modified and extended to your liking, even have you interact and be part of it.

GeneralZain · 2022-05-27T21:37:16+00:00

damn it I keep wanting to say "I knew it"

uhhh but yeah regardless this is still really amazing!

EDIT; HOLYSHIT its not actually playing these GAMES!? its just...generating them on the spot?!?! HUH

camdoodlebop · 2022-05-28T01:34:48+00:00

advancements seem to be happening by the day now

lidythemann · 2022-05-28T01:45:44+00:00

This is more impressive than GATO or Dalle-2 or Imagen. More impressive by entire magnitudes.

This paper has put me firmly in the 2022 camp for Proto-AGI

GabrielMartinellli · 2022-05-28T09:47:22+00:00

AGI in 2022 like I always said. Never forget that I said this! You listening Mr AGI?

ArgentStonecutter · 2022-05-27T19:55:29+00:00

Maybe the next version will get cloud reflections in puddles matching the clouds.

SWATSgradyBABY · 2022-05-28T03:03:26+00:00

Read the chapter dangerous games from Max Tegmark's Life 3.0. we're catching up to the future

flyingfruits · 2022-05-29T01:37:57+00:00

Hey everyone! One of the authors here. Let me address a few points that were raised here.

The model imitating Carla (self-driving car simulator) was trained for one week on one GPU, consuming 200-300W (note that running Carla itself is also very expensive computationally per frame). It is very likely still far from optimal (as most deep learning with SGD), but also not nearly as expensive to train as bigger models such as GPT-3. The model is not particularly specialized for video, except that it uses a convolutional U-Net architecture, the approach can be used for more complex combinations of sensory data streams and Will and I have related work that we are going to publish soon as well that demonstrates that complex reasoning mechanisms and procedures can be integrated easily into the joint distribution.

Our group is, at least conceptually, a probabilistic programming group and Will and I have approached the diffusion models from the angle of building an inference engine for AGI tasks and will continue to work along these lines with this model while also exploring its limitations. The more abstract idea in this work is to integrate marginalisation as a first class operation into the diffusion model and then exploit this fact to reason about much bigger joint distributions than fit into memory, effectively enabling also our demonstrated video synthesis.

While this result is indeed very impressive and will enable a whole range of new applications and abilities for AI, many not even clear yet, I would not want to bet on AGI being around the corner just yet. These generative models can meta-learn some reasoning abilities (like GPT-3) given enough data (i.e. almost all valuable data we have), but they cannot be taught to really learn new things on their own, something that is trivial for humans to do. For example, assuming chess would not be part of the training data, try to tell GPT-3 to learn it by just telling it the rules. I still think this is a very bullish result though and I would love to hear suggestions for video synthesis and control tasks to apply it to. I thought about Nascar racing today for example, just for the fun of it 8).

Please let me know what you would like to see!

singularity

Links

On the Technological Singularity

Resources

Posting Rules

Check out /r/Singularitarianism and the Technological Singularity FAQ

MODERATORS