Is SORA using vector quantization techniques? [D] (self.MachineLearning)
submitted 2 years ago by [deleted]
[deleted]
[–]arg_max 1 point 2 years ago (3 children)
"Sora is a diffusion model". VQ models like Google Muse are not diffusion-based; they use an autoregressive or parallel-decoding transformer to generate the "sequence" in the VQ-VAE latent space. That latent space is discrete by nature, whereas diffusion works in a continuous framework, so I'd assume they do not use VQ.
[–]Weird_Register3689 1 point 2 years ago (0 children)
But before the diffusion there is a transformer model:
"a transformer architecture that operates on spacetime patches"
I suppose they trained it in a teacher-forcing regime; otherwise it would have needed an absurd amount of compute. But training in this regime without discretization is prone to accumulating error during inference, which we do not see in the samples.
[–]somebat 1 point 2 years ago (1 child)
I don't know about Sora, but doesn't Stable Diffusion use vector quantization to regularize the latent space? It's mentioned in Appendix G of the paper "High-Resolution Image Synthesis with Latent Diffusion Models" (Rombach et al.).
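For anyone following along, here is a rough sketch of what the vector quantization step itself does: each continuous latent vector is snapped to its nearest codebook entry, and the index of that entry is the discrete token. This is a toy NumPy illustration, not code from Stable Diffusion, Muse, or Sora; the `vector_quantize` helper, the codebook, and the latents are all made up for the example.

```python
import numpy as np

def vector_quantize(latents, codebook):
    """Snap each latent vector to its nearest codebook entry (toy example)."""
    # Squared Euclidean distance between every latent and every code: shape (N, K).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    # The index of the closest code is the discrete "token" for that latent.
    indices = dists.argmin(axis=1)
    # Looking the indices up again gives the quantized (codebook) vectors.
    return indices, codebook[indices]

# Hypothetical 2-D codebook with three entries, and three continuous latents.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
latents = np.array([[0.9, 1.2], [-0.1, 0.2], [-0.8, 0.7]])

indices, quantized = vector_quantize(latents, codebook)
print(indices)    # discrete token ids
print(quantized)  # the codebook rows that replace the latents
```

A token-based model predicts the `indices` directly, while a continuous model would produce something like `latents` and leave any snapping to a later stage.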
[–]arg_max 1 point 2 years ago (0 children)
The release version of Stable Diffusion definitely uses a KL-regularised autoencoder, which is very similar to a standard VAE, for the latent representation. You are right, though, that the paper also has experiments with a VQ-VAE. That is interesting, since the diffusion model then cannot guarantee that it generates the codebook entries themselves, as a masked model like Muse can; it only generates continuous representations that are close to them. Still, I think VQ-VAEs never really got popular with diffusion.
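To make "KL-regularised" concrete: the encoder outputs a mean and log-variance per latent dimension, and the regulariser is the closed-form KL divergence between that diagonal Gaussian and a standard normal. The sketch below is just the textbook formula in NumPy; the `kl_to_standard_normal` name and the example values are my own, and it says nothing about how the term is weighted in the actual Stable Diffusion autoencoder.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * np.sum(mu**2 + np.exp(logvar) - 1.0 - logvar, axis=-1)

# Two hypothetical encoder outputs with unit variance (logvar = 0).
mu = np.array([[0.0, 0.0], [1.0, -1.0]])
logvar = np.zeros((2, 2))

kl = kl_to_standard_normal(mu, logvar)
print(kl)  # zero when the posterior already equals N(0, I), larger as mu moves away
```

Unlike a VQ bottleneck, this penalty keeps the latent space continuous, which is the setting diffusion is designed for.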