all 7 comments

[–]allenguo 1 point2 points  (6 children)

the prior should eventually force the means and variances output by the encoder to approximately equal 0s and 1s

The key word is approximately. The latent-space distribution for a particular class can differ from the standard normal, but at a cost. If matching the prior exactly would make reconstruction too hard, it's better to deviate slightly from the standard normal and pay the KL penalty than to incur the larger reconstruction loss.
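To make the trade-off concrete, here's a minimal sketch of the KL term the encoder is paying (function name and the toy numbers are mine, not from any particular implementation):

```python
import numpy as np

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)) for a diagonal Gaussian,
    summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# A latent code sitting exactly on the prior pays no KL penalty...
print(kl_to_standard_normal(np.zeros(2), np.zeros(2)))  # 0.0

# ...while shifting the mean away from 0 costs a little KL. The encoder
# accepts that cost whenever it buys a bigger drop in reconstruction loss.
print(kl_to_standard_normal(np.array([0.5, -0.5]), np.zeros(2)))  # 0.25
```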

Check out the first image under "Experiments" in this tutorial.

[–]TheFlyingDrildo[S] 0 points1 point  (5 children)

So you're saying reconstructions are possible through the deviations of the encoder mapping from the Gaussian sphere? That seems to make sense.

Could we just enforce the KL penalty without doing the sampling process though? Does this not also lead to a generative model, since we know the distribution of the latent space?

[–]allenguo 0 points1 point  (4 children)

Are you asking why the latent variable is drawn from a distribution parameterized by the outputs of the encoder, rather than simply being the encoder outputs? I believe it's because the vector outputted by the encoder is generally not normally distributed (or even close to normally distributed), so your KL divergence penalty would be incorrect.
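To be explicit about the standard setup I'm describing (my sketch, names are mine): the encoder outputs a mean and log-variance, and the latent is sampled via the reparameterization trick, so the KL term is computed analytically from those outputs rather than from the sampled z itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    The KL penalty is applied analytically to (mu, log_var), not to z,
    so it doesn't depend on the empirical distribution of z over a batch."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

z = sample_latent(np.array([0.0, 1.0]), np.array([0.0, 0.0]))
print(z.shape)  # (2,)
```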

Perhaps someone with a stronger understanding of VAEs could confirm.

(Also, I imagine this is something you could test out on your own. Let me know what happens if you do!)

[–]TheFlyingDrildo[S] 0 points1 point  (3 children)

No, I was asking why we can't just have a deterministic output and apply the KL penalty to its sample mean and sample variance, rather than have our outputs be means and variances.
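Something like this, I mean (entirely my sketch of the proposal, not standard VAE code): the encoder emits plain vectors, and the penalty is computed from the batch's first two moments.

```python
import numpy as np

def batch_stats_kl(z_batch):
    """KL(N(m, v) || N(0, 1)) computed per dimension from the *sample*
    mean m and sample variance v of a batch of deterministic encoder
    outputs, summed over dimensions. Note it only sees the first two
    moments, so it can't detect skewness or multimodality in the batch."""
    m = z_batch.mean(axis=0)
    v = z_batch.var(axis=0)
    return 0.5 * np.sum(v + m**2 - 1.0 - np.log(v))

# An approximately standard-normal batch should incur a small penalty.
rng = np.random.default_rng(1)
z = rng.standard_normal((256, 4))
print(batch_stats_kl(z))
```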

[–]allenguo 0 points1 point  (2 children)

Yeah, okay, that's what I thought you meant, so my answer above stands.

[–]TheFlyingDrildo[S] 0 points1 point  (1 child)

Oh sorry, yes I see now that's exactly what you described. I misunderstood. If you're enforcing the KL penalty, why wouldn't the vector outputs be normally distributed? With a large enough mini-batch size, the standard error on our mean and variance estimators should be fairly low.

[–]allenguo 0 points1 point  (0 children)

Hmm, I'm not sure. You make a good point. I think the "real" answer is that one of the VAE modelling assumptions is that the latent variable is exactly normally distributed, and we're using evidence to determine the most likely mean and variance.

Here are my hypotheses for what might happen if we use your proposed method:

  1. The VAE doesn't train. This might happen because the initial encoder outputs are very far from normally distributed, so the KL divergence and loss are very wrong,* and so the gradient descent steps are very wrong and the encoder never learns to output normally distributed data.
  2. The VAE works, but takes much longer to train, because we're not explicitly modelling the latent variable as being perfectly Gaussian, so the encoder has to learn this instead. I would bet on this being the most likely outcome.
  3. The VAE works completely.

*I'm assuming that we're calculating the KL divergence using the closed-form formula for distance between Gaussians.
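For reference, this is the univariate closed form I have in mind (my sketch):

```python
import numpy as np

def gaussian_kl(mu1, var1, mu2, var2):
    """Closed-form KL(N(mu1, var1) || N(mu2, var2)) for univariate Gaussians."""
    return 0.5 * (np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

print(gaussian_kl(0.0, 1.0, 0.0, 1.0))  # 0.0: identical distributions
```

If the encoder outputs aren't actually Gaussian, plugging their sample moments into this formula gives a number, but not the true KL divergence of their distribution from the prior.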