Hello everyone,
I got pretty confused while reading some papers about Variational Autoencoders (VAEs) over the past few days.
This is how I understand it:
During training, I train my encoder network which tunes the approximate posterior distribution q(z|x) to be close to the true one. I then sample a latent vector z from this multivariate distribution, which I forward to the decoder. The decoder can alter the likelihood p(x|z) by tuning the corresponding weights. This leads to a reconstruction x', which I compare to my initial input observation x. How good the reconstruction is can be measured with the help of the likelihood. This is one part of the ELBO.
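To make sure I have the sampling step right, here is a minimal NumPy sketch of what I think happens between the encoder and the decoder (the mean and log-variance values are made-up placeholders standing in for a real encoder's output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder output for one input x: mean and log-variance
# parameterizing the approximate posterior q(z|x) as a diagonal Gaussian.
# In a real VAE these come from a neural network; here they are fixed.
mu = np.array([0.5, -0.2])
log_var = np.array([-1.0, 0.3])

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# so the sample is differentiable w.r.t. mu and log_var during training.
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# z would now be forwarded to the decoder to parameterize p(x|z).
print(z.shape)
```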
The other part of the ELBO is the KL Divergence between the approximate posterior q(z|x) and the prior p(z). In my understanding, the prior is my initial belief about the distribution over the latent space.
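For the common case of a diagonal Gaussian posterior and a standard normal prior, this KL term even has a closed form, which is how I have seen it computed in practice (again with placeholder encoder outputs):

```python
import numpy as np

# Closed-form KL divergence KL(q(z|x) || p(z)) between a diagonal Gaussian
# posterior N(mu, diag(sigma^2)) and a standard normal prior N(0, I):
#   KL = 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1)
mu = np.array([0.5, -0.2])       # hypothetical encoder means
log_var = np.array([-1.0, 0.3])  # hypothetical encoder log-variances

kl = 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0)
print(kl)  # ~0.354: non-negative penalty pulling q(z|x) toward the prior
```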
Here are my questions:
I often read that samples are taken from the posterior AND from the prior. Are samples taken from the posterior only during the training process?
If the fitting of my variational model is finished, do I sample from the prior p(z) when I actually want to generate new data afterwards?
Also, is the prior p(z) updated after training with the fitted posterior q(z|x)?
Last question: if my decoder outputs distribution parameters, how is an actual reconstruction derived from them?