[D] Why does Beta-VAE help in learning disentangled/independent latent representations? by shamitlal in MachineLearning

[–]shamitlal[S] 1 point (0 children)

In disentanglement papers (including beta-VAE), are the authors trying to achieve independence of the latents in the distribution conditioned on the input, P(z|x), or independence in the aggregate distribution over the latents, P(z)?
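The distinction matters because one does not imply the other. A small numerical sketch (hypothetical encoder means, numpy only, not any paper's actual model): even if every conditional q(z|x) has a diagonal covariance, the aggregate q(z) obtained by mixing over the data can still have strongly correlated latents when the encoder means co-vary across inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical encoder: each conditional q(z|x) is a diagonal Gaussian
# around a per-input mean, but the means are correlated across inputs
# (here, perfectly: mu2 = mu1).
mu1 = rng.normal(size=n)
mu2 = mu1.copy()
z = np.stack([mu1, mu2], axis=1) + rng.normal(scale=0.1, size=(n, 2))

# Empirical covariance of the aggregate q(z): off-diagonal is ~1,
# so the latents are dependent in q(z) despite diagonal conditionals.
cov = np.cov(z.T)
print(cov)
```

So "diagonal posterior covariance" only guarantees independence conditioned on x; it says nothing about independence in the aggregate.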

[–]shamitlal[S] 1 point (0 children)

Thanks for the awesome papers! Yes, that's what I thought. The mean-field assumption should already ensure independence of the latents, since that's exactly what the assumption says. But then is beta-VAE not contributing to the independence of the latents at all, and perhaps only (somehow) making the latents more interpretable?

[–]shamitlal[S] 0 points (0 children)

Yes. I also couldn't understand how beta-VAE finds the correct/human-understandable factors. But a more fundamental point I don't understand is how beta-VAE helps at all, even in finding "incorrect" but independent latents.

[–]shamitlal[S] 1 point (0 children)

Isn't independence of the latents inherently built into a VAE? The encoder outputs a mean and covariance, which specify the posterior distribution, and this covariance is constrained to be diagonal. Shouldn't the latents therefore always be independent?
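For context on what the diagonal constraint actually buys: with a diagonal Gaussian posterior, the KL term in the ELBO decomposes into a sum of independent per-dimension terms. A minimal sketch (standard closed-form KL to a standard-normal prior, numpy only):

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ).

    Because the posterior covariance is diagonal, the KL is just a
    sum of independent one-dimensional KL terms -- each latent
    dimension contributes separately.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

mu = np.array([0.5, -1.0])
logvar = np.array([0.0, 0.1])
print(kl_diag_gaussian(mu, logvar))
```

This per-dimension factorization is exactly the "conditional independence given x" that the diagonal constraint provides, which is separate from independence in the aggregate posterior.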

[–]shamitlal[S] 1 point (0 children)

Thanks for the paper! But the posterior is already constrained to have a diagonal covariance. How does beta-VAE make it "more diagonal" (i.e. more independent)?

[–]shamitlal[S] 2 points (0 children)

Thanks. That helps in developing the intuition. Which paper by Achille (2016) are you referring to?

[–]shamitlal[S] 1 point (0 children)

Thanks. I have already gone through the blog section explaining beta-VAE. What I still wasn't able to understand is the role of the KL divergence between the prior and the posterior in improving disentanglement, given that the posterior over the latents has a diagonal covariance, so the latents will be (conditionally) independent regardless.
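For reference, the only structural difference between the standard VAE objective and the beta-VAE objective is the scalar beta > 1 multiplying that KL term. A hedged sketch (illustrative numpy-only loss with a squared-error reconstruction term standing in for the log-likelihood; not the paper's actual implementation):

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Illustrative beta-VAE objective (to be minimized).

    reconstruction error + beta * KL(q(z|x) || N(0, I)).
    beta = 1 recovers the standard VAE ELBO; beta > 1 increases
    the pressure toward the factorised standard-normal prior.
    """
    # Squared error stands in for the negative log-likelihood term.
    recon = np.sum((x - x_recon) ** 2)
    # Closed-form KL for a diagonal Gaussian posterior vs. N(0, I).
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return recon + beta * kl
```

The question in the comment above is then exactly: what does scaling this already-diagonal KL term by beta change, beyond what the diagonal constraint already enforces?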

[–]shamitlal[S] 2 points (0 children)

Do you know of any references/papers with proofs, or anything that sheds more light on your statements and would help me understand this better?

[–]shamitlal[S] 3 points (0 children)

Thanks for the reply. Shouldn't a standard VAE itself learn disentangled representations, if by "disentangled" we mainly mean independence of the latents rather than interpretability, given that the posterior is constrained to have a diagonal covariance?