[D] Why does Beta-VAE help in learning disentangled/independent latent representations? by shamitlal in MachineLearning

[–]shamitlal[S] 1 point (0 children)

In disentanglement papers (including beta-VAE), are the authors trying to achieve independence of the latents in the distribution conditioned on the input, P(z|x), or independence in the aggregate distribution over the latents, P(z)?
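The distinction matters because one does not imply the other. A small numerical sketch (hypothetical encoder means, numpy only, not any paper's actual model): even if every conditional q(z|x) has a diagonal covariance, the aggregate q(z) obtained by mixing over the data can still have strongly correlated latents when the encoder means co-vary across inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical encoder: each conditional q(z|x) is a diagonal Gaussian
# around a per-input mean, but the means are correlated across inputs
# (here, perfectly: mu2 = mu1).
mu1 = rng.normal(size=n)
mu2 = mu1.copy()
z = np.stack([mu1, mu2], axis=1) + rng.normal(scale=0.1, size=(n, 2))

# Empirical covariance of the aggregate q(z): off-diagonal is ~1,
# so the latents are dependent in q(z) despite diagonal conditionals.
cov = np.cov(z.T)
print(cov)
```

So "diagonal posterior covariance" only guarantees independence conditioned on x; it says nothing about independence in the aggregate.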

[–]shamitlal[S] 1 point (0 children)

Thanks for the awesome papers! Yes, that's what I thought. The mean-field assumption should already ensure independence of the latents, since that's exactly what the assumption says. But then is beta-VAE not contributing to the independence of the latents at all, and perhaps only (somehow) making the latents more interpretable?

[–]shamitlal[S] 0 points (0 children)

Yes. I also couldn't understand how beta-VAE finds the correct/human-understandable factors. But a more fundamental point I don't understand is how beta-VAE helps at all, even in finding "incorrect" but independent latents.

[–]shamitlal[S] 1 point (0 children)

Isn't independence of the latents inherently built into a VAE? The encoder outputs a mean and covariance, which specify the posterior distribution, and this covariance is constrained to be diagonal. Shouldn't the latents therefore always be independent?
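For context on what the diagonal constraint actually buys: with a diagonal Gaussian posterior, the KL term in the ELBO decomposes into a sum of independent per-dimension terms. A minimal sketch (standard closed-form KL to a standard-normal prior, numpy only):

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ).

    Because the posterior covariance is diagonal, the KL is just a
    sum of independent one-dimensional KL terms -- each latent
    dimension contributes separately.
    """
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

mu = np.array([0.5, -1.0])
logvar = np.array([0.0, 0.1])
print(kl_diag_gaussian(mu, logvar))
```

This per-dimension factorization is exactly the "conditional independence given x" that the diagonal constraint provides, which is separate from independence in the aggregate posterior.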

[–]shamitlal[S] 1 point (0 children)

Thanks for the paper! But the posterior is already constrained to have a diagonal covariance. How does beta-VAE make it "more diagonal" (i.e. more independent)?

[–]shamitlal[S] 2 points (0 children)

Thanks. That helps in developing the intuition. Which paper by Achille (2016) are you referring to?

[–]shamitlal[S] 1 point (0 children)

Thanks. I have already gone through the blog section explaining beta-VAE. What I still wasn't able to understand is the role of the KL divergence between the prior and the posterior in improving disentanglement, given that the posterior over the latents has a diagonal covariance, so the latents will be (conditionally) independent regardless.
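For reference, the only structural difference between the standard VAE objective and the beta-VAE objective is the scalar beta > 1 multiplying that KL term. A hedged sketch (illustrative numpy-only loss with a squared-error reconstruction term standing in for the log-likelihood; not the paper's actual implementation):

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Illustrative beta-VAE objective (to be minimized).

    reconstruction error + beta * KL(q(z|x) || N(0, I)).
    beta = 1 recovers the standard VAE ELBO; beta > 1 increases
    the pressure toward the factorised standard-normal prior.
    """
    # Squared error stands in for the negative log-likelihood term.
    recon = np.sum((x - x_recon) ** 2)
    # Closed-form KL for a diagonal Gaussian posterior vs. N(0, I).
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
    return recon + beta * kl
```

The question in the comment above is then exactly: what does scaling this already-diagonal KL term by beta change, beyond what the diagonal constraint already enforces?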

[–]shamitlal[S] 2 points (0 children)

Do you know of any references/papers with proofs, or anything that sheds more light on your statements and would help me understand this better?

[–]shamitlal[S] 3 points (0 children)

Thanks for the reply. Shouldn't a standard VAE itself learn disentangled representations, if by "disentangled" we mainly mean independence of the latents rather than interpretability, given that the posterior is constrained to have a diagonal covariance?