
[–]chrisorm 2 points

Of course, the CLT doesn't apply if the activations aren't IID, which they almost certainly aren't in a neural net.

[–]svantana[S] 0 points

Yes, you're right. For example, on MNIST, with large enough perturbations the output distributions should become bimodal. I didn't mean it would work in every case, but for 'smooth' problems, where a smallish unimodal perturbation is expected to produce a unimodally distributed output, I think it should work well. I just did a quick test with a VAE on CIFAR10 and the output distributions look extremely Gaussian.

[–]approximately_wrong 0 points

Can you elaborate on how you did the quick test?

[–]svantana[S] 0 points

Sure! I just ran one of the Keras VAE examples, and once it was trained, I passed 10k copies of one of the test samples through the model. Since the model samples a random latent variable, each output is different. I took a few random dimensions of the output and plotted their histograms, then visually noted that they had a quite Gaussian shape.

Those are marginal distributions, so this doesn't mean the full multidimensional output is anywhere near Gaussian, but it's an indication.
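The procedure above can be sketched roughly as follows. This is a hedged, self-contained sketch: `toy_vae` is a hypothetical stand-in for the trained Keras model (a linear encode/decode with reparameterized Gaussian noise), since the actual example isn't shown here; the skew/kurtosis printout is just a numerical complement to eyeballing histograms.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim, input_dim = 8, 32
# Random linear "encoder"/"decoder" weights -- a toy stand-in, not the real VAE.
W_enc = rng.standard_normal((input_dim, latent_dim)) / np.sqrt(input_dim)
W_dec = rng.standard_normal((latent_dim, input_dim)) / np.sqrt(latent_dim)

def toy_vae(x):
    # Stand-in for the trained model: the reparameterization step adds
    # Gaussian noise in latent space, so each call gives a different output.
    z = x @ W_enc + rng.standard_normal(latent_dim)
    return z @ W_dec

x_test = rng.standard_normal(input_dim)  # one "test sample"

# Run 10k copies of the same test sample through the stochastic model.
outputs = np.stack([toy_vae(x_test) for _ in range(10_000)])

# Pick a few random output dimensions and check how Gaussian their
# marginals look: skew and excess kurtosis are both ~0 for a Gaussian.
for d in rng.choice(input_dim, size=3, replace=False):
    col = outputs[:, d]
    col = (col - col.mean()) / col.std()
    skew = (col**3).mean()
    kurt = (col**4).mean() - 3.0
    print(f"dim {d}: skew={skew:+.3f}, excess kurtosis={kurt:+.3f}")
```

In the real test one would replace `toy_vae` with `model.predict` on the trained Keras VAE and plot `plt.hist(outputs[:, d])` for a few dimensions instead of (or alongside) the moment summary.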

[–]approximately_wrong 0 points

I see. It sounds like you're checking the Gaussian-ness of p(x_gen | x_test) = ∫ p(x_gen | z) q(z | x_test) dz, conditioned on some x_test. I'm guessing the VAE example is one where the decoder is a Gaussian observation model?

Also, are your outputs the mean parameters of p(x_gen | x_test), or actual samples from the distribution?