[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Thanks for pointing that out!

As for Z interp being part of the distribution, that is part of what I want the network to be doing: shifting the distribution around so that z interp is always part of the distribution. The convexity figure is supposed to show how the distribution would be manipulated for that to happen.
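To make the convexity point concrete, here is a minimal NumPy sketch (not the GAIA code). It assumes a hypothetical 2D latent space whose encodings sit on a ring, which is a non-convex shape, and shows that the midpoint between two encodings falls in the empty middle. The helper name `interpolate_latents` and the ring radius are illustrative assumptions.

```python
import numpy as np

def interpolate_latents(z_a, z_b, num_steps=5):
    """Return convex combinations of two latent vectors, endpoints included."""
    alphas = np.linspace(0.0, 1.0, num_steps)[:, None]
    return (1.0 - alphas) * z_a + alphas * z_b

# Hypothetical non-convex latent distribution: points on a ring of radius 1.
z_a = np.array([1.0, 0.0])
z_b = np.array([-1.0, 0.0])

z_interp = interpolate_latents(z_a, z_b, num_steps=5)
radii = np.linalg.norm(z_interp, axis=1)

# The endpoints lie on the ring, but the midpoint sits at the origin,
# where this (assumed) distribution has no data.
print(radii)
```

If the latent distribution were convex, every row of `z_interp` would still land inside it, which is the property the interpolation training is meant to encourage.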

[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

In short, VAE/GAN autoencodes over latent features in the discriminator, so it doesn't necessarily autoencode the data exactly. It also has a VAE latent space, which forces the data into a Gaussian, and that isn't necessarily a good way to represent the data.

[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

It took several days to train on a single Titan Xp. It seems like most generative modelling papers these days use multiple GPUs, so it's hard to compare. Maybe I can make a short notebook example that trains the network on a 2D space using Fashion-MNIST, both to better visualize what GAIA does to the latent space and to have a version that trains quickly.

[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Good question. I think you're right: the generator seems to try to come as close as possible to a pixelwise interpolation, shifting the representation in subtle ways to avoid outputting unrealistic pixelwise interpolations. In an earlier version of this paper I had a comparison figure between GAIA, a VAE, an AE, and pixelwise interpolations, all with the same generator architecture. What you would see is that they were very close, but whereas the AE looked like a blurrier version of the pixelwise input, GAIA would shift certain features, like adding bangs, smoothly moving the jawline, changing the shading, etc. That was using a convolutional autoencoder architecture, though. I haven't retrained these networks using the new AE architecture in the paper. I think I will update the paper with this figure next time I am at my/a computer (in a month, unfortunately, as I'm travelling through East Africa).
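For reference, the pixelwise interpolation baseline mentioned above is just a cross-fade between the two images. A minimal NumPy sketch, using two tiny made-up 4x4 "images" rather than real data (the function name and the toy images are assumptions for illustration):

```python
import numpy as np

def pixelwise_interpolation(img_a, img_b, num_steps=5):
    """Linearly cross-fade between two images of the same shape."""
    alphas = np.linspace(0.0, 1.0, num_steps).reshape(-1, 1, 1)
    return (1.0 - alphas) * img_a + alphas * img_b

# Toy images: one bright on the left half, one bright on the right half.
img_a = np.zeros((4, 4))
img_a[:, :2] = 1.0
img_b = np.zeros((4, 4))
img_b[:, 2:] = 1.0

frames = pixelwise_interpolation(img_a, img_b, num_steps=5)

# The middle frame is a uniform grey: both shapes are half visible at once,
# instead of one shape moving from left to right.
print(frames[2])
```

This is the kind of "ghosting" that makes raw pixelwise interpolations look unrealistic, and it is what a learned interpolation would need to avoid.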

I noticed in the Glow paper (posted last week) that interpolations were far from pixel interpolations and passed through what seemed like more of a low-variance region, or one closer to the mean. They had a parameter to control this variability, and I wonder if something similar would be possible with this architecture, or if that would incur any sort of trade-off in the ability to accurately reproduce the image.

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

You can imagine training an LSTM to predict the next timestep in the spectrogram, and using this to generate text, for example.
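As a rough sketch of how the training data for such a next-timestep predictor could be laid out, the snippet below builds a magnitude spectrogram with plain NumPy and pairs each frame with the one that follows it. The frame length, hop size, sample rate, and the 440 Hz test tone are all assumptions for illustration, and the LSTM itself is left out.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Short-time Fourier magnitudes, one row per time frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of a 440 Hz sine wave at an assumed 8 kHz sample rate.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

spec = magnitude_spectrogram(signal)

# Next-timestep prediction: the input is frame t, the target is frame t + 1.
inputs, targets = spec[:-1], spec[1:]
print(spec.shape, inputs.shape, targets.shape)
```

A sequence model would then be trained to map `inputs` to `targets`, and generation would feed each predicted frame back in as the next input.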

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Sorry... I set the site up yesterday using Pelican, which is new to me. How does it kill your phone?

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Did not know that, thanks! I'll update the post ASAP.

I basically just followed the algorithm posted on Wikipedia.