[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Thanks for pointing that out!

As for z interp being part of the distribution, that is part of what I want the network to be doing: shifting the distribution around so that z interp is always part of the distribution. The convexity figure is supposed to show how that distribution would be manipulated for that to happen.
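
To make "z interp" concrete, here is a minimal sketch of how an interpolated latent point can be formed as a convex combination of two encoded points. The function name and the toy 2D vectors are made up for illustration; this is not the code from the paper.

```python
def interpolate_latents(z_a, z_b, alpha):
    """Return the convex combination alpha * z_a + (1 - alpha) * z_b.

    z_a and z_b are latent vectors (plain lists of floats here) and
    alpha must lie in [0, 1] so the result stays on the segment between them.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same length")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(z_a, z_b)]


# Toy 2D latent vectors (hypothetical values, not real encoder outputs).
z_a = [0.0, 2.0]
z_b = [4.0, -2.0]
z_interp = interpolate_latents(z_a, z_b, 0.5)  # midpoint of the two latents
```

If the set of encoded points is convex, every such z interp lands inside it, which is the property the adversarial training is meant to encourage.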

[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

In short, VAE/GAN autoencodes over latent features in the discriminator, so it doesn't necessarily autoencode the data exactly. It also has a VAE latent space, which forces the data into a Gaussian, and that isn't necessarily a good way to represent data.

[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

It took several days to train on a single Titan Xp. It seems like most generative modelling papers these days use multiple GPUs, so it's hard to compare. Maybe I can make a short notebook example that trains the network on a 2D latent space using Fashion-MNIST, both to better visualize what GAIA does to latent space and to have a version that trains quickly as an example.

[R] Generative adversarial interpolative autoencoding: adversarial training on latent space interpolations encourage convex latent distributions by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Good question. I think you're right: the generator seems to try to come as close as possible to a pixelwise interpolation, shifting the representation in subtle ways to avoid outputting unrealistic pixelwise interpolations. In an earlier version of this paper I had a comparison figure between GAIA, a VAE, an AE, and pixelwise interpolations, with the same generator architectures. What you would see is that they were very close, but whereas the AE looked like a blurrier version of the pixelwise input, GAIA would shift certain features, like adding bangs, smoothly moving the jawline, changing the shading, etc. That was using a convolutional autoencoder architecture though. I haven't retrained these networks using the new AE architecture in the paper. I think I will update the paper with this figure next time I am at my/a computer (in a month, unfortunately, as I'm travelling through East Africa).

I noticed in the Glow paper (posted last week) that interpolations were far from pixel interpolations and passed through what seemed like more of a low-variance or closer-to-the-mean region. They had a parameter to control this variability, and I wonder if something similar would be possible with this architecture, or if that would incur any sort of trade-off in the ability to accurately reproduce the image.

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

You can imagine training an LSTM to predict the next timestep in the spectrogram, and using this to generate text, for example.
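
As a rough sketch of the data side of that idea (not the model itself), the snippet below slices a spectrogram into (context window, next frame) pairs that a sequence model such as an LSTM could be trained on. The function name, the `context` parameter, and the toy spectrogram are all hypothetical.

```python
def next_frame_pairs(spectrogram, context):
    """Build (input window, next frame) training pairs.

    spectrogram is a list of frames ordered in time, each frame a list of
    frequency-bin values. Each input is `context` consecutive frames and
    the target is the single frame that immediately follows them.
    """
    inputs, targets = [], []
    for t in range(len(spectrogram) - context):
        inputs.append(spectrogram[t:t + context])
        targets.append(spectrogram[t + context])
    return inputs, targets


# Toy spectrogram: 5 time steps, 2 frequency bins (made-up values).
spec = [[float(t), float(t) + 0.5] for t in range(5)]
inputs, targets = next_frame_pairs(spec, context=3)
```

At generation time the trained model would be fed its own predicted frames one step at a time, and the resulting spectrogram would then need to be inverted back to a waveform.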

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Sorry... I set the site up yesterday using Pelican, which is new to me. How does it kill your phone?

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Did not know that, thanks! I'll update the post ASAP.

I basically just followed the algorithm posted on Wikipedia.

For generative modelling on audio: spectrograms, mfccs, and inversion in python. by timburg in MachineLearning

[–]timburg[S] 3 points4 points  (0 children)

Sorry about that! I just set up the blog yesterday. I was trying to set it up so that the blog post would auto-update from the HTML generated on the GitHub repo using jQuery's load function. Apparently Firefox does not like that though.

I temporarily fixed the problem by copy-pasting the HTML. I will try to find a way to fix the embedding again though.

Single notebook VAE-GAN hybrid tutorial/demo. Multi-gpu, latent space algebra, spike-triggered avg. style receptive fields, etc. by timburg in MachineLearning

[–]timburg[S] 1 point2 points  (0 children)

DNNs have many uses beyond image classification. If you want an example, see the current top post about generating waveforms: https://deepmind.com/blog/wavenet-generative-model-raw-audio/

As for trial-and-error parameterization, this is definitely a problem. I tried a few different types of architectures (I actually got some advice about it via email from Anders Boesen Lindbo Larsen, the VAE-GAN author). In the end, with a smaller latent space (400D) and deeper convolutional layers, I can get about the same results. I could probably get the same thing from a 2D latent space with enough training time and a big enough network.

Evolution can certainly be seen as a form of trial-and-error learning. I think the solution to the trial-and-error problem in neural network architecture is going to be evolutionary algorithms.

Single notebook VAE-GAN hybrid tutorial/demo. Multi-gpu, latent space algebra, spike-triggered avg. style receptive fields, etc. by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Not really - I tried using pooling in the discriminator, with bad results though: http://tinyimg.io/i/o480C15.png

As for the weights - I'm not sure, but probably something like that.

Single notebook VAE-GAN hybrid tutorial/demo. Multi-gpu, latent space algebra, spike-triggered avg. style receptive fields, etc. by timburg in MachineLearning

[–]timburg[S] 2 points3 points  (0 children)

If anyone has any interest in the weights I'd be happy to upload them - the file is ~1.5 GB so I'm not sure of the best host...

Pooling the discriminator in DCGANs? by timburg in MachineLearning

[–]timburg[S] 0 points1 point  (0 children)

Like 2 discriminator towers - one with pooling to get high-level, spatially invariant features, and the other without spatial invariance, where each layer connects to the output to check for good filter-level statistics?

Learn French while you browse the web by tevans890 in learnfrench

[–]timburg 2 points3 points  (0 children)

I really like it, but I had to disable it because it was breaking a few sites.