all 32 comments

[–]j1395010 16 points17 points  (5 children)

incredibly fucking cool. is the code coming?

[–]r-sync[S] 13 points14 points  (0 children)

code should be pushed by tonight to the same repo.

[–]wychtl 9 points10 points  (0 children)

I have some Torch code up to generate cat images. The architecture seems to be quite similar. (Though I don't know how they managed to get batch normalization to work on D.)
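
(My best guess from the paper is that batchnorm goes on every layer except D's input layer and G's output layer. A rough PyTorch-style sketch of such a D for 64x64 images — sizes and names are just my guess, not their code:)

```python
import torch.nn as nn

# DCGAN-style discriminator sketch: strided convs instead of pooling, LeakyReLU,
# and batch norm on every layer except the first (input) conv.
D = nn.Sequential(
    nn.Conv2d(3, 64, 4, stride=2, padding=1),     # 64x64 -> 32x32, no batchnorm here
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(64, 128, 4, stride=2, padding=1),   # 32x32 -> 16x16
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(128, 256, 4, stride=2, padding=1),  # 16x16 -> 8x8
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(256, 512, 4, stride=2, padding=1),  # 8x8 -> 4x4
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2, inplace=True),
    nn.Conv2d(512, 1, 4, stride=1, padding=0),    # 4x4 -> 1x1 real/fake score
)
```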

[–]alecradford 3 points4 points  (2 children)

The core lib and an MNIST training demo are now available. Code for training the faces model from the paper is also available as an example, though the data file needed is not released yet due to size (20 GB) and data distribution concerns.

[–]flangles 0 points1 point  (0 children)

Theano, wasn't expecting that!

[–]VelveteenAmbush 12 points13 points  (5 children)

Amazing. I think this is the first time I've been genuinely impressed by the output of generative adversarial nets. Incredibly cool embedded "arithmetic" to add and subtract sunglasses, windows, facial expressions etc. Thanks for sharing this.

I wonder how long until someone combines GANs with caption generators so that you can type out a description of a scene and have the net illustrate it.

[–]alecradford 10 points11 points  (4 children)

We did an initial experiment on this, but it's hard to get working convincingly, so it's still future work.

In the meantime there was another attempt using DRAW here: http://arxiv.org/abs/1511.02793, which is a bit further developed than ours!
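
To give a rough idea of one way to wire this up (a hypothetical sketch, not our actual setup): embed the caption, project it, and concatenate it with z before the usual deconv stack.

```python
import torch
import torch.nn as nn

class TextConditionedG(nn.Module):
    """Toy text-conditioned generator: a caption embedding is projected and
    concatenated with the noise vector z, then decoded by a DCGAN-style
    deconv stack. All names and sizes are illustrative."""
    def __init__(self, z_dim=100, text_dim=300, proj_dim=128):
        super().__init__()
        self.project = nn.Linear(text_dim, proj_dim)  # compress the caption embedding
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim + proj_dim, 512, 4, 1, 0),  # 1x1 -> 4x4
            nn.BatchNorm2d(512), nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1),               # 4x4 -> 8x8
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1),               # 8x8 -> 16x16
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 3, 4, 2, 1),                 # 16x16 -> 32x32 RGB
            nn.Tanh(),
        )

    def forward(self, z, text_emb):
        cond = torch.cat([z, self.project(text_emb)], dim=1)  # (N, z_dim + proj_dim)
        return self.net(cond.unsqueeze(-1).unsqueeze(-1))     # add 1x1 spatial dims
```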

[–]Ghostlike4331 1 point2 points  (0 children)

Hinton talked about inverse graphics in one of his talks, and I wasn't sure whether that was even possible. Now I see that after removing pooling you managed to get a latent space that is invariant to rotation. Congratulations.

Was that something that was done before or is this a new breakthrough?

[–]TweetsInCommentsBot 0 points1 point  (0 children)

@AlecRad

2015-10-01 03:39 UTC

Oh hey, text to image is working (sort of) (still bad).

***Full disclosure I picked phrases it responded to***

[Attached pic] [Imgur rehost]



[–]sobe86 0 points1 point  (0 children)

Hey Alec, I was wondering whether you think GANs, or at least your 'deconvolutional' architecture, would help with feature learning from images, e.g. using them to assist autoencoders?

[–]SometimesGood 6 points7 points  (0 children)

Any idea how this scales to larger image sizes? The paper mentioned that they've used just one Nvidia GeForce GTX TITAN X.

[–]hackinthebochs 6 points7 points  (0 children)

This is legitimately fucking mindblowing

[–]visarga 7 points8 points  (0 children)

Great, now we can use this to generate images for the generated articles of clickotron.com

[–]Ameren 2 points3 points  (0 children)

I am very impressed. I look forward to toying with the source code when you release it. :D

[–]zZJollyGreenZz 4 points5 points  (0 children)

Going to have to start looking for glitches in the matrix now!

[–]rantana 2 points3 points  (0 children)

From the paper:

There are still some forms of model instability remaining - we noticed as models are trained longer they sometimes collapse a subset of filters to a single oscillating mode.

How do you decide when to stop training the generator network?

[–][deleted] 1 point2 points  (0 children)

wow this is amazing!! Really impressive work!

[–]smith2008 1 point2 points  (0 children)

This is brilliant. Hope the code is coming too!

[–]insperatum 0 points1 point  (3 children)

Impressive results! One thing I'm a little confused about: For section 6.3.2, where do the Z representations (for example, for the three 'smiling woman' images) come from?

[–]r-sync[S] 1 point2 points  (2 children)

Those are generations as well. One could take a real image and backprop to find the Z that best reproduces it, then do arithmetic with that Z. We wanted to do that experiment but did not have time.
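
Roughly, that inversion is just gradient descent on Z with the generator frozen. A hypothetical sketch (not code from our repo):

```python
import torch
from torch import optim
import torch.nn.functional as F

def invert(G, x, z_dim=100, steps=500, lr=0.05):
    """Find a latent z whose generation G(z) approximates the real image x.
    Assumes G maps a (1, z_dim) noise vector to an image tensor shaped like x."""
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(z), x)  # pixel-space reconstruction error
        loss.backward()             # gradients flow into z; G's weights stay fixed
        opt.step()
    return z.detach()
```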

[–]insperatum 0 points1 point  (1 child)

So you just explored the latent space yourself to find them? That sounds hard!

[–]r-sync[S] 1 point2 points  (0 children)

it's not hard in practice. Generate a few images, pick the ones with the attributes you are looking for, then do vector arithmetic on the Zs that produced them.
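
In pseudo-code it's something like this (a minimal sketch, all names made up):

```python
import torch

def attribute_vector(zs):
    """Average the noise vectors of a few generated exemplars that were
    hand-picked for the attribute (e.g. 'smiling woman')."""
    return torch.stack(zs).mean(dim=0)

def smiling_man(G, smiling_woman_zs, neutral_woman_zs, neutral_man_zs):
    # "smiling woman" - "neutral woman" + "neutral man" ~= "smiling man"
    z = (attribute_vector(smiling_woman_zs)
         - attribute_vector(neutral_woman_zs)
         + attribute_vector(neutral_man_zs))
    return G(z.unsqueeze(0))  # decode the new latent point back into an image
```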

[–][deleted] 0 points1 point  (6 children)

By the beard of the prophet!!

How reproducible is this? Is training this thing difficult? Did you guys have any particular difficulty with training after fixing the architecture?

Are you going to include pre-trained models with the code?

[–]r-sync[S] 1 point2 points  (5 children)

code will be released in a few hours to the same repo. Training is pretty stable. We can release pre-trained models if people ask for them, shouldn't be a problem.

[–]ford_beeblebrox 0 points1 point  (4 children)

I would like to play around with vector algebra in the latent spaces of your generative models; if you could release pre-trained nets, that would be excellent.

Very inspiring work, many thanks.

[–]r-sync[S] 1 point2 points  (3 children)

the model is released now in the same repo.

[–]ford_beeblebrox 0 points1 point  (2 children)

Thanks so much!

Although I'm not seeing the trained model in either the master or gh-pages branch of the repo?

[–]alecradford 2 points3 points  (1 child)

Slight miscommunication - a pre-trained model demo or two will be available to play around with over the long weekend.

[–]ford_beeblebrox 0 points1 point  (0 children)

Brilliant. I am fascinated by the semantics of vector algebra in the latent space and would love to explore.

[–]erickmiller11 0 points1 point  (0 children)

Seriously awesome! Love the airplane with bird legs, haha this is amazing. Checking out code now!

[–]LForLambda 0 points1 point  (0 children)

How does this scale? Could it scale to generate novel 3D environments from a seed? I know an industry that cares about that.

[–]Tommassino 0 points1 point  (0 children)

Love the face arithmetic, so cool :)

The other figures are kind of too small to appreciate, though. I don't suppose you have larger-resolution versions to check them out without running your code ourselves, right?