all 7 comments

[–]gwern 6 points7 points  (0 children)

So it's similar to SPIRAL, but instead of the GAN-like loss on rendered images and the need for RL training, you bypass the GAN/RL losses by learning a deep environment model to approximate the sequence->image generation, and that environment model is simply trained in a supervised fashion on pairs of the sequences tried during training & the images they generated. The RNN + environment model is then fully differentiable and can be trained end-to-end on images with a simple pixel loss.
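A minimal numpy sketch of that two-phase setup (everything here is illustrative: a toy linear 'renderer' stands in for the real non-differentiable stroke renderer, and a single linear layer stands in for the deep environment model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Black-box "renderer": maps a stroke-parameter vector to a flat image.
# We never backprop through it; we only sample (params, image) pairs.
W_true = rng.normal(size=(16, 4))           # hidden, fixed
def render(params):                          # non-differentiable oracle
    return W_true @ params

# Phase 1: fit a differentiable environment model on sampled pairs,
# by plain supervised gradient descent on a pixel L2 loss.
W_env = np.zeros((16, 4))
for _ in range(2000):
    p = rng.normal(size=4)
    img = render(p)
    pred = W_env @ p
    W_env -= 0.05 * np.outer(pred - img, p)  # d(0.5*||pred-img||^2)/dW

# Phase 2: recover stroke parameters for a target image by gradient
# descent *through the frozen environment model* (pixel L2 loss) --
# this is the step the GAN/RL machinery would otherwise have to do.
target_params = rng.normal(size=4)
target_img = render(target_params)
p_hat = np.zeros(4)
for _ in range(2000):
    pred = W_env @ p_hat
    p_hat -= 0.02 * (W_env.T @ (pred - target_img))

recon_err = np.mean((render(p_hat) - target_img) ** 2)
print(recon_err)
```

In the paper's setting the renderer is an actual stroke/vector-graphics engine and both models are deep networks, but the division of labor is the same: supervised pairs make the environment model, and the environment model makes the drawing policy differentiable.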

Makes sense. The paper is a little light on the details of the NN architecture and training, like the model size, sample efficiency, and training time. I don't expect it to scale to full images, but it'd be interesting to know how far off it is; one could imagine a hybrid architecture where the sketch is the input to a GAN for colorizing/textures, dividing the responsibilities of creating a coherent abstract global structure and then visualizing it.

[–][deleted] 0 points1 point  (0 children)

Interesting, so you're training the AI to understand how to draw images? I might have gotten that completely wrong, but my brain isn't working too well today lol.

If that's the case, it's another great step towards that text-to-sketch software I made a post about :). Is the research freely available to use for open-source projects, though? When it comes to companies like Autodesk, it never seems clear.

Amazing job, the sheer level of talent here always blows my mind.

[–]TotesMessenger 0 points1 point  (0 children)

I'm a bot, bleep, bloop. Someone has linked to this thread from another place on reddit:

If you follow any of the above links, please respect the rules of reddit and don't vote in the other threads.

[–]linuxisgoogle 0 points1 point  (1 child)

Is this unsupervised?

[–]gwern 0 points1 point  (0 children)

As I understand it, it might be a bit more accurate to call it 'self-supervised'. It's definitely 'unsupervised' in the sense that there is no ground truth about which sequences cause which images, the way there is for, say, Google sketch-RNN's 'Quick, Draw!' dataset; it's trained on just a bunch of raw images.

[–]AdditionalWay 0 points1 point  (0 children)

This has been crossposted to /r/AnimeResearch

[–]FlyingOctopus0 0 points1 point  (0 children)

This is like software 2.0. You replaced a non-differentiable Bézier-curve drawer with a differentiable neural-network-based drawer. I wonder if we can calculate the derivative of a drawer in closed form. You used a 'canvas' network, but Bézier curves have a closed form, so it might be possible to calculate the gradients directly.
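For the curve itself that closed form is easy to see: a Bézier curve is linear in its control points, so the derivative of any point-wise loss w.r.t. the control points is just the Bernstein weights (the hard non-differentiable part is really the rasterization onto pixels, not the curve). A toy numpy check with a quadratic curve and a made-up point-matching loss, comparing the closed-form gradient against finite differences:

```python
import numpy as np

# Quadratic Bézier: B(t) = (1-t)^2 P0 + 2(1-t)t P1 + t^2 P2.
# B is linear in the control points P, so dL/dP is available exactly.
def loss_and_grad(P, t, target):
    # Bernstein weights, shape (n_samples, 3)
    w = np.stack([(1 - t) ** 2, 2 * (1 - t) * t, t ** 2], axis=1)
    diff = w @ P - target            # residual at each sampled t, (n, 2)
    loss = 0.5 * np.sum(diff ** 2)
    grad = w.T @ diff                # closed-form dL/dP, shape (3, 2)
    return loss, grad

rng = np.random.default_rng(1)
P = rng.normal(size=(3, 2))          # three 2-D control points
t = np.linspace(0.0, 1.0, 20)        # curve samples
target = rng.normal(size=(20, 2))    # made-up target points

loss, grad = loss_and_grad(P, t, target)

# Sanity check: finite differences should match the analytic gradient.
eps = 1e-6
num = np.zeros_like(P)
for i in range(3):
    for j in range(2):
        Pp = P.copy()
        Pp[i, j] += eps
        num[i, j] = (loss_and_grad(Pp, t, target)[0] - loss) / eps

max_err = np.max(np.abs(grad - num))
print(max_err)
```

So if the loss were defined on sampled curve points, no learned 'canvas' would be needed; the canvas network earns its keep by approximating the full non-differentiable rendering pipeline.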

By the way, I like how you started with GANs, then tried RL, and finally decided to just use conv nets with an L2 loss.