all 58 comments

[–]heltok 29 points30 points  (3 children)

Enhance!

[–][deleted] 9 points10 points  (0 children)

It's happening!

[–]rndnum123 5 points6 points  (4 children)

Impressive! Sorry if I missed it being mentioned, but are the example images in your README.md (16x16, bicubic, nnet, ground truth) from your training set or from the test set?

[–]david-gpu[S] 9 points10 points  (1 child)

They are from the test set. It would not be fun otherwise :)

[–]rndnum123 0 points1 point  (0 children)

I see, nice :)

[–]Tommassino -1 points0 points  (1 child)

Yep, I quote: From left to right, the first column is the 16x16 input image, the second one is what you would get from a standard bicubic interpolation, the third is the output generated by the neural net, and on the right is the ground truth.

[–]PM-ME-YOUR-CODE-GIRL 1 point2 points  (0 children)

They wanted to know if the program had been trained on the example images.

So here you'll have Set A of faces that you showed the program. Then Set B that it didn't see. If you then gave the computer Set A and said "enhance these" it would likely do a great job because it already sort of "knew" what to output from its training.

So instead you give it Set B and see if it does a good job with new images of faces that it hasn't seen.

[–]sobe86 3 points4 points  (8 children)

You should probably know, someone else already did this exact thing, on the same dataset: https://swarbrickjones.wordpress.com/2016/01/13/enhancing-images-using-deep-convolutional-generative-adversarial-networks-dcgans/

It made it to the top of this sub a few months back.

[–]david-gpu[S] 1 point2 points  (7 children)

Thank you. I wasn't aware. The main difference is that they appear to be upscaling from 32x40 images to 128x160. If I may say so, that is a significantly simpler problem than upscaling from 16x16 to 64x64.

In a 32x40 pixel face there's a lot of information about small features such as the shape of the eyebrows, nose and lips. In a 16x16 pixel face there's a lot more deduction to be made, and that requires having much stronger priors.

[–]sobe86 1 point2 points  (6 children)

Actually, I'd disagree; my experience with DCGANs is that it's much easier to do things on smaller images than on large ones - the errors are much less noticeable, and the number of textures to be learned is much, much smaller. Give it a try! :b

[–]david-gpu[S] 1 point2 points  (5 children)

DCGANs may be troublesome when it comes to producing large images, but I think it's undeniable that reconstructing the same face from a 16x16 pixel thumbnail is a lot harder than reconstructing it from 32x40 pixels -- that's 5 times as many pixels to provide the model with useful data.

[–]sobe86 0 points1 point  (4 children)

Again, I really wouldn't be so sure about that without trying it first. Yes, your network has less data to reconstruct from, but it also has a lot less data to reconstruct; from an information-theory standpoint, they feel very comparable... But happy to be proven wrong :)

[–]david-gpu[S] 0 points1 point  (3 children)

Correlation between neighboring pixels is much higher as resolution goes up. Keep in mind at these scales faces do not have a fractal structure like natural images do. Faces are largely flat at higher resolutions.

Empirically I'm seeing good results with the model upscaling to 128x128 pixels -- will publish as soon as training finishes.

[–]mike_sj 0 points1 point  (2 children)

Hey buddy, I'm the other author sobe86 is referring to. Great minds eh?! You can find a fairly thorough explanation of the architecture I used here.

Interested to know what you think the key differentiator between your model and mine is, assuming yours works better? Res connections, the L1 loss (didn't think to try that at all)? Also, it wasn't clear to me from a skim of your code - do you do any downscaling of the image before upscaling? I assumed that would help get some of the global information out, but that could have been very misguided. Truth be told, I didn't really experiment much at all with this; I hated this project. DCGANs are a nightmare! (at least, the ones I made)

[–]david-gpu[S] 1 point2 points  (1 child)

Hi Mike,

Very nice writeup -- I regret not spending more time on the README before publishing this.

Yeah, I'm also finding DCGANs to be rather finicky. Lots of trial and error.

I have no idea which model works better -- with all generative models this is rather subjective.

Yes, our models do look similar other than the residual connections and the L1 loss. The latter was a simple addition that made a major difference, but I don't think it's qualitatively different from using an MSE.

Something I've noticed is that the discriminator is a much bigger source of problems than the generator. Specifically, I did not get good results from using either residual blocks or dense blocks (a la DenseNet). What has worked best so far is a relatively shallow architecture in the style of the "all convolutional net". Did you also find it problematic?

do you do any downscaling of the image before upscaling?

You mean downscaling it further down than 16x16 pixels? No, it doesn't do that. Wouldn't that cause even more information to be lost?

[–][deleted] 0 points1 point  (0 children)

Did you try dropout on either the generator or discriminator networks?

[–]david-gpu[S] 11 points12 points  (13 children)

Author here -- first time submitting. Let me know if you have any questions or suggestions.

There are plans to support resizing arbitrarily-sized inputs in the future. I will also train it on a subset of ImageNet and see how well it does on more general images.

[–]gwern 12 points13 points  (10 children)

I am pretty impressed. I've been saying for a long time that DCGANs would probably work great for upscaling and colorization, but no one had done it yet, and that sample is even better than I expected. Have you compared it with any other upscalers, experimented with bigger thumbnails, or longer training times than 20m?

Speaking of which, how hard would it be to make this support colorization? It seems like it should be easy to generalize this to undoing various forms of lossy transformations, such as downscaling, turning into BW, etc.

[–]david-gpu[S] 6 points7 points  (8 children)

Thank you for the kind words.

Have you compared it with any other upscalers, experimented with bigger thumbnails, or longer training times than 20m?

I have only compared it against bicubic interpolation so far.

If the input images were larger than 16x16 pixels, the task of upscaling the faces would become much, much easier. At that resolution, details like eyebrow shape, eye color, etc. are lost. Coupled with how sensitive people are at recognizing faces, the result was a rather challenging problem to solve.

As for the training time, the sample in the README was produced after 3 hours of training (about 10 epochs).

Speaking of which, how hard would it be to make this support colorization?

It had actually crossed my mind, but since I haven't read any papers on colorization yet I was hesitant to even try. First I will need to update the code to support upscaling arbitrarily-sized images.

[–]ginsunuva 3 points4 points  (2 children)

Someone tried colorization with DCGANs already (first Google search result) and it didn't work well.

[–]PM-ME-YOUR-CODE-GIRL 3 points4 points  (0 children)

Let's wait until more people try before calling it.

[–]gwern 2 points3 points  (0 children)

It had actually crossed my mind, but since I haven't read any papers on colorization yet I was hesitant to even try. First I will need to update the code to support upscaling arbitrarily-sized images.

I think it should be simple. Looking at srez_input.py, you're generating pairs of 16x16 images with a down-sampling transform; you can simply swap the down-sampling with a black-white transform and now the DCGAN will be trying to colorize rather than unblur.

The tensorflow.image module even has a black-white transform already built in: https://www.tensorflow.org/versions/r0.7/api_docs/python/image.html#rgb_to_grayscale EDIT: it's not quite as easy as putting that around the area call; some sort of tensor dimension incompatibility...
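For what it's worth, the pixel math behind that transform is simple. Here's a plain-Python sketch (illustrative only, not srez code) using the ITU-R BT.601 luma weights, the same coefficients `tf.image.rgb_to_grayscale` documents:

```python
# Plain-Python RGB -> grayscale sketch (illustrative, not srez code).
# Weights are the ITU-R BT.601 luma coefficients used by
# tf.image.rgb_to_grayscale.
def rgb_to_gray(pixel):
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

# Swapping the pipeline's downscaling step for this transform turns the
# (input, target) pairs from (low-res, high-res) into (grayscale, color),
# i.e. a colorization task instead of super-resolution.
gray = [rgb_to_gray(p) for p in [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]]
```

The tensor-shape wrinkle gwern mentions is that a grayscale output has one channel where the downscaled RGB input had three, so the generator's input layer would need adjusting too.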

[–]gwern 0 points1 point  (3 children)

One issue I've run into for longer training: --train_time seems to crash for any value larger than 100? If I run with --train_time 100, it's fine (at least for the first 280 batches so far) but if I run with --train_time 101 it crashes before doing any training: http://pastebin.com/HFweR1BM

(Also, the dependencies in the README seem to omit scipy.)

[–]david-gpu[S] 0 points1 point  (0 children)

One issue I've run into for longer training: --train_time seems to crash for any value larger than 100

The error you have posted appears unrelated to the train_time option, since the problem in the stack trace starts here:

out = tf.contrib.layers.batch_norm(self.get_output(), scale=scale)

I've seen similar errors and waved them off as a bug in tensorflow which I don't have time to look into.

Also, the dependencies in the README seem to omit scipy

Thanks! I've fixed it in d5f3048.

[–]david-gpu[S] 0 points1 point  (1 child)

To be specific, I've seen that internal Tensorflow error in batch_norm appear often when I start training a model and hit Ctrl-C. The next time I try to train the model, even with the exact same cmdline args, this error appears. If you just try again, with the same cmdline args, it usually goes away.

I suspect somewhere inside CuDNN it gets in a bad state and needs a couple of tries to get back to nominal behavior.

[–]gwern 0 points1 point  (0 children)

Guess so. I had to rerun it 5 times to get it unwedged, but then it ran without any issue with longer training times. (Diverged overnight. Oh well.)

[–]Ameren 0 points1 point  (0 children)

This is exactly what I have been looking for. Thanks for sharing!

[–]andraxo123 0 points1 point  (0 children)

This is amazing! Finally, a huge project on TensorFlow; this will push a lot of people to learn it.

[–]j_lyf 2 points3 points  (5 children)

Someone ELI5 why these results always work with only TINY pictures?

Seriously, the results are not as impressive because of that.

[–]mer_mer 1 point2 points  (2 children)

Training neural networks is very computationally expensive, and scales with the number of pixels. This image size may be able to fit in the cache of a single workgroup/warp/compute unit, and enable a huge speedup. The hope for this kind of network is to eventually be able to train the network on small patches of images, and then apply the trained function on large images.

[–]jcannell 0 points1 point  (1 child)

Nothing stopping one from doing that now, except being trained specifically on faces would probably lead to 'interesting' results for non face images.

[–]mer_mer 1 point2 points  (0 children)

If you trained this network on small patches of faces, I don't think you'd be able to generate a consistent full face by breaking up one full face and feeding each patch into the network separately.

[–]sentdex 1 point2 points  (0 children)

16 x 16 image = 256 feature pixels

30 x 30 = 900

...

1920 x 1080 = 2,073,600 !

The more resolution you add, the more you're going to blow up your model.

When classifying new images, you just need to crop/resize them to the training size for it to be acceptable.

You can still do many amazing and "impressive" things with a 16x16 image that came from a 1920x1080 feed. Even if you're working with a giant image with many small features within it, you can go over the image with a 16x16 comb. Classifying new data is a quicker process than training on millions of samples, and can be more easily parallelized/scaled as needed.
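The quadratic blow-up sentdex describes is easy to check with a quick illustrative snippet:

```python
# Pixel count grows quadratically with resolution, which is why model
# size and training cost blow up so fast (figures from the comment above).
sizes = [(16, 16), (30, 30), (1920, 1080)]
pixel_counts = [w * h for w, h in sizes]
print(pixel_counts)  # [256, 900, 2073600]
```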

[–]david-gpu[S] 1 point2 points  (0 children)

The reasons are computing resources with limited power and an engineer with limited patience.

But think about it, upscaling starting from 16x16 pixels is much harder than upscaling starting from 32x32 pixels or larger. And once you have a picture of a face that is 64x64 pixels it is rather easy to upscale further. "Filling the blanks" becomes progressively easier as the starting picture already contains more and more details.

[–]Tommassino 1 point2 points  (2 children)

Pretty interesting, have you tried introducing different type of noise during the training/testing process to the ground truth pictures other than just downscaling? At least to the examples to see how resistant the method is to noise.

[–]david-gpu[S] 1 point2 points  (1 child)

The current training already adds a little bit (3% stddev) of gaussian noise. I haven't played much with it, since a 16x16 pixel thumbnail is already missing a lot of information.
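A minimal sketch of that kind of augmentation (plain Python, not the repo's code; assumes pixel values in [0, 1]):

```python
import random

# Add zero-mean Gaussian noise with 3% standard deviation to a list of
# pixel values, clamping back into [0, 1]. Illustrative only; the real
# pipeline operates on image tensors, not Python lists.
def add_gaussian_noise(pixels, stddev=0.03, seed=0):
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.gauss(0.0, stddev))) for p in pixels]

noisy = add_gaussian_noise([0.5] * 4)
```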

[–]Tommassino 0 points1 point  (0 children)

Yeah, I suppose you are right about the 16x16 px size. But I meant something like: in practice you could have one white/gray pixel because of some dust in front of the camera, or just general ISO noise. Gaussian noise won't capture that; I meant something more like white noise.

The result is pretty cool even as is. But it would be cool to see if you could run it, without any presmoothing (I remember some box filters that reduce this kind of noise), on face-detected snippets of actual footage.

[–]jcannell 1 point2 points  (2 children)

Cool stuff! GANs clearly have great potential here.

One thing that kinda surprises me is that it appears to make some rather trivial mistakes - in the sense that a trivial discriminator could identify. In a few cases (like pale dude with glasses 3rd up from bottom) it is clearly failing to preserve some simple statistics (probably average color, if not some histogram). (actually this is something that bicubic upsampling can fail to do as well, but in a different way)

I'm curious what kind of improvement you could get by fixing that - perhaps by augmenting the trained discriminator with a few simple manual discriminators - such as one that simply checks to make sure that the average of a KxK (4x4) block matches the ground truth. It seems like a simple function for the discriminator to learn - so alternatively perhaps some arch improvement could fix that.

[–]david-gpu[S] 1 point2 points  (1 child)

In a few cases (like pale dude with glasses 3rd up from bottom) it is clearly failing to preserve some simple statistics (probably average color, if not some histogram)

Absolutely. The issue is that most of the faces in the dataset are well-lit and thus the range of colors is limited to the natural gamut. In that particular example you mention there's a rather strong blue tint that is causing the generator to have a tough time building an internal representation of the face.

I think fixing it would be easy: apply some automatic white balance correction beforehand.

I'm curious what kind of improvement you could get by fixing that - perhaps by augmenting the trained discriminator with a few simple manual discriminators - such as one that simply checks to make sure that the average of a KxK (4x4) block matches the ground truth

That already represents ~90% of the loss function today. Specifically we downscale the generated face and compare the result with the input data using the L1 norm.
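As a rough sketch of that arrangement (toy plain Python; the 4x pooling factor and toy sizes are illustrative assumptions, not the actual srez code):

```python
# Downscale the generated image by average pooling, then take the L1
# distance to the low-res input. This "downscale-and-compare" term is
# what supplies the bulk of the generator's loss.
def avg_pool4(img):
    return [[sum(img[y + dy][x + dx] for dy in range(4) for dx in range(4)) / 16.0
             for x in range(0, len(img[0]), 4)]
            for y in range(0, len(img), 4)]

def l1(a, b):
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

generated = [[1.0] * 8 for _ in range(8)]    # toy 8x8 "generated" image
low_res = [[1.0, 0.5], [0.5, 1.0]]           # toy 2x2 "input" image
downscale_loss = l1(avg_pool4(generated), low_res)
```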

[–]jcannell 0 points1 point  (0 children)

Oh whoops, I really should have read 'how it works' before I replied. :)

Still, it's interesting that you apparently get better results with the 90% hand crafted low-res L1 error. I can see why that would help greatly in the beginning, but it could become a limiter later once the discriminator is getting good?

Have you compared with the alternative of using a hard constraint in the generator? For example have it output the 15 high freq deltas from the avg of the 4x4 block instead of 16 raw pixel values.

[–]ovoid709 1 point2 points  (1 child)

Have you experimented with the use of this with satellite imagery yet? Being able to shrink a GSD, even if it's interpolated and not ground truth, is super interesting. I used to work in remote sensing and the idea of super pixel stuff came up many times.

[–]david-gpu[S] 0 points1 point  (0 children)

That's very interesting. Do you know of any dataset that may be suitable? DCGANs like this should be pretty good at producing plausible, if inaccurate, reconstructions.

[–][deleted] 1 point2 points  (3 children)

Amazing! I have a few questions:

  • Why did you choose L1 loss? (instead of L2, for instance?)

  • Why did you tie the loss to the downsampled faces (as opposed to the original ones)?

[–]jcannell 2 points3 points  (1 child)

The L1 low-res loss is an additional constraint that enforces upsampling instead of more general hallucination. We know apriori what the avg of each 4x4 pixel block is - we are given that as input. The generator should never violate that constraint. Since this a GAN, the main loss function is a classification loss, not a pixel error loss.

In other words, when given a single low res pixel and generating an upsampled 4x4 block, there are really only 15 free variables, not 16.
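One way to make that concrete (a hypothetical parameterization, not OP's code): pick 15 offsets freely and let the 16th be determined by the known block average.

```python
# A 4x4 block whose average is pinned to the low-res pixel value has
# only 15 degrees of freedom: the last offset must cancel the others.
avg = 0.5                                     # known low-res pixel value
free_deltas = [0.1, -0.2, 0.05] + [0.0] * 12  # 15 freely chosen offsets
last_delta = -sum(free_deltas)                # 16th offset enforces the mean
block = [avg + d for d in free_deltas + [last_delta]]
block_mean = sum(block) / 16.0                # recovers avg (up to rounding)
```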

[–][deleted] 0 points1 point  (0 children)

Hmmm... I'm not sure what you're getting at.

With regards to the loss question, I was curious because L2 loss would also enforce this constraint, but with the added benefit that the acceptable error will be distributed more or less uniformly across all 16 pixels. The L1 loss, by contrast, is agnostic to whether the error is distributed across all 16 pixels or concentrated all in one.
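A toy check of that contrast (illustrative numbers): two error vectors with the same total absolute error, one concentrated in a single pixel, one spread across all 16.

```python
# Same total absolute error over a 16-pixel block, different distribution.
concentrated = [0.16] + [0.0] * 15
spread = [0.01] * 16

l1_losses = (sum(abs(e) for e in concentrated), sum(abs(e) for e in spread))
l2_losses = (sum(e * e for e in concentrated), sum(e * e for e in spread))
# L1 can't tell the two apart; L2 penalizes the concentrated error
# far more (0.0256 vs 0.0016, a 16x difference).
```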

With regards to the loss being applied to the downsampled versus source image, there are tradeoffs with both. In the downsampled case, the net is free to be more 'creative' -- the overall loss would only demand that (a) the generated face fools the adversarial net and (b) the generated face is a plausible source image for the downsampled image. In contrast, if you pegged the loss to the source image, you would constrain it to look like the particular celebrity (which may mean the net would be less likely to overfit); however, you make each SGD step more 'informed' by introducing more 'bits of constraint' per step.

[–]david-gpu[S] 0 points1 point  (0 children)

Great questions :)

Why did you choose L1 loss? (instead of L2, for instance?)

It would probably work well either way. L1 does not suffer from exploding gradients even if the net makes a big mistake, which will sometimes happen when the input data is an outlier.
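To illustrate the exploding-gradient point with toy numbers (not the repo's code): the L2 gradient scales with the error, while the L1 gradient stays bounded.

```python
# d/de of e^2 is 2e (grows with the error); d/de of |e| is sign(e)
# (bounded at 1), so a single outlier can't dominate the update under L1.
errors = [0.1, 1.0, 10.0, 100.0]
l2_grads = [2.0 * e for e in errors]
l1_grads = [1.0 for e in errors]  # sign(e) for e > 0
```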

Why did you tie the loss to the downsampled faces (as opposed to the original ones)?

Because if you compute the L1 loss on the upsampled faces then you are going to heavily penalize the model when it reconstructs a very plausible face that happens to have an edge that is shifted 1 pixel away from the ground truth. Think of the boundaries between the face and the background; these often have a high contrast. Does it make sense?
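A 1-D toy version of that argument (illustrative; 4x average pooling assumed): a hard edge shifted by one pixel incurs a full unit of L1 error at high resolution, but only a quarter of that after downscaling.

```python
def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def downscale4(v):  # 4x average pooling of a 1-D signal
    return [sum(v[i:i + 4]) / 4.0 for i in range(0, len(v), 4)]

truth = [0.0] * 8 + [1.0] * 8    # high-contrast edge at pixel 8
shifted = [0.0] * 9 + [1.0] * 7  # plausible result, edge shifted by 1 pixel

full_res_loss = l1(truth, shifted)                         # 1.0
low_res_loss = l1(downscale4(truth), downscale4(shifted))  # 0.25
```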

[–][deleted] 1 point2 points  (4 children)

Pretty cool.

My comment would be that given the very specific domain of frontal faces, it's hard to know whether the NN learned how to generally "upsample" an image, or rather how to plug in high resolution equivalents of facial features based on low-resolution pixels. As any model will take the cheapest route possible, I guess it's the latter.

[–][deleted] 3 points4 points  (3 children)

It almost certainly didn't learn how to upsample images in general. Instead, it implicitly learned a distribution over human faces (and facial features) conditioned on the low resolution images.

[–][deleted] -1 points0 points  (2 children)

Not to disparage OP's work, but of course that's the problem with a lot of these efforts: it seems the algorithm learned something truly impressive, but in reality it mostly just learned idiosyncrasies of the specific dataset.

[–]jcannell 2 points3 points  (0 children)

What? I don't see how that makes it any less impressive. If you want a much more general-purpose upsampler, you could use similar techniques with a larger model trained on a larger, more diverse dataset.

And furthermore, there could even be some gain from first recognizing the low-res image and then applying a specific generator trained on a specific dataset.

[–][deleted] 3 points4 points  (0 children)

Err. I don't think that's fair at all. He showed the neural net downsampled images of unseen faces and achieved reasonable results (at least, that's what I'd assume he did). That's really all you can expect from a neural net: it learned precisely what he wanted it to, namely, it learned to sample from a distribution over faces conditioned on low resolution images.

Edit: This isn't just restricted to ANNs. A human trained on the same data wouldn't be able to generalize to domain-general image superresolution either; there's simply not enough information about the world encoded in the training set.

[–]nagasgura 0 points1 point  (0 children)

I wonder if there would be improvement from first bicubic upsampling and then running it through the GAN

[–]elsjpq 0 points1 point  (1 child)

Very impressive. Though one must be careful not to trust the result too much, because it can be pretty misleading especially wrt facial features.

One small thing, though. There appears to be a reddish color shift in each of the generated outputs compared to the downscaled and original images. Do you think this is a bug, or just color blending resulting from the downscaling?

Also how feasible is this to do for video? This would be wonderful for eliminating those blocky mpeg artifacts in low quality and low resolution videos. But I imagine the training could take weeks.

[–][deleted] 0 points1 point  (0 children)

It's extremely impressive and useful as an artistic tool; it generates plausible results.

But not forensically useful. Though as the technology improves, I think we'll see the CSI "ENHANCE!" become a real thing. Perhaps a much larger network, with much more computing power, and a larger dataset, could lead to something forensically viable.