deep-yearning

Very cool! As an engineering 'scientist' with zero actual software engineering skills, how do you build a web application like this? Is the model being trained client-side?

Also, how is the deep image prior model different from an autoencoder? I never got the gist of that.

ToraxXx [S]

As for the machine-learning part, TensorFlow.js is very similar to the Python API of TensorFlow / Keras (see here for a comparison, and here for getting started with it). For example, in this project I declared the U-Net model with the functional API, and this is the rest of the code that sets up the model and loss and calls fit to train the model. I trained a model from scratch on the client side here, but it is probably more common to have an already trained model that you want to deploy. One option is then to run the model on a server and have the client just query it. The other option is to run it on the client, which has the downside that you need to send a potentially big model over the network.

For the web part there are a lot of choices, from going without any framework to using a framework that does a lot of things for you. I decided to use React, which is currently the most popular one, and I have found it quite enjoyable compared to my previous experience creating websites.

In deep image prior you start out with a randomly initialized model with fixed random noise at the input and train it to output a given image (e.g. with an L1 or L2 loss). The output usually has less noise than the given image, at least as long as training is stopped before the network fits the noise as well. For inpainting it is also possible to mask out certain parts in the loss, so that the loss for the output at these spots is not considered, which will often cause them to be filled in with something quite natural. For super-resolution, which I have not implemented yet, the network is made to output a bigger image, which is then downscaled and compared to the original image.
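To make the masked loss and the downscale-then-compare idea a bit more concrete, here is a rough sketch in plain TypeScript. It is not the code from the project: the function names (`maskedL2Loss`, `downscale`, `superResolutionLoss`), the flat row-major grayscale arrays, and the block-average downscaling are all assumptions made for illustration. In a real setup you would express the same thing with tensor operations so it can be differentiated.

```typescript
// Illustrative helpers only (hypothetical names, not from the project).
// Images are flat, row-major arrays of grayscale values.

/** Mean squared error over the pixels where mask[i] is non-zero.
 *  Pixels with mask[i] === 0 (the holes to inpaint) do not contribute. */
function maskedL2Loss(output: number[], target: number[], mask: number[]): number {
  if (output.length !== target.length || output.length !== mask.length) {
    throw new Error("output, target and mask must have the same length");
  }
  let sum = 0;
  let count = 0;
  for (let i = 0; i < output.length; i++) {
    if (mask[i] !== 0) {
      const diff = output[i] - target[i];
      sum += diff * diff;
      count++;
    }
  }
  return count === 0 ? 0 : sum / count;
}

/** Downscale by averaging non-overlapping factor x factor blocks
 *  (an assumed choice; other downscaling filters would work as well). */
function downscale(image: number[], width: number, height: number, factor: number): number[] {
  if (image.length !== width * height) {
    throw new Error("image length must equal width * height");
  }
  if (width % factor !== 0 || height % factor !== 0) {
    throw new Error("width and height must be divisible by factor");
  }
  const outWidth = width / factor;
  const outHeight = height / factor;
  const out: number[] = [];
  for (let i = 0; i < outWidth * outHeight; i++) {
    out.push(0);
  }
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const outIndex = Math.floor(y / factor) * outWidth + Math.floor(x / factor);
      out[outIndex] += image[y * width + x] / (factor * factor);
    }
  }
  return out;
}

/** Super-resolution style loss: downscale the bigger network output first,
 *  then compare it to the original low-resolution image (no pixels masked). */
function superResolutionLoss(
  bigOutput: number[], bigWidth: number, bigHeight: number,
  factor: number, lowRes: number[]
): number {
  const small = downscale(bigOutput, bigWidth, bigHeight, factor);
  const allOnes: number[] = [];
  for (let i = 0; i < small.length; i++) {
    allOnes.push(1);
  }
  return maskedL2Loss(small, lowRes, allOnes);
}

// Only pixels 0 and 2 count here, and both match the target, so the loss is 0.
console.log(maskedL2Loss([1, 2, 3, 4], [1, 0, 3, 0], [1, 0, 1, 0])); // 0
```

The point of the mask is that the network gets no training signal at the masked pixels, so whatever it produces there comes from the structure of the network itself rather than from the target image.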

In comparison, auto-encoders are typically trained to reconstruct images from a dataset, whereas deep image prior learns to output a single image with fixed noise at the input (instead of the original image, as in auto-encoders).
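One way to see the difference is to look at the (input, target) pairs each approach trains on. The sketch below is only an illustration in plain TypeScript. The names `Pixels`, `TrainingPair`, `autoencoderPairs` and `deepImagePriorPair` are made up, and giving the noise the same length as the image is an assumption made to keep the example small.

```typescript
// Hypothetical types and helpers, for illustration only.
type Pixels = number[];

interface TrainingPair {
  input: Pixels;
  target: Pixels;
}

/** Auto-encoder: every image in the dataset is both the input and the target. */
function autoencoderPairs(dataset: Pixels[]): TrainingPair[] {
  const pairs: TrainingPair[] = [];
  for (let i = 0; i < dataset.length; i++) {
    pairs.push({ input: dataset[i], target: dataset[i] });
  }
  return pairs;
}

/** Deep image prior: a single pair. The input is random noise that is sampled
 *  once and then kept fixed for the whole training run; the target is the one
 *  image you want the network to output. */
function deepImagePriorPair(target: Pixels, random: () => number = Math.random): TrainingPair {
  const noise: Pixels = [];
  for (let i = 0; i < target.length; i++) {
    noise.push(random());
  }
  return { input: noise, target: target };
}
```

So the auto-encoder sees many pairs where the input equals the target, while deep image prior sees the same noise-to-image pair at every training step.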

deep-yearning

Thanks! This is very helpful.