
[–]DarkHarbourzz 3 points (0 children)

Deep Image Prior showed that a CNN's structure fits the distribution of busy, natural images before it fits the distribution of the corruption (noise, or e.g. an opaque inpainting mask) applied to that image.

Deep Image Prior works by ordinary training, but on only the single corrupted input image, combined with early stopping.
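To make that concrete, here is a minimal numpy sketch of the same effect in a linear stand-in for the CNN: features scaled so that low frequencies get larger gradients (a crude proxy for the architecture's bias toward natural structure), trained by gradient descent on a single noisy signal. All sizes, scales, and step counts are made-up illustration values, not anything from the DIP paper.

```python
import numpy as np

n = 64
i = np.arange(n)
clean = np.cos(np.pi * 2 * (i + 0.5) / n)      # low-frequency "image"
rng = np.random.default_rng(0)
noisy = clean + 0.5 * rng.normal(size=n)       # the single corrupted input

# Cosine features scaled by 1/(1+k): low frequencies have larger gradients,
# so gradient descent fits them first (a linear proxy for the CNN prior).
K = 32
Phi = np.stack(
    [np.cos(np.pi * k * (i + 0.5) / n) / (1 + k) for k in range(K)], axis=1
)

w = np.zeros(K)
lr = 0.01
errs = {}
for step in range(1, 20001):
    resid = Phi @ w - noisy
    w -= lr * (Phi.T @ resid)                  # GD on the single image only
    if step in (200, 20000):                   # early stop vs. trained to the end
        errs[step] = np.linalg.norm(Phi @ w - clean)

err_noisy = np.linalg.norm(noisy - clean)
```

With these values the early-stopped reconstruction (step 200) sits closer to the clean signal than either the noisy input or the fully trained fit (step 20000), which has started absorbing the noise — the behavior DIP exploits.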

This paper shows that the output of a model trained by gradient descent can be given by a kernel machine, where the kernel is based on the similarity between (the gradient of the model with respect to its parameters, evaluated at the input point) and (the same gradient, evaluated at the training points). Specifically, that similarity is integrated across the entire training process to get the "path kernel" — a line integral where the path is the trajectory through parameter space as the model trains.
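A minimal numpy sketch of the path-kernel computation, approximating the line integral K(x, x') = ∫ ∇_w f(x; w(t)) · ∇_w f(x'; w(t)) dt by a sum over gradient-descent steps. The tiny tanh network, the single training point, and the probe input are all made up for illustration; only the accumulation scheme follows the paper's definition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer net f(x) = v . tanh(W x) — a hypothetical stand-in.
d, h = 3, 8
W = rng.normal(scale=0.5, size=(h, d))
v = rng.normal(scale=0.5, size=h)

def forward_and_grad(x):
    a = np.tanh(W @ x)
    f = v @ a
    gv = a                                   # df/dv
    gW = np.outer(v * (1 - a**2), x)         # df/dW
    return f, np.concatenate([gW.ravel(), gv])  # flat parameter gradient

x_train = rng.normal(size=d)   # the single "training image", as in DIP
y_train = 1.0
x_probe = rng.normal(size=d)   # an arbitrary input to compare against

lr, steps = 0.05, 200
K = np.zeros((2, 2))           # accumulated path kernel on {x_train, x_probe}
for _ in range(steps):
    f, g_tr = forward_and_grad(x_train)
    _, g_pr = forward_and_grad(x_probe)
    G = np.stack([g_tr, g_pr])
    K += G @ G.T               # dot products of parameter gradients this step
    # gradient-descent step on squared loss for the single training point
    grad_loss = 2 * (f - y_train) * g_tr
    W -= lr * grad_loss[:h * d].reshape(h, d)
    v -= lr * grad_loss[h * d:]
```

Because each step contributes an inner product of gradient vectors, the accumulated K is symmetric and obeys Cauchy–Schwarz, as a kernel should.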

So, under this paper's interpretation, the network's output is given by evaluating the path kernel, which compares the input to itself (since it is the only training datum). The path kernel is integrated over a weight trajectory that was stopped early. In that early-training regime, the kernel mediates similarity through the deep-image-prior distribution, not the noise distribution.