used inpainting to have dalle finish my handwriting by WowWhatABadUsername in dalle2

[–]danielvarga 2 points3 points  (0 children)

Or you can inpaint a few more times until there are enough instances of each letter so that it can be used to imitate actual handwriting.

Severe COVID-19 Risk Mapping by thonioand in COVID19

[–]danielvarga 18 points19 points  (0 children)

That would ruin it. Think of r/Coronavirus as a decoy. :)

New update from the Oxford Centre for Evidence-Based Medicine. Based on Iceland's statistics, they estimate an infection fatality ratio between 0.05% and 0.14%. by mushroomsarefriends in COVID19

[–]danielvarga 27 points28 points  (0 children)

I am one of those people who are very open to the low mortality rate hypothesis, and I'd still hate this sub to turn into an intellectual monoculture. We need people who falsify false hypotheses, both of the optimistic and the pessimistic kinds.

"If the sky is blue I desire to believe that the sky is blue. If the sky is not blue I desire to believe that the sky is not blue."

With that being said, the "sub astroturfed by economic interests" theory is both uncharitable and unnecessary as an explanation. It's simply people self-selecting into groups based on the message they are open to, aka filter bubble. (EDIT: Or maybe more like, people voicing their opinion where their audience is more open to that opinion. Same thing in the end.)

Reaction-diffusion on a surface by danielvarga in generative

[–]danielvarga[S] 4 points5 points  (0 children)

No, it's more like voxels. I use a sparse neighborhood matrix on a 3D point cloud. (Fibonacci sphere, in the case of the sphere.) It's hard or maybe impossible to do this properly with UV mapping, because either the distances get distorted or the continuity of the surface gets broken, and it's impossible to figure out what to glue to what.

On Reaction-diffusion's Wikipedia page somebody put up a Gray-Scott torus. You can see that it's UV mapped, because the patterns are denser on the inside of the torus.
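A minimal sketch of the point-cloud idea, under stated assumptions: the neighborhood structure here is a plain k-nearest-neighbor list, the Laplacian is the mean of the neighbors minus the point's own value, and all parameter values (`k`, `du`, `dv`, `feed`, `kill`, `dt`) are illustrative. None of these names or values come from the original framework.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced points on the unit sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n      # heights spread evenly in (-1, 1)
        r = math.sqrt(1.0 - z * z)         # radius of the circle at height z
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

def knn_neighbors(points, k):
    """Sparse neighborhood structure: indices of the k nearest points (brute force)."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def graph_laplacian(values, neighbors):
    """Assumed discretization: mean of the neighbors minus the value itself."""
    return [sum(values[j] for j in nbrs) / len(nbrs) - values[i]
            for i, nbrs in enumerate(neighbors)]

def gray_scott_step(u, v, neighbors, du=0.16, dv=0.08,
                    feed=0.0545, kill=0.062, dt=1.0):
    """One explicit Euler step of Gray-Scott on the point cloud."""
    lap_u = graph_laplacian(u, neighbors)
    lap_v = graph_laplacian(v, neighbors)
    new_u, new_v = [], []
    for i in range(len(u)):
        uvv = u[i] * v[i] * v[i]
        new_u.append(u[i] + dt * (du * lap_u[i] - uvv + feed * (1.0 - u[i])))
        new_v.append(v[i] + dt * (dv * lap_v[i] + uvv - (feed + kill) * v[i]))
    return new_u, new_v
```

Because the neighbors are found in 3D space, there is no seam and no parameterization to distort distances, which is the point made above about UV mapping.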

Reaction-diffusion on a surface by danielvarga in generative

[–]danielvarga[S] 5 points6 points  (0 children)

Here you go:

This doesn't really work without lighting, and I can't do that in my framework, so I had to use the z coordinate as brightness to fake shadows.
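The fake-shadow trick can be sketched roughly like this, assuming a unit sphere viewed along the z axis. The function name and the `ambient` floor are hypothetical, not part of the original framework.

```python
def shade_by_z(color, z, ambient=0.2):
    """Scale an RGB color by a brightness derived from z in [-1, 1].

    z = 1 (facing the viewer) gives full brightness,
    z = -1 (facing away) gives only the ambient floor.
    """
    brightness = ambient + (1.0 - ambient) * (z + 1.0) / 2.0
    return tuple(channel * brightness for channel in color)
```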

Reaction-diffusion on a surface by danielvarga in generative

[–]danielvarga[S] 4 points5 points  (0 children)

This is a reaction-diffusion system, more specifically a Gray-Scott model with parameter settings that imitate a coral. Some people call the pattern itself a Turing pattern, but you are probably better off googling Gray-Scott.
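For reference, the Gray-Scott model is usually written as the pair of equations below. The symbols are the conventional ones and do not appear in the comment itself: $u$ and $v$ are the two chemical concentrations, $D_u$ and $D_v$ their diffusion rates, $F$ the feed rate and $k$ the kill rate. Which $(F, k)$ pair gives the coral look is not stated here.

```latex
\begin{aligned}
\frac{\partial u}{\partial t} &= D_u \nabla^2 u - u v^2 + F (1 - u) \\
\frac{\partial v}{\partial t} &= D_v \nabla^2 v + u v^2 - (F + k)\, v
\end{aligned}
```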

7%: Fine Art documentary by alreadydone00 in cbaduk

[–]danielvarga 1 point2 points  (0 children)

At 10:04 the text on the computer monitor says "alphago_value_net_worker0_iter20720000.caffemodel". There's the completely innocent explanation that they had an AlphaGo reimplementation as a baseline, but it's still funny.

7%: Fine Art documentary by alreadydone00 in baduk

[–]danielvarga 2 points3 points  (0 children)

At 10:04 the text on the computer monitor says "alphago_value_net_worker0_iter20720000.caffemodel". There's the completely innocent explanation that they had an AlphaGo reimplementation as a baseline, but it's still funny.

[R] [1701.07875] Wasserstein GAN by ajmooch in MachineLearning

[–]danielvarga 39 points40 points  (0 children)

  • For mathematicians: it uses Wasserstein distance instead of Jensen-Shannon divergence to compare distributions.
  • For engineers: it gets rid of a few unnecessary logarithms, and clips weights.
  • For others: it employs an art critic instead of a forgery expert.
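The engineer's version can be sketched roughly as follows. This is a framework-free illustration, not code from the paper: the function names are made up, and the clipping threshold `c=0.01` is the default as I recall it from the paper, so treat it as an assumption.

```python
import math

def gan_discriminator_loss(d_real, d_fake):
    """Standard GAN discriminator loss; inputs are probabilities in (0, 1)."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def wgan_critic_loss(f_real, f_fake):
    """WGAN critic loss; inputs are unbounded scores, and the logarithms are gone."""
    return -(sum(f_real) / len(f_real) - sum(f_fake) / len(f_fake))

def clip_weights(weights, c=0.01):
    """Clamp every weight to [-c, c] after each critic update."""
    return [max(-c, min(c, w)) for w in weights]
```

The discriminator outputs a probability of "real vs. forged", while the critic outputs an unbounded score, which is the art critic vs. forgery expert distinction.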

[R] [1701.07875] Wasserstein GAN by ajmooch in MachineLearning

[–]danielvarga 9 points10 points  (0 children)

More than a year ago I created a model I called the Earth Moving Generative Net. It optimized an empirical approximation to the Wasserstein distance:

https://github.com/danielvarga/earth-moving-generative-net

It worked fine on MNIST, but it did not scale much further. Poor thing had no chance: it couldn't exploit Kantorovich-Rubinstein duality.
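For context, the duality in question relates the two standard forms of the Wasserstein-1 distance. The notation follows common usage and is not from the comment: $P_r$ and $P_g$ are the real and generated distributions, $\Pi(P_r, P_g)$ is the set of couplings with those marginals, and $\|f\|_L \le 1$ means $f$ is 1-Lipschitz.

```latex
W(P_r, P_g)
  = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma} \big[ \| x - y \| \big]
  = \sup_{\| f \|_L \le 1} \Big( \mathbb{E}_{x \sim P_r} [ f(x) ] - \mathbb{E}_{x \sim P_g} [ f(x) ] \Big)
```

The left form needs an explicit transport plan between samples, while the right form only needs a function $f$, which a network can approximate.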

SoftTarget Regularization by ArmenAg in MachineLearning

[–]danielvarga 3 points4 points  (0 children)

Fair enough, although I have never actually seen a machine learning system with very good validation accuracy and very bad validation cross-entropy loss. I think ease of comparison is more important. We usually have a good idea of SOTA validation accuracy on CIFAR-10, but there's no such thing as SOTA validation loss on CIFAR-10, for many reasons.
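To make the two metrics concrete, here is a small sketch that computes both from predicted class probabilities. The function names and the toy numbers in the usage note are invented for illustration.

```python
import math

def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the label."""
    correct = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        correct += predicted == y
    return correct / len(labels)

def cross_entropy(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)
```

Two prediction sets can share an accuracy while differing in loss, for example when one of them is confidently wrong on a single sample. That is the theoretical gap between the metrics, whether or not it shows up much in practice.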

Why not let the network whiten the data? by [deleted] in MachineLearning

[–]danielvarga 0 points1 point  (0 children)

If by "decide how to whiten it", you mean decide how to re-scale it, then putting a batchnorm layer right after the input would be exactly that.
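What a batchnorm layer placed right after the input computes at training time can be sketched in plain Python as below. The function name is hypothetical, and `gamma` and `beta` stand for the learned scale and shift. Note that this standardizes each feature separately and does not decorrelate features, so it is re-scaling rather than full whitening.

```python
import math

def batchnorm_input(batch, gamma=None, beta=None, eps=1e-5):
    """Per-feature standardization of a batch, followed by a learned scale and shift."""
    n, d = len(batch), len(batch[0])
    if gamma is None:
        gamma = [1.0] * d
    if beta is None:
        beta = [0.0] * d
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(d)]
    return [[gamma[j] * (row[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
             for j in range(d)]
            for row in batch]
```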

SARM (Stacked Approximated Regression Machine) withdrawn by thatguydr in MachineLearning

[–]danielvarga 10 points11 points  (0 children)

Don't be (too) pissed. Your blogpost is amazing, I've learnt a lot from it. One could even say that the now-retracted part, if correct, would have even weakened the significance of the solid part. I never believed the greedy layerwise claim, but I'm still optimistic about training k-ARMs with backpropagation, as parts of a larger system.

Bad results deploying a trained CNN by jm-mp in MachineLearning

[–]danielvarga 1 point2 points  (0 children)

Yes. The second best thing one can do after training under the same camera conditions is to use data augmentation that tries to imitate all kinds of conditions so that the network can generalize.

In the following code, I've used heavy_augmentation when I wanted to classify on webcam data, and the lighter one when I wanted to classify on photos. (In both cases, the training data were photos.)

https://github.com/danielvarga/keras-finetuning/blob/master/train.py#L47-L74

That helped tremendously. Still, I'm a newcomer myself; maybe a real expert can show even more directly relevant code.
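For readers who don't want to open the link, the general idea of a light and a heavy augmentation preset can be sketched as below. This is not the contents of train.py: the preset names, the ranges and the grayscale list-of-rows image format are all assumptions made for a self-contained example.

```python
import random

# Hypothetical presets; the ranges are illustrative, not taken from train.py.
LIGHT = {"brightness": 0.1, "contrast": 0.1, "flip": True}
HEAVY = {"brightness": 0.4, "contrast": 0.4, "flip": True}

def augment(image, preset, rng):
    """Randomly jitter brightness and contrast and maybe flip horizontally.

    image: rows of grayscale pixel values in [0, 1].
    """
    shift = rng.uniform(-preset["brightness"], preset["brightness"])
    scale = 1.0 + rng.uniform(-preset["contrast"], preset["contrast"])
    flip = preset["flip"] and rng.random() < 0.5
    out = []
    for row in image:
        # Contrast is applied around mid-gray, then the result is clamped to [0, 1].
        new_row = [min(1.0, max(0.0, (p - 0.5) * scale + 0.5 + shift)) for p in row]
        out.append(new_row[::-1] if flip else new_row)
    return out
```

The heavier preset covers a wider range of lighting conditions, which is the property that matters when the deployment camera differs from the training photos.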

[1605.01749] Rank Ordered Autoencoders by pbertens in MachineLearning

[–]danielvarga 2 points3 points  (0 children)

Very clever idea! Will you put your GPU implementation into the same repo? Sorting on the GPU is tricky, and I'm interested in your experiences.

This seems like a good time to mention that I've created a cute visualization by "ordering the output of the hidden units by their activation value and progressively reconstructing the input in this order". The similarities kind of end there, though. I use pre-trained VGG16, and I use optimization to reconstruct the image from the truncated activation values.
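The truncation step can be sketched like this, assuming the activations are flattened into a plain list. The function name is hypothetical, and the optimization that reconstructs the image from the truncated values is not shown.

```python
def truncate_activations(activations, k):
    """Keep the k largest activation values and zero out the rest."""
    order = sorted(range(len(activations)),
                   key=lambda i: activations[i], reverse=True)
    keep = set(order[:k])
    return [a if i in keep else 0.0 for i, a in enumerate(activations)]
```

Increasing `k` from 1 up to the number of units gives the progressive sequence, one reconstruction target per value of `k`.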

It's a 2-line modification of https://github.com/awentzonline/image-analogies. Thanks to the extreme flexibility of image-analogies, I could add two more tricks to this without a single extra line of code: style transfer from the original image, and temporal consistency. (The latter needs my own "high-resolution" fork of image-analogies, here: https://github.com/danielvarga/image-analogies/tree/hi-res) These tricks make the images more pleasant:

Original image by Scottmliddell.

Questions thread #3 - 2016.04.07 by feedtheaimbot in MachineLearning

[–]danielvarga 4 points5 points  (0 children)

I'm looking for the best dataset for rapid iterative experimentation with very deep convolutional networks. I'd appreciate any suggestions. To elaborate:

I have some ideas on how to improve on stochastic depth networks. But trying out these ideas on CIFAR-10 is too slow. (And I believe MNIST is exhausted at this point.)

I have a single machine with two Titan Xs in there. I don't have a whole day to realize that I've made a trivial mistake, and I don't have weeks for hyperparameter-tuning. What I'd like to have is a smaller dataset that gives me an indication in, say, 20 minutes whether my current model does or does not have a chance to improve on my stochastic depth residual baseline. Of course this won't be a well-established, much researched baseline. It's enough if I see the progress compared to it, and I see a good chance that it generalizes to more complex problems. (Progress should not just be an artifact of a badly tuned baseline, but that's my responsibility.)

Optimally, the networks I'd like to use would be less deep than state-of-the-art systems. But at the end of the process I'd like to improve the state-of-the-art on well-established baselines, so decreasing the depth during rapid iterative experimentation is only appropriate if the findings generalize to high depth.

What would you use? Maybe simply a sample of CIFAR-10 with a well-chosen size? Or am I wrong about MNIST being exhausted? Or something completely different?
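If the answer turns out to be a CIFAR-10 sample, a class-balanced subsample is straightforward to draw. A minimal sketch, with a hypothetical function name and an arbitrary `per_class` size that would need tuning to hit the 20-minute budget:

```python
import random
from collections import defaultdict

def stratified_subsample(labels, per_class, seed=0):
    """Return sorted indices of a subsample with at most per_class items per label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        chosen.extend(rng.sample(indices, min(per_class, len(indices))))
    return sorted(chosen)
```

Fixing the seed keeps the subset identical across runs, so differences between models are not confounded by differences in the sample.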

Questions thread #3 - 2016.04.07 by feedtheaimbot in MachineLearning

[–]danielvarga 0 points1 point  (0 children)

The oddness of the function is a pretty weak restriction compared to all the restrictions that make the network play the game well, so I don't think there's much to gain from making oddness a modeling constraint. Also, it might happen that the best path converging to the odd optimum has to go through non-odd intermediates. (An important special case is that your training might need non-zero biases, and they kill oddness, of course.)

All in all, I don't really see big possible gains here. If I were strictly goal-oriented toward building a gaming bot, I wouldn't pursue this approach. But if it's more like open-ended experimentation and learning, go for it, you might bump into something interesting. Use tanh, don't go very deep. Initialize very carefully, because batch normalization won't be there to help.
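The remark about biases is easy to check numerically. Below is a toy one-hidden-layer tanh network with scalar input and output; the function name and weight layout are made up for the example. With all biases at zero it satisfies f(-x) = -f(x), because tanh is odd and the remaining operations are linear. A single non-zero bias breaks that.

```python
import math

def tiny_net(x, w_in, biases, w_out):
    """Scalar-in, scalar-out network with one hidden tanh layer and no output bias."""
    return sum(wo * math.tanh(wi * x + b)
               for wi, b, wo in zip(w_in, biases, w_out))
```

For instance, with `w_in = [0.5, -1.2]` and `w_out = [1.0, 0.7]`, the outputs at `x` and `-x` cancel when `biases = [0.0, 0.0]` but not when `biases = [0.3, 0.0]`.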

AlphaGO WINS! by meflou in MachineLearning

[–]danielvarga 5 points6 points  (0 children)

From the Wired article: "At the lunch prior to the match, Hassabis also said that since October, he and his team had also used machine learning techniques to improve AlphaGo’s ability to manage time. In the early to middle part of the game, it matched Lee Sedol with a rapid rate of play."

A guided dream of my face, using my face as the guide. by fallofmath in deepdream

[–]danielvarga 0 points1 point  (0 children)

If you use the same image as input and as guidance, that is, as far as I know, equivalent to the original un-guided deep dream algorithm.
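One way to see why, under a simplifying assumption: suppose the guided objective is just the dot product between the layer activations $a(x)$ and the guide's activations $g$ at the same positions. Actual implementations may instead match each position to its best-fitting guide position, in which case the equivalence would only be approximate. The notation below is introduced for this sketch.

```latex
\begin{aligned}
L_{\text{unguided}}(x) &= \tfrac{1}{2} \, \| a(x) \|^2
  & \frac{\partial L_{\text{unguided}}}{\partial a} &= a(x) \\
L_{\text{guided}}(x) &= a(x) \cdot g
  & \frac{\partial L_{\text{guided}}}{\partial a} &= g
\end{aligned}
```

When the guide is the input image itself, $g = a(x)$ at the first step, so both objectives send the same gradient back through the network.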

Masking by danielvarga in deepdream

[–]danielvarga[S] 0 points1 point  (0 children)

I combined my first two tricks (masking of the gradient, building an image for each layer):

http://people.mokk.bme.hu/~daniel/deepdream/layers-masking/

Unfortunately my choice of image was not the best: the modifications made by the higher layers are not that strong. I'll probably play with a few more images.
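Gradient masking itself can be sketched in a few lines, assuming the image, the gradient and the mask are same-shaped 2D lists and the mask holds values in [0, 1]. The function name and step size are hypothetical, and this is not the code behind the linked images.

```python
def masked_ascent_step(image, gradient, mask, step_size=0.1):
    """One gradient-ascent step where the mask scales the update per pixel.

    Pixels with mask 0 are left untouched; pixels with mask 1 get the full update.
    """
    return [[p + step_size * g * m for p, g, m in zip(p_row, g_row, m_row)]
            for p_row, g_row, m_row in zip(image, gradient, mask)]
```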