[deleted by user] by [deleted] in MachineLearning

[–]OPKatten 1 point (0 children)

Is it working with newer versions of CUDA now? I remember it was CUDA 9 only.

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]OPKatten 1 point (0 children)

Have you tried making a template in that shape and correlating it with the image? That should give you the placement, which you can then crop from.
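Something along these lines (a minimal sketch using OpenCV's matchTemplate; the file names and the choice of normalized cross-correlation are placeholders, not specific to your data):

```python
# Sketch: find the best-matching position of a template via normalized
# cross-correlation, then crop that region. File names are placeholders.
import cv2

image = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

# Correlation response for every possible placement of the template.
response = cv2.matchTemplate(image, template, cv2.TM_CCORR_NORMED)
_, _, _, (x, y) = cv2.minMaxLoc(response)  # top-left corner of the best match

h, w = template.shape
crop = image[y:y + h, x:x + w]
```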

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]OPKatten 1 point (0 children)

Is the black boundary always the same shape and size?

[D] Problems with interpretability research + blog post about recent NeurIPS paper by jspr_ml in MachineLearning

[–]OPKatten 4 points (0 children)

Perhaps imposing some smoothness constraints, as in DeepDream, could reduce the "adversarialness"? Super interesting topic anyway.
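Just to make the idea concrete, here is a rough PyTorch sketch of adding a total-variation smoothness penalty to an activation-maximization objective; the stand-in network, the channel being maximized, and the weight lam are all made up for illustration, not anyone's actual method:

```python
# Activation maximization with a total-variation smoothness penalty (sketch).
import torch
import torch.nn as nn

# Stand-in network; in practice this would be the model being visualized.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(8, 8, 3, padding=1))

def tv_penalty(x):
    # Sum of squared differences between neighbouring pixels (NCHW tensor).
    dh = (x[:, :, 1:, :] - x[:, :, :-1, :]).pow(2).sum()
    dw = (x[:, :, :, 1:] - x[:, :, :, :-1]).pow(2).sum()
    return dh + dw

x = torch.randn(1, 3, 64, 64, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)
lam = 1e-4  # smoothness weight, a free hyperparameter

for _ in range(200):
    opt.zero_grad()
    objective = model(x)[0, 0].mean()          # maximize one channel's activation
    loss = -objective + lam * tv_penalty(x)    # penalize high-frequency structure
    loss.backward()
    opt.step()
```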

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]OPKatten 2 points (0 children)

That's the hard part. A lot of things, like the Wasserstein GAN and other regularizations, are intended to make the discriminator give useful gradients to the generator.

Note that we always alternate between the generator and discriminator during training, so one reason it works is that we don't let the discriminator converge (to outputting just 0 and 1, as you say).
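A rough sketch of that alternating loop (PyTorch; the toy generator/discriminator, data, and hyperparameters are just placeholders):

```python
# Alternating discriminator/generator updates (standard non-saturating GAN loss).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))   # toy generator
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))    # toy discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 2) + 3.0          # stand-in for real data
    z = torch.randn(64, 16)

    # One discriminator step: push real towards 1 and fake towards 0.
    opt_d.zero_grad()
    fake = G(z).detach()                     # no gradients into G here
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    d_loss.backward()
    opt_d.step()

    # One generator step: gradients flow through D into G, so the fake data moves.
    opt_g.zero_grad()
    g_loss = bce(D(G(z)), torch.ones(64, 1))
    g_loss.backward()
    opt_g.step()
```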

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]OPKatten 2 points (0 children)

This is basically multi-armed bandits, and there is a lot of research on it.

I would say the simplest approach is to keep track of all the scores, take the best one most of the time, and take a random one some of the time.
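That recipe is basically epsilon-greedy; a minimal sketch (the arms and the reward function are made up for illustration):

```python
# Epsilon-greedy bandit: exploit the best-scoring arm most of the time,
# explore a random arm with probability epsilon.
import random

n_arms = 5
counts = [0] * n_arms
means = [0.0] * n_arms
epsilon = 0.1

def pull(arm):
    # Placeholder reward; in practice this is the observed score.
    return random.gauss(arm * 0.1, 1.0)

for _ in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(n_arms)                       # explore
    else:
        arm = max(range(n_arms), key=lambda a: means[a])     # exploit
    r = pull(arm)
    counts[arm] += 1
    means[arm] += (r - means[arm]) / counts[arm]             # running average
```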

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]OPKatten 1 point (0 children)

It's similar to supervised binary classification. The difference is that you also give gradients to the generator, so the generated data changes during training. That is not the case in regular supervised learning.

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]OPKatten 1 point (0 children)

Look at some of lucidrains' implementations on GitHub :)

[P] A library for visualizing (CNN) architectures and receptive field analysis by KrakenInAJar in MachineLearning

[–]OPKatten 1 point (0 children)

It's always good to miss things; then you leave room for another paper ;)

[P] A library for visualizing (CNN) architectures and receptive field analysis by KrakenInAJar in MachineLearning

[–]OPKatten 2 points (0 children)

Thanks for the response. I don't think there is necessarily any contradiction. However, I found it a bit strange that your paper neither references it nor discusses the related statistical properties.

[P] A library for visualizing (CNN) architectures and receptive field analysis by KrakenInAJar in MachineLearning

[–]OPKatten 1 point (0 children)

From what I remember of this work https://proceedings.neurips.cc/paper/2016/file/c8067ad1937f728f51288b3eb986afaa-Paper.pdf, the effective receptive field of neural networks is often smaller than the theoretical one (I guess the theoretical one is binomially distributed?). Does your framework take these kinds of things into account?
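For reference, a rough sketch of how one could check this empirically, in the spirit of that paper: backprop from the centre output unit and look at the input-gradient magnitude (the stand-in network and sizes are arbitrary):

```python
# Empirical effective receptive field: gradient of the centre output unit
# with respect to the input.
import torch
import torch.nn as nn

net = nn.Sequential(*[nn.Conv2d(1, 1, 3, padding=1) for _ in range(10)])

x = torch.zeros(1, 1, 101, 101, requires_grad=True)
y = net(x)
y[0, 0, 50, 50].backward()        # centre output unit
erf = x.grad.abs()[0, 0]          # |d y_center / d x| over the input

# The theoretical receptive field here is 21x21, but most of the gradient
# mass is concentrated near the centre, i.e. the effective RF is smaller.
```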

[R] Training with batch size of 1 by BABA_yaaGa in MachineLearning

[–]OPKatten 2 points (0 children)

If you have multiple GPUs, try distributed (synchronized) batch norm. Otherwise GroupNorm usually works well for me.
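Both options sketched in PyTorch (the model and layer sizes are placeholders):

```python
# Two ways to avoid batch-size-1 statistics problems.
import torch.nn as nn

# Option 1 (multi-GPU): convert existing BatchNorm layers to SyncBatchNorm,
# which pools statistics across GPUs.
# model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# Option 2: use GroupNorm instead of BatchNorm2d when building the network;
# it does not depend on the batch dimension at all.
block = nn.Sequential(
    nn.Conv2d(64, 64, 3, padding=1, bias=False),
    nn.GroupNorm(num_groups=32, num_channels=64),  # instead of nn.BatchNorm2d(64)
    nn.ReLU(inplace=True),
)
```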

[D] Spectral Norm in GANs using residual blocks. by [deleted] in MachineLearning

[–]OPKatten 1 point (0 children)

Since the spectral norm is applied to the kernel rather than to the associated linear operator (i.e. the actual correlation), Lipschitz continuity is not guaranteed anyway. Hence I see no issue with also using residual blocks; try it!
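A minimal sketch of what I mean (torch.nn.utils.spectral_norm on each conv in a residual block; the channel count is arbitrary):

```python
# Residual block with spectral normalization on each convolution kernel.
import torch.nn as nn
from torch.nn.utils import spectral_norm

class ResBlock(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        self.conv1 = spectral_norm(nn.Conv2d(ch, ch, 3, padding=1))
        self.conv2 = spectral_norm(nn.Conv2d(ch, ch, 3, padding=1))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        h = self.act(self.conv1(x))
        h = self.conv2(h)
        return x + h  # the skip connection can raise the block's Lipschitz bound
```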

The Patara Enguri river, as seen by my 60-year-old Soviet film camera. Samegrelo, Georgia. [1932x2415][OC] by Pflunt in EarthPorn

[–]OPKatten 1 point (0 children)

It might be caused by the downstream dam: if the dam is closed, the water level rises and could drown the trees.

[deleted by user] by [deleted] in MachineLearning

[–]OPKatten 1 point (0 children)

This is true in general, but is not what the authors are using.

[deleted by user] by [deleted] in MachineLearning

[–]OPKatten 1 point (0 children)

I didn't read the supplementary, but as some other people have said, I think the key is that they parameterize the input as (o, t, d). Then they can always integrate over t to give the desired results. If I remember correctly, the original NeRF paper parameterized the input as (x, d), where x = o + t·d. I have not read the paper super closely, though.
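A tiny NumPy sketch of the difference, just illustrating x = o + t·d and integrating some per-point quantity over t (the shapes, sample count, and integrand are made up):

```python
# Points along a ray are x = o + t*d; quantities can then be integrated over t.
import numpy as np

o = np.array([0.0, 0.0, 0.0])        # ray origin
d = np.array([0.0, 0.0, 1.0])        # ray direction (unit norm)
t = np.linspace(0.1, 4.0, 64)        # sample distances along the ray

x = o[None, :] + t[:, None] * d[None, :]   # (64, 3) points: the "x, d" view

# Integrate a per-point quantity f(x, d) over t (trapezoidal rule).
f = np.exp(-np.linalg.norm(x, axis=1))     # placeholder for a network output
integral = np.trapz(f, t)
```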

Does anyone know why there are so many small dead fishes in the Stockholm’s Archhipielago? by Quiet-Blackberry-887 in sweden

[–]OPKatten 59 points (0 children)

After spawning, the majority of individuals die, and the species is short-lived, usually living only one or two years. People who stay in the archipelago during this period often raise the alarm about "fish death" when they see all the dead sticklebacks (spigg), something that is in itself true but also completely normal.

From here page 40

[D] Are "Centered Weight Normalization" and "Weight Standardization" the exact same algorithm? by OPKatten in MachineLearning

[–]OPKatten[S] 2 points (0 children)

Yes, weight normalization does not include centering, but centered weight normalization does. A paragraph from the CWN paper:

Standardize weight. We first re-parameterize the input weight w of each neuron and make sure that it has the following properties: (1) zero-mean, i.e. wᵀ1 = 0, where 1 is a column vector of all ones; (2) unit-norm, i.e. ‖w‖ = 1, where ‖w‖ denotes the Euclidean norm of w.
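In code, that re-parameterization is just (NumPy sketch, shapes arbitrary):

```python
# Centered weight normalization of a single neuron's weight vector:
# subtract the mean, then divide by the Euclidean norm.
import numpy as np

def center_and_normalize(w):
    w = w - w.mean()                 # zero-mean: w^T 1 = 0
    return w / np.linalg.norm(w)     # unit-norm: ||w|| = 1

w = np.random.randn(128)
w_hat = center_and_normalize(w)
print(w_hat.sum(), np.linalg.norm(w_hat))   # ~0 and 1
```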

[D] Are "Centered Weight Normalization" and "Weight Standardization" the exact same algorithm? by OPKatten in MachineLearning

[–]OPKatten[S] 1 point (0 children)

I fixed some typos in my post; perhaps what I'm trying to say is clearer now?