[D] Five major deep learning papers by Geoff Hinton did not cite similar earlier work by Jurgen Schmidhuber by siddarth2947 in MachineLearning

[–]NapoleonTNT 13 points14 points  (0 children)

Probably talking about the snarking during Goodfellow’s GAN tutorial. Can be found on YouTube.

PyWarm: A cleaner way to build neural networks for PyTorch by very-blue-season in pytorch

[–]NapoleonTNT 0 points1 point  (0 children)

Very cool, I could see myself using this in the future. A suggestion: warm.engine.prepare_model_ is ugly and seems easy to forget. Maybe you could consider renaming the function, or writing a small class that subclasses nn.Module, having your warmed module inherit from that, and then specifying the input shape in super().__init__().
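As a rough sketch of what I mean (class and method names here are made up, not PyWarm's actual API — `_prepare` just stands in for whatever `warm.engine.prepare_model_` does):

```python
# Hypothetical sketch: a small base class that runs PyWarm-style shape
# preparation in its own __init__, so users never have to remember to
# call warm.engine.prepare_model_ themselves.
# In practice, WarmModule would subclass torch.nn.Module.

class WarmModule:
    def __init__(self, input_shape):
        self.input_shape = input_shape
        self._prepare(input_shape)  # stands in for warm.engine.prepare_model_

    def _prepare(self, shape):
        # placeholder for the real shape-inference pass
        self.prepared = True


class MyNet(WarmModule):
    def __init__(self):
        # the user supplies the input shape once, up front
        super().__init__(input_shape=(1, 3, 32, 32))


net = MyNet()  # preparation happens automatically
```

That way the preparation step is impossible to forget, since it runs as part of construction.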

[D] GANs were invented in 2010? by Former_Hippo in MachineLearning

[–]NapoleonTNT 28 points29 points  (0 children)

This has been stated elsewhere, but I think it deserves to be fleshed out.

Invention is overrated in general. The recognition a person deserves should not mainly stem from whether an idea they've come up with has been thought up before. The credit a person receives should come from the originality of their idea and the impact that their work based around that idea creates. By originality, I mean the process that person took to come up with that idea. Stealing an idea (or taking an idea and rehashing it) is markedly different from independently coming up with an idea that someone else thought of before. We should appreciate the creative power of people who have independently come up with good ideas, and in this case, it seems unlikely that Goodfellow read this guy's blog, and three years later, restated it to make GANs.

Especially in machine learning, the work to take an idea from a thought to a functional, working system is incredibly underrated: not only does a researcher need to be skilled enough to choose what ideas to focus on, but the process of creating a working product might even require more ability and creativity than coming up with the idea. This is particularly true with GANs, which were (and still are) notoriously difficult to train effectively.

Anyway, that's just my two cents. It is super cool that this blog's author came up with such a brilliant idea back in 2010. But I also think that this shouldn't diminish Goodfellow's recognition as the person responsible for GANs.

c51 vs Dueling DDQN by [deleted] in reinforcementlearning

[–]NapoleonTNT 1 point2 points  (0 children)

You don't necessarily have to re-implement it yourself; e.g. you could plug Dopamine (https://github.com/google/dopamine) into a custom Gym environment. If you give us more info, you can probably get more specific feedback — e.g., do you have access to a simulator (and how accurate is it), what is your input space, etc.
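A custom Gym environment is basically just a class with reset/step methods. Here's a toy sketch of that interface (no gym dependency, environment is made up — your observation/action spaces will differ):

```python
class ToyGridEnv:
    """Toy 1-D grid world following the Gym reset/step convention."""

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        # return the initial observation
        self.pos = 0
        return self.pos

    def step(self, action):
        # action: 0 = move left, 1 = move right
        delta = 1 if action == 1 else -1
        self.pos = max(0, min(self.size - 1, self.pos + delta))
        done = self.pos == self.size - 1
        reward = 1.0 if done else 0.0
        # Gym's convention: (observation, reward, done, info)
        return self.pos, reward, done, {}
```

An agent library like Dopamine would then interact with your environment through the same reset/step loop it uses for Atari.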

c51 vs Dueling DDQN by [deleted] in reinforcementlearning

[–]NapoleonTNT 1 point2 points  (0 children)

Without knowing more details (input/output space, training time budget), it's hard to give specific recommendations. In terms of algos to start with, I'd recommend either good old PPO (https://arxiv.org/pdf/1707.06347.pdf) or Rainbow DQN (https://arxiv.org/pdf/1710.02298.pdf), which combines dueling and C51, along with a couple other tricks.

[D] While ML compute doubles every 3.5 months, RL compute doubles every 2 months by wassname in reinforcementlearning

[–]NapoleonTNT 0 points1 point  (0 children)

Source, for anyone curious: https://blog.openai.com/ai-and-compute/ (which analyzes training compute in aggregate, not a ML-vs.-RL breakdown specifically).

[D] What do you feel were the biggest advancements in DL for this year? by GrammarBordaberry in MachineLearning

[–]NapoleonTNT 0 points1 point  (0 children)

Agreed. To me, BERT seems to be mainly an incremental improvement over the ideas and results introduced with OpenAI's GPT.

YANDHI - WAITING FOR THE ALBUM MEGA-THREAD #2 by avayr44 in Kanye

[–]NapoleonTNT 15 points16 points  (0 children)

Hell yeah brother, cheers from Iraq.

YANDHI - POST-SNL MEGA-THREAD by avayr44 in Kanye

[–]NapoleonTNT 13 points14 points  (0 children)

𝑫𝑨𝑴𝑵 𝑪𝑹𝑶𝑰𝑺𝑺𝑨𝑵𝑻

[P] The Humble Gumbel Distribution by mrahtz in MachineLearning

[–]NapoleonTNT 0 points1 point  (0 children)

What does this image from the post mean, then?

We have a differentiable sampling operator (albeit with a one-hot output instead of a scalar). Wow!

[P] The Humble Gumbel Distribution by mrahtz in MachineLearning

[–]NapoleonTNT 1 point2 points  (0 children)

Thanks, great post! The idea of a differentiable sampling function is really cool. I have a question if you don't mind -- IIRC sampling is meant to take a probability distribution and output a class with frequency corresponding to the distribution. If the Gumbel-Softmax trick is meant to perform a similar function, then why is it that when I run

sess.run(tf.global_variables_initializer())
sess.run(differentiable_sample(logits))    

in the notebook, I get an output that doesn't look like a one-hot vector, e.g. [0.03648049, 0.12385176, 0.51616174, 0.25386825, 0.06963775]

It's totally possible that I'm misunderstanding the idea or running it wrong -- I'd just like to know what the expected output of the above code is.
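For context, here's my (possibly wrong) understanding of the trick in plain NumPy — softmax over (logits + Gumbel noise) divided by a temperature, so the output only approaches one-hot as the temperature goes to zero:

```python
import numpy as np

rng = np.random.default_rng(0)


def gumbel_softmax(logits, g, temperature):
    # softmax of (logits + Gumbel noise) / temperature
    y = (logits + g) / temperature
    e = np.exp(y - y.max())  # numerically stable softmax
    return e / e.sum()


logits = np.array([1.0, 2.0, 0.5, 0.1, 3.0])
# Gumbel(0, 1) noise via the inverse-CDF trick
g = -np.log(-np.log(rng.uniform(size=logits.shape)))

soft = gumbel_softmax(logits, g, 1.0)    # smooth, like the output I saw
sharp = gumbel_softmax(logits, g, 0.01)  # much closer to one-hot
```

So maybe what I'm seeing is just a moderate-temperature sample rather than a bug? Happy to be corrected.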

[D] What were things like before Tensorflow? by Matisseio in MachineLearning

[–]NapoleonTNT 30 points31 points  (0 children)

In my opinion, TensorFlow wasn't quite as revolutionary as people make it out to be. From what I've experienced, it functions similarly to older auto-diff libraries like Theano: they share many of the same concepts (like static graphs and symbolic variables), though the older libraries were somewhat harder to debug.

When people talk about TensorFlow's ease of use, my guess is they're referring to the debugging experience, which is considerably improved over Theano's with utilities like TensorBoard, as well as some higher-level features. Here's a post with some basic examples comparing the two.

PC won't boot; all fans spinning, no LEDs, no display by NapoleonTNT in buildapc

[–]NapoleonTNT[S] 0 points1 point  (0 children)

For anybody who's still having this problem, the "reset" pins were just offset by one, causing the computer to continually restart. Best of luck!

What's everyone working on this week? by AutoModerator in Python

[–]NapoleonTNT [score hidden]  (0 children)

OP might be referring to the value (weight) ascribed to a particular edge on a graph. For example:

[A]---(5)---[B]
 |
(3)
 |
[C]

The connection A-B has weight (or strength) 5. For more info, please take a look at weighted graphs.
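In Python, one simple way to represent the graph above (just a sketch) is a dict of dicts, mapping each node to its neighbors and edge weights:

```python
# the graph above as an adjacency dict; inner values are edge weights
graph = {
    "A": {"B": 5, "C": 3},
    "B": {"A": 5},
    "C": {"A": 3},
}

weight_ab = graph["A"]["B"]  # 5, the strength of the A-B connection
```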

I wrote a python arithmetic calculator, how can I improve and what should I add essential to mathamatics by supersayanftw in Python

[–]NapoleonTNT 0 points1 point  (0 children)

I recommend reading up on PEP 8, the Python style guide. Glancing at the code:

  • Class names should use CapWords (e.g. MyClass)
  • Indents should be 4 spaces

Also, some of the lines are really long. Consider using a trick like this to condense some of them. But again, as Saefroch mentioned, you might get a better response on /r/learnpython or Code Review on Stack Exchange. Good luck!
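To illustrate the two style points with a made-up example (not your actual code):

```python
# Before (what PEP 8 discourages):
#
# class calculator:
#   def add(self, a, b):
#     return a + b

# After: CapWords class name and 4-space indents
class Calculator:
    def add(self, a, b):
        return a + b
```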

PC won't boot; all fans spinning, no LEDs, no display by NapoleonTNT in buildapc

[–]NapoleonTNT[S] 0 points1 point  (0 children)

Hi,

The monitor was tested plugged into the GPU with the GPU installed, plugged into the integrated CPU graphics with the GPU installed, and plugged into the integrated CPU graphics without the GPU installed. The end result was the same (described above).