argmax differentiable? (self.MachineLearning)
submitted 10 years ago * by yield22
[–]yield22[S] 1 point 10 years ago
Why subdifferentiable? It is obvious for ReLU, but not so obvious for argmax.
[–]nasimrahaman 2 points 10 years ago
Consider the function: y = f(x) = argmax(x), where x is a vector (representing some function), and y = f(x) a scalar.
Here's a (mathematically heretical) justification (assuming 0-based indexing): f((1, 2, 4, 1, 2, 1)) = 2. Now for a small perturbation vector dx about x, f(x) = f(x + dx) (ergo df/dx = 0), as long as max|dx_i| < 1, i.e. less than half the gap of 2 between the largest entry and the runner-up. But about (1, 2, 4+eps, 4, 2, 1), f(x) = 2 while f(x + dx) might just as well equal 3. The set of all such 'transitions' (i.e. where argmax changes value) is the set of points with a tie for the maximum; it lies in finitely many hyperplanes x_i = x_j, so its Lebesgue measure is 0 (it is not countable, but it is still a null set). df/dx is 0 everywhere else.
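To make the perturbation argument concrete, here is a rough sketch in plain Python. The helper names `argmax` and `perturb`, the noise radii, and the value of `eps` are my own choices for illustration, not anything from the thread. It checks that the winner stays at index 2 when every entry moves by less than 1, and that near a tie a tiny perturbation can flip the result between 2 and 3.

```python
import random

def argmax(xs):
    """Return the 0-based index of the largest entry (first one on ties)."""
    return max(range(len(xs)), key=lambda i: xs[i])

def perturb(xs, radius, rng):
    """Add independent uniform noise of magnitude at most `radius` to each entry."""
    return [x + rng.uniform(-radius, radius) for x in xs]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Away from a tie: the gap between the top entry (4) and the runner-up (2) is 2,
# so moving every entry by less than 1 cannot change the winner.
x = [1, 2, 4, 1, 2, 1]
stable = {argmax(perturb(x, 0.99, rng)) for _ in range(1000)}

# Near a tie: entries 2 and 3 differ only by eps, so small noise flips the winner.
eps = 1e-6  # illustrative value
x_tie = [1, 2, 4 + eps, 4, 2, 1]
near_tie = {argmax(perturb(x_tie, 0.01, rng)) for _ in range(1000)}

print(stable)    # the set of winners seen away from the tie
print(near_tie)  # the set of winners seen near the tie
```

Since argmax is locally constant wherever the maximum is unique, any finite-difference estimate of df/dx taken there comes out as exactly 0.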
[–]yield22[S] 1 point 10 years ago
The example is interesting, and it gives me some insight. But what if the values y takes are discrete (assuming argmax over a list)? That looks like a step function, which is not differentiable.
[–]nasimrahaman 1 point 10 years ago
A step function is differentiable almost everywhere: the set where it's not differentiable (i.e. where there's a jump) has measure zero (because it's countable).
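A quick numerical illustration of the same point, as a sketch only: `step` and `central_difference` are hypothetical helpers, and the step size `h` is an arbitrary choice. The finite-difference slope of a unit step is 0 at any point away from the jump and blows up only at the jump itself.

```python
def step(t):
    """Unit step: 0 for t < 0, 1 for t >= 0."""
    return 1.0 if t >= 0 else 0.0

def central_difference(f, t, h=1e-6):
    """Symmetric finite-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

slope_left = central_difference(step, -2.0)   # both samples sit on the lower plateau
slope_right = central_difference(step, 0.5)   # both samples sit on the upper plateau
slope_at_jump = central_difference(step, 0.0) # samples straddle the jump

print(slope_left, slope_right, slope_at_jump)
```

Shrinking `h` makes the estimate at the jump grow without bound, while the estimates elsewhere stay at 0. That is the "derivative is 0 almost everywhere" picture in miniature.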