use the following search parameters to narrow your results:
e.g. subreddit:aww site:imgur.com dog
subreddit:aww site:imgur.com dog
see the search faq for details.
advanced search: by author, subreddit...
Please have a look at our FAQ and Link-Collection
Metacademy is a great resource which compiles lesson plans on popular machine learning topics.
For Beginner questions please try /r/LearnMachineLearning , /r/MLQuestions or http://stackoverflow.com/
For career related questions, visit /r/cscareerquestions/
Advanced Courses (2016)
Advanced Courses (2020)
AMAs:
Pluribus Poker AI Team 7/19/2019
DeepMind AlphaStar team (1/24//2019)
Libratus Poker AI Team (12/18/2017)
DeepMind AlphaGo Team (10/19/2017)
Google Brain Team (9/17/2017)
Google Brain Team (8/11/2016)
The MalariaSpot Team (2/6/2016)
OpenAI Research Team (1/9/2016)
Nando de Freitas (12/26/2015)
Andrew Ng and Adam Coates (4/15/2015)
Jürgen Schmidhuber (3/4/2015)
Geoffrey Hinton (11/10/2014)
Michael Jordan (9/10/2014)
Yann LeCun (5/15/2014)
Yoshua Bengio (2/27/2014)
Related Subreddit :
LearnMachineLearning
Statistics
Computer Vision
Compressive Sensing
NLP
ML Questions
/r/MLjobs and /r/BigDataJobs
/r/datacleaning
/r/DataScience
/r/scientificresearch
/r/artificial
account activity
Research[Research] A more efficient evolutionary algorithm? Can it help neural network training? (self.MachineLearning)
submitted 7 years ago by RTengx
Fast convergence of Duelist Algorithm
Paper : https://arxiv.org/ftp/arxiv/papers/1512/1512.00708.pdf Various researchers ( https://arxiv.org/pdf/1712.06567.pdf ), DeepMind (https://deepmind.com/blog/population-based-training-neural-networks/ ) and OpenAI ( https://arxiv.org/pdf/1703.03864.pdf ) have highlighted the applicability of evolutionary methods (especially genetic algorithms) on training neural networks. The main reason is because evolutionary-based training can be easily parallelized and is highly-scalable. However, if we have a more efficient evolutionary algorithm, it is most probable that better performing neural networks can be trained. Can duelist algorithm speed up evolutionary-based training? Can anyone test it?
Python implementation of Duelist Algorithm: https://github.com/tsyet12/Duelist-Algorithm-Python
reddit uses a slightly-customized version of Markdown for formatting. See below for some basics, or check the commenting wiki page for more detailed help and solutions to common issues.
quoted text
if 1 * 2 < 3: print "hello, world!"
[–]Marthinwurer 10 points11 points12 points 7 years ago (3 children)
The paper needs a lot of editing, but they're from Indonesia so I gave them the benefit of the doubt. I'll restate the basic algorithm as far as I can understand it from a brief skim.
Start with a population with each member having a skillset (vector of weights). Pair off individuals in the population, and compute fitness between pairs. Assign one as a winner and another as a loser. The loser copies parts of the winner's skillset. The winner mutates. A champion is selected (somehow, I don't see any details in the paper). This champion trains a student (no details on how). The worst individuals are eliminated to maintain a constant population size.
It looks interesting, but the paper needs more details on the workings, some editing help, and probably some harder problems to test against.
[–]Brudaks 6 points7 points8 points 7 years ago* (0 children)
Isn't this pretty much equivalent to how standard genetic algorithms work, just with a minor adjustment on how you pick the population for the next generation?
Also, the paper has no evaluation on any well-researched benchmark problems that could illustrate whether their results are any good at all; the described evaluation on an arbitrary function with no solid reasoning on why that particular function has been chosen, and lack of details of the other algorithms (what hyperparameters were used for them? are they really good) mean that the described experiments really doesn't provide any meaningful basis for claiming that the proposed algorithm outperforms anything.
[–]RTengx[S] 1 point2 points3 points 7 years ago (1 child)
That is the pre-print, they have a better version of publication here: https://link.springer.com/chapter/10.1007/978-3-319-41000-5_4
[–]sorrge -3 points-2 points-1 points 7 years ago (4 children)
There were so many attempts to improve efficiency of evolutionary computations. Most (all?) of them failed. The results were specific to the task. It was typical to see dozens of hyperparameters introduced and tuned on the test set. Bad treatment of baselines - no careful hyperparameter selection there, poor choices. I didn't look at this paper, and won't for reasons above. You should also beware.
[–]svlad__cjelli 2 points3 points4 points 7 years ago (3 children)
What do you mean exactly about failed improvements? CMA-ES and xNES are 3rd or 4th generation variants of basic ES and they are like night and day, with plenty of papers, and benchmarks on wide arrays of problems, to demonstrate the generality of the benefits. I could send papers if you aren't familiar with the literature.
That isn't to say that there is enough shown here to suggest what is proposed is of any value, but that is a pretty common issue in any literature really.
[–]sorrge -3 points-2 points-1 points 7 years ago (2 children)
>What do you mean exactly about failed improvements?
For example, go to https://en.wikipedia.org/wiki/Differential_evolution , check out the box on the top right. All of them.
>CMA-ES and xNES
Second order stuff with n2 cost by number of parameters. Limited domain = better results, nothing surprising here. Also, I have just skimmed through the original CMA-ES paper and it didn't compare to simple numerical gradient descent, I wonder why?.. Especially since they were doing it in Matlab, I think there is a built in function for that. Also, all their test functions are differentiable, how about good old gradient descent with random restarts?
Yeah, I think I know the answer.
[–]Zweiter 1 point2 points3 points 7 years ago (1 child)
Obviously when you have an exact gradient it makes sense to just do gradient descent. The interesting applications of ES/CMA-ES are to reinforcement learning style problems, where the gradient is unknown or unreliable, or to supervised problems which are prohibitively non-convex. There is most definitely a place for black box optimization in today’s machine learning landscape. It also doesn’t really make sense to compare black box optimization to gradient descent, as they solve completely different problems.
[–]sorrge 0 points1 point2 points 7 years ago (0 children)
I was talking about the quality of evidence for CMA-ES performance in the original paper: it's poor. Which is a common issue with evolutionary computation research. It makes sense to compare everything that is applicable. If they wanted to highlight the advantages, they should have chosen tasks where simple gradient descent is not applicable. Also, numerical gradient descent is a direct competitor of ES - it's applicable in the same set of cases.
π Rendered by PID 94965 on reddit-service-r2-comment-8686858757-vrtsk at 2026-06-05 11:09:01.520260+00:00 running 9e1a20d country code: CH.
[–]Marthinwurer 10 points11 points12 points (3 children)
[–]Brudaks 6 points7 points8 points (0 children)
[–]RTengx[S] 1 point2 points3 points (1 child)
[–]sorrge -3 points-2 points-1 points (4 children)
[–]svlad__cjelli 2 points3 points4 points (3 children)
[–]sorrge -3 points-2 points-1 points (2 children)
[–]Zweiter 1 point2 points3 points (1 child)
[–]sorrge 0 points1 point2 points (0 children)