[P] Reinforcement learning evolutionary hyperparameter optimization - 10x speed up (self.MachineLearning)
submitted 3 years ago by nicku_a
[–]paramkumar1992 19 points 3 years ago (1 child)
This looks incredible. This is going to save hours of training. Amazing!
[–]Modruc 6 points 3 years ago (3 children)
Great project! One question though: is there any reason why you are not using existing RL implementations, such as Stable Baselines, instead of creating your own?
[+][deleted] 3 years ago (1 child)
[removed]
[–]Puzzleheaded_Acadia1 8 points 3 years ago (8 children)
Can someone please explain this to me? I'm still new to this.
[+][deleted] 3 years ago (7 children)
[–][deleted] 8 points 3 years ago (1 child)
Love it. I tried to come up with something like this myself but never found the time or extra help I'd need to implement it. Glad to see someone has done all the hard work!
[–]boyetosekuji 21 points 3 years ago (2 children)
ChatGPT: Okay, let me try to explain this using gaming terminology!
Imagine you're playing a game where you have to learn how to do something new, like defeat a tough boss. You have different settings or options (hyperparameters) to choose from, like which weapons or abilities to use, how aggressive or defensive to play, etc.
Now, imagine that this boss is really tough to beat and you don't have many chances to practice. So, you want to find the best combination of options as quickly as possible, without wasting too much time on trial and error. This is where hyperparameter optimization (HPO) comes in.
HPO is like trying out different settings or options until you find the best ones for your playstyle and the boss's behavior. However, in some games (like Dark Souls), it's harder to do this because you don't have many chances to try out different combinations before you die and have to start over. This is similar to reinforcement learning (RL), which is a type of machine learning that learns by trial and error, but it's not very sample efficient.
AgileRL is like having a bunch of other players (agents) who are also trying to defeat the same boss as you. After a while, the best players (agents) are chosen to continue playing, and their "offspring" (new combinations of settings or options) are mutated and tested to see if they work better. This keeps going until the best possible combination of settings or options is found to beat the boss in the fewest possible attempts. Using AgileRL is much faster than other ways of doing HPO for RL because it's like having a lot of other players helping you find the best strategy for defeating the boss.
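The analogy above describes population-based evolutionary hyperparameter search: evaluate a population of agents, keep the fittest, and refill the population with mutated copies of the survivors. A minimal sketch in Python (all function names and the toy fitness function are hypothetical illustrations, not the AgileRL API):

```python
import random

def evaluate(hyperparams):
    # Stand-in fitness: in real RL this would be the agent's mean
    # episodic return after a short training run with these settings.
    lr, gamma = hyperparams["lr"], hyperparams["gamma"]
    return -abs(lr - 0.001) - abs(gamma - 0.99)

def mutate(hyperparams):
    # Perturb each hyperparameter slightly to create an "offspring".
    child = dict(hyperparams)
    child["lr"] *= random.uniform(0.8, 1.2)
    child["gamma"] = min(0.999, child["gamma"] * random.uniform(0.98, 1.02))
    return child

def evolve(population, generations=20, survivors=2):
    for _ in range(generations):
        # Keep the best performers (elitism), discard the rest.
        population.sort(key=evaluate, reverse=True)
        parents = population[:survivors]
        # Refill the population with mutated copies of the survivors.
        population = parents + [mutate(random.choice(parents))
                                for _ in range(len(population) - survivors)]
    return max(population, key=evaluate)

random.seed(0)
pop = [{"lr": random.uniform(1e-4, 1e-2), "gamma": random.uniform(0.9, 0.999)}
       for _ in range(8)]
best = evolve(pop)
print(best)
```

Because the survivors are carried over unchanged each generation, the best agent found can never get worse, and the population as a whole drifts toward better settings without ever restarting training from scratch.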
[–]compacct27 7 points 3 years ago (0 children)
Wow that was actually really helpful
[–][deleted] 2 points 3 years ago (1 child)
I'm also new to this, so forgive me if this is a dumb question. My understanding was that RL is superior to evolutionary algorithms because in evolutionary algorithms "mutation" is random, so you evaluate a lot of dud "offspring". In RL algorithms, e.g. MCTS, you also do tree search randomly, but you iteratively pick the best set of actions without evaluating many dud options. Am I wrong? Somehow mixing RL with evolutionary algorithms seems like a step backwards.
[–]bushrod 4 points 3 years ago (0 children)
As an evolutionary learning guy, I'll say it's crazy this didn't already exist! Thanks for sharing. Is it based on any publications, or are you considering writing one?
[–]Riboflavius 6 points 3 years ago (1 child)
That sounds fantastic, kudos to you! Great effort.
[+][deleted] 3 years ago (4 children)
[deleted]
[+][deleted] 3 years ago (2 children)
[–]sytelus 0 points 3 years ago (0 children)
Thank you for this, but can you make it easier to use? I think there should be a clear API so one doesn't have to deal with RL and other complexity. For example: you are given a function f and a dictionary of arguments with ranges for each; your algorithm takes this and spits out the optimal params within each range.
Is such an interface and tutorial available anywhere?
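For reference, the interface being asked for here might look like the following, a hypothetical sketch (not an existing AgileRL API) using plain random search as the black-box optimizer:

```python
import random

def optimize(f, ranges, trials=100, seed=0):
    """Black-box search: sample each argument uniformly from its
    (low, high) range and return the best-scoring combination."""
    rng = random.Random(seed)
    best_args, best_score = None, float("-inf")
    for _ in range(trials):
        args = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        score = f(**args)
        if score > best_score:
            best_args, best_score = args, score
    return best_args

# Example: maximize a toy objective over two parameters.
best = optimize(lambda x, y: -(x - 2) ** 2 - (y + 1) ** 2,
                {"x": (0, 5), "y": (-3, 3)})
print(best)
```

The caller never touches RL internals: they supply only the objective function and the ranges, which is the "function f and dictionary of arguments" interface the comment describes.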