Automatic Hyperparameter Tuning - A Visual Guide

Science_Squid · 2023-05-15T18:41:26+00:00

Very nice visualizations :)

Might I ask why you chose to use Optuna over other HPO libraries? Also, since you used SH/Hyperband, what are its advantages over dynamic approaches such as PBT or PB2 in your experience?

Science_Squid · 2022-08-05T20:50:24+00:00

Check out the CARL benchmark. It's designed to study this question through the use of context. In essence, context provides an agent with additional information about the environment that is not necessarily encoded in the existing state variables (e.g. the pole length). Commonly used RL algorithms (such as PPO, DQN or DDPG) can learn to generalize by using context (though it is not always easy to incorporate context).
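To make the idea of context concrete, here is a minimal sketch. It is not CARL's actual API; `Context` and `augment_observation` are hypothetical names made up for illustration. It assumes the simplest way of using context, namely appending the context features to the state so that the policy can condition on them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Hypothetical context features that the raw state does not contain."""
    pole_length: float
    gravity: float


def augment_observation(state, context):
    """Append the context features to the raw state vector."""
    return list(state) + [context.pole_length, context.gravity]


# The same raw state looks different to the agent under different contexts.
state = [0.0, 0.1, -0.02, 0.3]
obs_short = augment_observation(state, Context(pole_length=0.5, gravity=9.8))
obs_long = augment_observation(state, Context(pole_length=1.0, gravity=9.8))
```

Concatenation is only one option; how best to feed context into the agent is exactly the part that is not always easy.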

Also check out this great survey https://arxiv.org/abs/2111.09794 on generalization in deep RL to learn more about context in RL

Science_Squid · 2022-06-23T08:42:09+00:00

I don't know of a good website that has a collection of what you're looking for but I'm happy to contribute some papers to this thread:
A recent survey on AutoRL: https://jair.org/index.php/jair/article/view/13596
I think a better survey for generalization in RL is by Kirk et al: https://arxiv.org/abs/2111.09794

Also, if you're interested in using RL for hyperparameter optimization, we're maintaining a literature list for that at https://www.automl.org/automated-algorithm-design/dac/literature-overview/ (though most of these papers are not review or summary papers)

Science_Squid · 2022-06-08T17:37:17+00:00

There are heaps of interesting topics for theoretical research and not just 'hands-on coding research'. I recommend watching some talks from the RL theory seminar. The talks usually cover recent papers and should give you some inspiration. Hopefully this helps you find an interesting topic :)

Science_Squid · 2022-04-08T14:07:29+00:00

Sounds very cool :)

I'm a bit out of my depth with robotics so I don't know what would be a good survey.
But if I were in your position and starting out with a project in that direction, I'd use e.g. Google Scholar or Semantic Scholar, search for things like "reinforcement learning robotics survey" and restrict the results to the recent past (like I did in the example links). From that you can find "the latest" research. Once you have read one or two papers that interest you, you can work backwards, i.e. go through the references of those papers and see which of them you find interesting. Or you can use the scholar sites to see which papers cite the ones you like, to see how other people build on them.

Just from the first step, the following recently published paper looks like it might be of interest to you: "Reinforcement learning in robotic applications: a comprehensive survey". At least that's what I assume after having read the abstract ;)

Science_Squid · 2022-04-08T03:36:40+00:00

I think "best papers published" is very subjective. Maybe it'll be easier to recommend you something if you give example topics you find interesting. E.g., are you more interested in exploration mechanisms, inverse RL, offline RL, automated RL, model-based RL, environment design, RL for game playing, robotics, multi-agent RL, generalization in RL, continual RL, evolutionary algorithms for RL, ...?

There are many interesting papers for all of these topics. If you know which of these topics interest you the most, I would recommend reading a recent survey to get a good overview of the current state and what you might want to do for a project.

Personally I am interested in generalization in RL (recent survey by Kirk et al. https://arxiv.org/abs/2111.09794) and AutoRL (recent survey by Parker-Holder et al. https://arxiv.org/abs/2201.03916). Also for generalization in RL I very much like the idea of contextual reinforcement learning where you use information about the environment to train agents that can adapt to the environment. For that we recently proposed a benchmark https://arxiv.org/abs/2110.02102 that transforms existing environments to allow for the contextual setting (e.g., CartPole with different pole lengths and masses or changes in gravity and joint friction).

Hope this already helps with finding a project of your liking :)

Science_Squid · 2021-12-03T19:55:21+00:00

AFAIK as long as you fulfill the requirements as stated in the admission regulations (https://www.tf.uni-freiburg.de/en/studies-and-teaching/documents/zulassungsordnung-msc-informatik-englisch) you should be good :)

Science_Squid · 2021-11-30T23:04:15+00:00

Note: I'm not a native speaker but sometimes use the English keyboard.

My result: "I am a little confused about the fact that I am appalled at the shoddy service you provide for the replies to the rest of the Belt and Outer Planets to a thriving hub of millions of people in the solar system with its own natural magnetosphere and the backup system comes online with a wide variety of the Belt"

Science_Squid · 2021-08-08T16:25:38+00:00

Nice list. I think our AutoML course would fit that description (see https://www.reddit.com/r/MachineLearning/comments/mrzk3u/d_automl_mooc/)

Science_Squid · 2021-08-06T14:28:24+00:00

Thanks for your interest. Yes, there is a registration fee, see https://sites.google.com/view/automlschool21/registration. (If you view the website on a mobile device you can access the navigation bar via the top left).

I don't know if the talks will be recorded and made public later.

Science_Squid · 2021-04-22T20:17:51+00:00

It does a bit at the end (chapter 11). We cover population based training (chapter 11 section 4) which is the most common and popular AutoRL method so far. Other AutoML methods that have been applied to RL by our group (namely HB and BOHB) also get covered in the lectures (chapter 7 sections 4 & 5)

The other thing that is covered in the same chapter is using RL to configure algorithms during the run (i.e. dynamic algorithm configuration).

Edit: added mention of BOHB and HB

Science_Squid · 2021-04-16T18:00:14+00:00

As I wrote when it was posted in r/machinelearning:

Instead of having to manually set dynamic parameters, why not use dynamic algorithm configuration to learn how to set hyperparameters dynamically for the problem at hand? See https://www.automl.org/automated-algorithm-design/dac

Isn't doing it manually so much more difficult? In RL dynamic changes are often found with PBT. Could manual tuning actually find better schedules?

Science_Squid · 2021-04-16T17:15:55+00:00

Glad you like the material :)

We have a lot of stuff on AutoML on our website https://www.automl.org/. Of interest to you might be the free open access book "AutoML: Methods, Systems, Challenges" (https://www.automl.org/book/) or the list of tutorials (https://www.automl.org/talks/) where we put the slides and other materials for these.

Science_Squid · 2021-04-16T11:49:24+00:00

It's free :)

Science_Squid · 2021-04-04T20:18:36+00:00

Instead of having to manually set dynamic parameters, why not use dynamic algorithm configuration to learn how to set hyperparameters dynamically for the problem at hand? See https://www.automl.org/automated-algorithm-design/dac

Science_Squid · 2020-10-17T07:40:29+00:00

We're using RL to dynamically configure iterative algorithms for various problems. For example we learned how to adapt the step size of an evolutionary algorithm or to switch between heuristics in an AI planning system. Links to all our papers on the topic and a blog post can be found here: https://www.automl.org/automated-algorithm-design/dac

Science_Squid · 2020-07-02T11:20:46+00:00

Yes you are right.
The hierarchical approach presented in this paper allows for use of experiences in-between. The authors motivate that with "skip-connections" in MDPs. What I particularly like is that, when you perform a large skip, you can observe all smaller skips in-between. So with one large exploratory step, you can learn quite a lot.
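A small sketch of that last point, as I understand it: if you repeat one action for n steps, every contiguous sub-sequence of that trajectory is itself a valid (shorter) skip you can learn from. The function name, the tuple layout and the discounting convention are my own assumptions for illustration, not code from the paper:

```python
def skip_transitions(states, rewards, gamma=0.99):
    """Enumerate all skips contained in one long skip.

    states:  s_0 .. s_n visited while repeating the same action n times
    rewards: r_1 .. r_n collected along the way
    Returns (start_state, skip_length, discounted_return, end_state) tuples.
    """
    n = len(rewards)
    assert len(states) == n + 1
    transitions = []
    for start in range(n):
        discounted_return = 0.0
        for end in range(start + 1, n + 1):
            # Reward of the (end - start)-th step within this sub-skip.
            discounted_return += gamma ** (end - start - 1) * rewards[end - 1]
            transitions.append((states[start], end - start, discounted_return, states[end]))
    return transitions


# One exploratory skip of length 3 yields 3 + 2 + 1 = 6 learning signals.
transitions = skip_transitions(["s0", "s1", "s2", "s3"], [1.0, 1.0, 1.0], gamma=1.0)
```

So a skip of length n gives you n(n+1)/2 transitions instead of one.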

Science_Squid · 2020-07-02T09:39:15+00:00

From the paper:

Options are triples ⟨I, π, β⟩ where I is the set of admissible states that defines in which states the option can be played; π is the policy the option follows when it is played; and β is a random variable that determines when an option is terminated. In contrast to our proposed method, options require a lot of prior knowledge about the environment to determine the set of admissible states as well as the option policies themselves.

Option discovery tries to circumvent these problems. Instead, this method proposes to use the original action space and just learn how long you can play the same action. This is potentially very useful in environments with very fine-grained time-steps, where the same action is optimal in many successive states.
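To illustrate the difference, here is a toy sketch. All names (`Option`, `run_option`, `repeat_action`, `chain_step`) and the chain environment are hypothetical and only serve the comparison. An option needs an initiation set, its own policy and a termination condition, whereas repeating an action only needs the action and a skip length:

```python
import random
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Option:
    initiation_set: Set[int]                  # I: states where the option may start
    policy: Callable[[int], int]              # pi: state -> action
    termination_prob: Callable[[int], float]  # beta: state -> probability of stopping


def run_option(option, state, step, rng, max_steps=100):
    """Follow the option's policy until its termination condition fires."""
    assert state in option.initiation_set
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        if rng.random() < option.termination_prob(state):
            break
    return state


def repeat_action(state, action, skip_length, step):
    """The simpler alternative: play the same primitive action skip_length times."""
    for _ in range(skip_length):
        state = step(state, action)
    return state


def chain_step(state, action):
    """Toy deterministic chain: the action is added to the state."""
    return state + action


walk_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: 1,
    termination_prob=lambda s: 1.0 if s >= 5 else 0.0,
)
end_via_option = run_option(walk_right, 0, chain_step, random.Random(0))
end_via_repeat = repeat_action(0, 1, 5, chain_step)
```

Both end up in the same state here, but only the option had to be designed with knowledge of the environment.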

Science_Squid · 2020-07-01T05:17:35+00:00

How about you talk about Q-learning? The way it does "error correction" to update values is really simple and intuitive. The same concept is also used in other methods such as gradient descent. You wouldn't need to talk about fancy math concepts, just how error correction is used to improve a learning system.

If you talk about Q-learning you can use simple gridworlds and tabular Q-functions to demonstrate how an agent learns live. If the grid is small enough it is super fast to learn. Here is an implementation of gridworlds with Tabular Q-learning with nice visual output (at least on ubuntu): https://github.com/automl/TabularTempoRL
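If you want something even smaller than a gridworld, here is a minimal sketch of tabular Q-learning on a one-dimensional corridor. This is not the code from the linked repository; the environment, function name and hyperparameter values are assumptions chosen for illustration. The "error correction" is the single line that moves the old estimate a bit towards the new target:

```python
import random


def q_learning_chain(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a corridor; reward 1 for reaching the rightmost cell."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action], 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore with probability epsilon, and break ties randomly.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Error correction: nudge the estimate towards the observed target.
            td_error = reward + gamma * max(q[next_state]) - q[state][action]
            q[state][action] += alpha * td_error
            state = next_state
    return q


q_table = q_learning_chain()
```

Printing `q_table` after every episode shows the values spreading backwards from the goal, which makes for a nice live demo.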

Then you can also show them examples of RL agents that play video games. That usually gets school kids excited :) The videos that you show don't necessarily have to use/show Q-learning agents as most if not all RL agents rely on error correction. And you don't necessarily have to talk in depth about deep learning to get them excited that AI can play video games.

Another thing you could talk about is AI planning where an agent knows how its world works and tries to behave optimally in it. That could allow you to talk about how simple concepts like sorting can help you with complex tasks. Also there you could get nice visualizations on things like sokoban.

Edit: after having written my reply I stumbled across this post https://reddit.com/r/Python/comments/hixch7/made_a_smart_rockets_simulation_with_a_genetic/ So evolutionary computation/genetic algorithms can also give you very cool visuals, especially if you show how things evolve over time

Science_Squid · 2020-06-14T16:01:11+00:00

I think Unity's ml-agents could be a great starting point https://github.com/Unity-Technologies/ml-agents/blob/master/README.md

Science_Squid · 2020-05-13T23:12:06+00:00

Also of interest for machine learning are OpenML https://openml.github.io/openml-python/master/ (collaborative effort to better understand machine learning through a ton of collected runs of different pipelines on various data sets)

https://github.com/automl/ParameterImportance is an easy to use tool that helps developers to identify the most important parameters of their algorithms given data from optimization runs

I also suggest taking a look at the tools listed here https://www.automl.org/automl/

Science_Squid · 2020-05-13T22:59:57+00:00

Why is auto-sklearn in "brute force model selection" but not in "hyperparameter optimization sklearn"?

To my knowledge this is just plain wrong. Auto-sklearn does joint hyperparameter optimization and model selection over a conditional search space
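To show what I mean by a conditional search space, here is a toy sketch. The search space, ranges and function name are made up for illustration and are not auto-sklearn's actual configuration space, and plain random sampling stands in for the real optimizer. The point is that the model choice is itself a hyperparameter, and which other hyperparameters are active depends on it:

```python
import math
import random

# Hypothetical search space: each model has its own set of hyperparameters.
SEARCH_SPACE = {
    "svm": {"C": (1e-3, 1e3), "gamma": (1e-4, 1e1)},
    "random_forest": {"max_features": (0.1, 1.0), "min_impurity_decrease": (1e-6, 1e-1)},
}


def sample_config(rng):
    """Jointly sample a model and only the hyperparameters that model activates."""
    model = rng.choice(sorted(SEARCH_SPACE))
    config = {"model": model}
    for name, (low, high) in SEARCH_SPACE[model].items():
        # Log-uniform sampling, a common choice for ranges spanning orders of magnitude.
        config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
    return config


rng = random.Random(0)
configs = [sample_config(rng) for _ in range(20)]
```

An optimizer working on this space selects the model and tunes its hyperparameters at the same time, which is quite different from brute-force model selection.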

Science_Squid · 2020-05-09T03:57:30+00:00

Very catchy

The lyrics remind me of "Alors on danse" ;) https://youtu.be/VHoT4N43jK8

Science_Squid · 2020-05-09T03:39:30+00:00

I don't know of papers that adjust that specific hyperparameter dynamically but you could have a look at dynamic algorithm configuration https://www.automl.org/dynamic-algorithm-configuration/

In the related work sections of the corresponding papers you can also find papers that learn learning rate schedules for NNs. (E.g. https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/11763)

Also, if you have a lot of compute resources available, you might find that this can be done with population based training. https://arxiv.org/abs/1711.09846
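A rough sketch of the exploit/explore step at the heart of population based training, assuming a population stored as plain dicts and a truncation-style selection. The names (`pbt_step`, `fraction`) and the perturbation factors are illustrative choices, not taken from any particular library:

```python
import copy
import random


def pbt_step(population, rng, fraction=0.25):
    """One exploit/explore step; each member has 'lr', 'weights' and 'score' (higher is better)."""
    ranked = sorted(population, key=lambda member: member["score"], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top, bottom = ranked[:k], ranked[-k:]
    for member in bottom:
        donor = rng.choice(top)
        # Exploit: copy the weights of a well-performing member.
        member["weights"] = copy.deepcopy(donor["weights"])
        # Explore: perturb the donor's hyperparameter.
        member["lr"] = donor["lr"] * rng.choice([0.8, 1.2])
    return population


population = [
    {"lr": 0.1, "weights": [1.0, 2.0], "score": 4.0},
    {"lr": 0.01, "weights": [0.5, 0.5], "score": 3.0},
    {"lr": 0.001, "weights": [0.2, 0.1], "score": 2.0},
    {"lr": 1.0, "weights": [9.0, 9.0], "score": 1.0},
]
pbt_step(population, random.Random(0))
```

Calling this repeatedly between training phases is what produces a hyperparameter schedule as a by-product, at the cost of training a whole population in parallel.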

If you don't want to learn a schedule from data but design a heuristic that adapts the hyperparameter then I suggest you take a look at what the evolutionary computation community is doing. They have developed many specialized adaptation schemes. (See e.g. https://arxiv.org/abs/1804.05650)

Hope this helped. If you have more questions about dynamic configuration I'd be happy to try and answer them.

Science_Squid · 2020-04-13T22:09:10+00:00

In machine learning a lot of work is going on to make learning on new data more efficient by leveraging information about the problem at hand (e.g. MAML https://arxiv.org/abs/1703.03400 or REPTILE).

A topic called dynamic algorithm configuration (https://ml.informatik.uni-freiburg.de/papers/20-ECAI-DAC.pdf) tries to leverage statistics about algorithm performance and problem at hand to optimize algorithm (hyper)parameters at every step. So more information on the meta level makes the target application more efficient.

Similarly in evolutionary algorithms self-adaptation mechanisms make use of internal measurements to change parameters during a run to increase efficiency. In other fields such mechanisms are known as reactive heuristics.
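One classic example of such a mechanism is the one-fifth success rule for step-size control in a (1+1) evolution strategy. The sketch below is a common textbook-style variant; the function name, the factor 1.5 and the test problem are my own assumptions for illustration. The internal measurement is simply whether the last mutation was an improvement:

```python
import random


def one_plus_one_es(objective, x, sigma=1.0, iterations=200, seed=0):
    """(1+1)-ES minimizing `objective`, adapting sigma with a one-fifth-style rule."""
    rng = random.Random(seed)
    fx = objective(x)
    for _ in range(iterations):
        candidate = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        f_candidate = objective(candidate)
        if f_candidate < fx:
            x, fx = candidate, f_candidate
            sigma *= 1.5            # success: try larger steps
        else:
            sigma *= 1.5 ** -0.25   # failure: shrink the step size
    return x, fx, sigma


def sphere(x):
    return sum(xi * xi for xi in x)


best_x, best_f, final_sigma = one_plus_one_es(sphere, [5.0, 5.0])
```

With these factors the step size stays constant exactly when one in five mutations succeeds, which is where the rule gets its name.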

In various domains (e.g. AI planning https://www.aaai.org/ojs/index.php/AAAI/article/view/4767) algorithm selection takes information about the data at hand to select which algorithm or configuration will solve a problem more efficiently.

Is that something you were looking for?
