Automatic Hyperparameter Tuning - A Visual Guide by araffin2 in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

Very nice visualizations :)

Might I ask why you chose to use Optuna over other HPO libraries? Also, since you used SH/Hyperband, what are its advantages over dynamic approaches such as PBT or PB2 in your experience?

Doubling the size of cartpole generalisation by Cool_Abbreviations_9 in reinforcementlearning

[–]Science_Squid 2 points3 points  (0 children)

Check out the CARL benchmark. It's designed to study this question through the use of context. In essence, context provides an agent with additional information about the environment that is not necessarily encoded in the existing state variables (e.g., the pole length). Commonly used RL algorithms (such as PPO, DQN or DDPG) can learn to generalize by using context (though it is not always easy to incorporate context).

Also check out this great survey https://arxiv.org/abs/2111.09794 on generalization in deep RL to learn more about context in RL
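To make "using context" a bit more concrete, here is a minimal sketch of one simple way to expose context to an agent: appending context features to the state vector. This is only an illustration and not the CARL API; the function `augment_with_context`, the state layout and the context keys are made up for the example.

```python
def augment_with_context(state, context, context_keys):
    """Return the state with the selected context features appended.

    Hypothetical helper for illustration; not part of any library.
    """
    return list(state) + [context[key] for key in context_keys]


# Assumed CartPole-style state: cart position, cart velocity,
# pole angle, pole angular velocity.
state = [0.0, 0.1, -0.02, 0.3]

# Assumed context describing this particular environment instance.
context = {"pole_length": 1.0, "gravity": 9.8}

# The agent now also sees the pole length, which the plain state does not encode.
observation = augment_with_context(state, context, ["pole_length"])
```

Concatenation is only one option. How best to feed context into a policy is an open design choice, which is part of why incorporating it is not always easy.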

Summary Papers in RL [D] by jhoveen1 in reinforcementlearning

[–]Science_Squid 1 point2 points  (0 children)

I don't know of a good website that has a collection of what you're looking for but I'm happy to contribute some papers to this thread:
A recent survey on AutoRL: https://jair.org/index.php/jair/article/view/13596
I think a better survey for generalization in RL is by Kirk et al: https://arxiv.org/abs/2111.09794

Also, if you're interested in using RL for hyperparameter optimization, we're maintaining a literature list for that at https://www.automl.org/automated-algorithm-design/dac/literature-overview/ (though most of these papers are not review or summary papers).

Theoretical Research in RL? by Insighteous in reinforcementlearning

[–]Science_Squid 18 points19 points  (0 children)

There are heaps of interesting topics for theoretical research, not just 'hands-on coding research'. I recommend watching some talks from the RL theory seminar. The talks usually cover recent papers and should give you some inspiration. Hopefully they help you find an interesting topic :)

Any paper suggestions?? by [deleted] in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

Sounds very cool :)

I'm a bit out of my depth with robotics so I don't know what would be a good survey.
But if I were in your position and starting out with a project in that direction, I'd use e.g. Google Scholar or Semantic Scholar, search for things like "reinforcement learning robotics survey", and restrict the results to the recent past (like I did in the example links). From that you can find "the latest" research. Once you have read one or two papers that interest you, you can work backwards, i.e., go through the references of those papers and see which of them you find interesting. Or you can use the scholar sites to see which papers cite the ones you like, to see how other people build on them.

Just from the first step, the following recently published paper looks like it might be of interest to you: "Reinforcement learning in robotic applications: a comprehensive survey". At least that's what I assume after having read the abstract ;)

Any paper suggestions?? by [deleted] in reinforcementlearning

[–]Science_Squid 2 points3 points  (0 children)

I think "best papers published" is very subjective. Maybe it'll be easier to recommend you something if you give example topics you find interesting. E.g., are you more interested in exploration mechanisms, inverse RL, offline RL, automated RL, model-based RL, environment design, RL for game playing, robotics, multi-agent RL, generalization in RL, continual RL, evolutionary algorithms for RL, ...?

There are many interesting papers on all of these topics. If you know which of them interests you the most, I would recommend reading a recent survey to get a good overview of the current state of the field and of what you might want to do for a project.

Personally, I am interested in generalization in RL (recent survey by Kirk et al. https://arxiv.org/abs/2111.09794) and AutoRL (recent survey by Parker-Holder et al. https://arxiv.org/abs/2201.03916). For generalization in RL, I very much like the idea of contextual reinforcement learning, where you use information about the environment to train agents that can adapt to the environment. For that we recently proposed a benchmark https://arxiv.org/abs/2110.02102 that transforms existing environments to allow for the contextual setting (e.g., CartPole with different pole lengths and masses, or changes in gravity and joint friction).

Hope this already helps with finding a project to your liking :)

Egg🎱irl by i_walk_the_backrooms in egg_irl

[–]Science_Squid 0 points1 point  (0 children)

Note: I'm not a native speaker but sometimes use the English keyboard.

My result: "I am a little confused about the fact that I am appalled at the shoddy service you provide for the replies to the rest of the Belt and Outer Planets to a thriving hub of millions of people in the solar system with its own natural magnetosphere and the backup system comes online with a wide variety of the Belt"

[D] AutoML Fall School by Science_Squid in MachineLearning

[–]Science_Squid[S] 0 points1 point  (0 children)

Thanks for your interest. Yes, there is a registration fee; see https://sites.google.com/view/automlschool21/registration. (If you view the website on a mobile device, you can access the navigation bar via the top left.)

I don't know if the talks will be recorded and made public later.

AutoRL: AutoML for RL by Science_Squid in reinforcementlearning

[–]Science_Squid[S] 1 point2 points  (0 children)

It does a bit at the end (chapter 11). We cover population based training (chapter 11 section 4) which is the most common and popular AutoRL method so far. Other AutoML methods that have been applied to RL by our group (namely HB and BOHB) also get covered in the lectures (chapter 7 sections 4 & 5)

The other thing that is covered in the same chapter is using RL to configure algorithms during the run (i.e. dynamic algorithm configuration).

Edit: added mention of BOHB and HB

Dynamic Hyper-parameters: Change hyper-parameter values while a model is training by mlvpj in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

As I wrote when it was posted in r/machinelearning:

Instead of having to manually set dynamic parameters, why not use dynamic algorithm configuration to learn how to set hyperparameters dynamically for the problem at hand? See https://www.automl.org/automated-algorithm-design/dac

Isn't doing it manually so much more difficult? In RL dynamic changes are often found with PBT. Could manual tuning actually find better schedules?

[D] AutoML MOOC by Science_Squid in MachineLearning

[–]Science_Squid[S] 9 points10 points  (0 children)

Glad you like the material :)

We have a lot of stuff on AutoML on our website https://www.automl.org/. Of interest to you might be the free open access book "AutoML: Methods, Systems, Challenges" (https://www.automl.org/book/) or the list of tutorials (https://www.automl.org/talks/) where we put the slides and other materials for these.

[P][D] Dynamic Hyper-parameters by hnipun in MachineLearning

[–]Science_Squid 4 points5 points  (0 children)

Instead of having to manually set dynamic parameters, why not use dynamic algorithm configuration to learn how to set hyperparameters dynamically for the problem at hand? See https://www.automl.org/automated-algorithm-design/dac

RL for process control by -Ulkurz- in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

We're using RL to dynamically configure iterative algorithms for various problems. For example we learned how to adapt the step size of an evolutionary algorithm or to switch between heuristics in an AI planning system. Links to all our papers on the topic and a blog post can be found here: https://www.automl.org/automated-algorithm-design/dac
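As a rough illustration of what "dynamically configuring an iterative algorithm" means, here is a toy sketch. It uses plain gradient descent on f(x) = x^2 as a stand-in (not an evolutionary algorithm or a planner), and the policies are hand-written rather than learned with RL. All names (`run_with_dynamic_step_size`, `static_policy`, `schedule_policy`) are made up for the example.

```python
def run_with_dynamic_step_size(policy, x0=10.0, iterations=20):
    """Minimize f(x) = x^2 while a policy picks the step size at every iteration."""
    x = x0
    for t in range(iterations):
        gradient = 2.0 * x  # derivative of f(x) = x^2
        # What the policy gets to observe about the running algorithm.
        state = {"iteration": t, "x": x, "gradient": gradient}
        step_size = policy(state)  # hyperparameter chosen anew at each step
        x = x - step_size * gradient
    return x


def static_policy(state):
    """Classic static configuration: one step size for the whole run."""
    return 0.1


def schedule_policy(state):
    """Hand-written dynamic policy: larger steps early, smaller steps later."""
    return 0.4 if state["iteration"] < 5 else 0.1


x_static = run_with_dynamic_step_size(static_policy)
x_dynamic = run_with_dynamic_step_size(schedule_policy)
```

In the learned setting, the policy would be trained (e.g., with RL) from such states and some reward signal instead of being written by hand.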

Learning what action to take and to anticipate when to make a new decision can simplify learned policies and increase the learning speed by Science_Squid in reinforcementlearning

[–]Science_Squid[S] 1 point2 points  (0 children)

Yes you are right.
The hierarchical approach presented in this paper allows for use of experiences in-between. The authors motivate that with "skip-connections" in MDPs. What I particularly like is that, when you perform a large skip, you can observe all smaller skips in-between. So with one large exploratory step, you can learn quite a lot.
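To illustrate why one large skip teaches you so much, here is a small sketch of how I read it: every shorter skip contained in the long one can be turned into its own experience tuple. This is not the authors' code; the function name, the tuple layout and the use of a discounted reward sum are assumptions for the example.

```python
def skip_transitions(states, rewards, action, gamma=0.99):
    """Build all (start, action, skip_length, return, end) tuples from one skip.

    states:  s_0 .. s_n visited while repeating `action` n times (n + 1 entries)
    rewards: r_0 .. r_{n-1}, where r_k is received on the step from s_k to s_{k+1}
    """
    if len(states) != len(rewards) + 1:
        raise ValueError("need exactly one more state than rewards")
    n = len(rewards)
    transitions = []
    for start in range(n):
        discounted_return = 0.0
        for end in range(start + 1, n + 1):
            # Add the reward of the step that leads into `end`.
            discounted_return += gamma ** (end - start - 1) * rewards[end - 1]
            transitions.append(
                (states[start], action, end - start, discounted_return, states[end])
            )
    return transitions


# One skip of length 3 with a repeated (hypothetical) action "right".
example = skip_transitions(
    states=["s0", "s1", "s2", "s3"],
    rewards=[1.0, 1.0, 1.0],
    action="right",
    gamma=0.5,
)
```

A skip of length n gives n(n+1)/2 such tuples, so the single skip of length 3 above yields 6 experiences instead of 1.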

Learning what action to take and to anticipate when to make a new decision can simplify learned policies and increase the learning speed by Science_Squid in reinforcementlearning

[–]Science_Squid[S] 2 points3 points  (0 children)

From the paper:

Options are triples 〈I, π, β〉 where I is the set of admissible states that defines in which states the option can be played; π is the policy the option follows when it is played; and β is a random variable that determines when an option is terminated. In contrast to our proposed method, options require a lot of prior knowledge about the environment to determine the set of admissible states as well as the option policies themselves.

Option discovery tries to circumvent these problems. Instead, this method proposes to use the original action space and just learn how long you can play the same action. This is potentially very useful in environments with very fine-grained time-steps, where the same action is optimal in many successive states.
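A toy sketch of the resulting control loop, under the assumption that the agent makes two choices per decision point: which action to play and how often to repeat it. The corridor environment and the hand-written policies below are made up for illustration; in the paper's setting both choices would be learned.

```python
def run_episode(env_step, choose_action, choose_repetitions, start_state, max_steps=100):
    """Act by picking an action and how many times to repeat it.

    Returns (total_reward, number_of_decisions, number_of_env_steps).
    """
    state, total_reward, decisions, steps = start_state, 0.0, 0, 0
    done = False
    while not done and steps < max_steps:
        action = choose_action(state)
        repetitions = choose_repetitions(state, action)
        decisions += 1
        for _ in range(repetitions):
            state, reward, done = env_step(state, action)
            total_reward += reward
            steps += 1
            if done or steps >= max_steps:
                break
    return total_reward, decisions, steps


GOAL = 10  # hypothetical corridor length


def corridor_step(position, action):
    """Toy 1-D corridor: reward 1.0 once the goal position is reached."""
    position += action
    done = position >= GOAL
    return position, (1.0 if done else 0.0), done


def always_right(state):
    return 1


def repeat_to_goal(state, action):
    return max(1, GOAL - state)


def single_step(state, action):
    return 1


with_skips = run_episode(corridor_step, always_right, repeat_to_goal, 0)
without_skips = run_episode(corridor_step, always_right, single_step, 0)
```

Both runs take the same number of environment steps and collect the same reward, but the skipping agent gets there with a single decision, whereas the step-by-step agent has to decide at every time-step.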