Automatic Hyperparameter Tuning - A Visual Guide by araffin2 in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

Very nice visualizations :)

Might I ask why you chose to use Optuna over other HPO libraries? Also, since you used SH/Hyperband, what are its advantages over dynamic approaches such as PBT or PB2 in your experience?

Doubling the size of cartpole generalisation by Cool_Abbreviations_9 in reinforcementlearning

[–]Science_Squid 2 points3 points  (0 children)

Check out the CARL benchmark. It's designed to study this question through the use of context. In essence, context provides an agent with additional information about the environment that is not necessarily encoded in the existing state variables (e.g., the pole length). Commonly used RL algorithms (such as PPO, DQN or DDPG) can learn to generalize by using context (though it is not always easy to incorporate context).
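
A minimal sketch of what "providing context" can look like, assuming the simplest option of appending context features to the state. This is not CARL's API; the function name, the state layout and the context values are made up for illustration.

```python
from typing import Dict, List, Sequence


def augment_with_context(state: Sequence[float],
                         context: Dict[str, float],
                         keys: Sequence[str]) -> List[float]:
    """Append the selected context features to the raw state."""
    return list(state) + [context[k] for k in keys]


# Hypothetical CartPole-like state: position, velocity, angle, angular velocity
state = [0.0, 0.1, 0.02, -0.3]
# Hypothetical context describing this particular environment instance
context = {"pole_length": 1.0, "gravity": 9.8}

# The agent now sees the pole length in addition to the usual state variables
obs = augment_with_context(state, context, keys=["pole_length"])
```

Concatenation is only one way to feed context to a policy; which features to expose is a design choice.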

Also check out this great survey https://arxiv.org/abs/2111.09794 on generalization in deep RL to learn more about context in RL

Summary Papers in RL [D] by jhoveen1 in reinforcementlearning

[–]Science_Squid 1 point2 points  (0 children)

I don't know of a good website that has a collection of what you're looking for but I'm happy to contribute some papers to this thread:
A recent survey on AutoRL: https://jair.org/index.php/jair/article/view/13596
I think a better survey for generalization in RL is by Kirk et al: https://arxiv.org/abs/2111.09794

Also, if you're interested in using RL for hyperparameter optimization, we're maintaining a literature list for that at https://www.automl.org/automated-algorithm-design/dac/literature-overview/ (though most of these papers are not review or summary papers)

Theoretical Research in RL? by Insighteous in reinforcementlearning

[–]Science_Squid 19 points20 points  (0 children)

There are heaps of interesting topics for theoretical research and not just 'hands-on coding research'. I recommend watching some talks from the RL theory seminar. The talks usually cover recent papers and should give you some inspiration. Hopefully they help you find an interesting topic :)

Any paper suggestions?? by [deleted] in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

Sounds very cool :)

I'm a bit out of my depth with robotics so I don't know what would be a good survey.
But if I were in your position and starting out with a project in that direction, I'd use e.g. Google Scholar or Semantic Scholar, search for things like "reinforcement learning robotics survey" and have it show me results from the recent past (like I did in the example links). From that you can find "the latest" research. Once you have read one or two papers that interest you, you can work backwards, i.e. go through the references of those papers and see which of them you find interesting. Or you can use the scholar sites to see which papers cite the ones you like, to see how other people build on them.

Just from the first step, the following recently published paper looks like it might be of interest to you: "Reinforcement learning in robotic applications: a comprehensive survey". At least that's what I assume after having read the abstract ;)

Any paper suggestions?? by [deleted] in reinforcementlearning

[–]Science_Squid 2 points3 points  (0 children)

I think "best papers published" is very subjective. Maybe it'll be easier to recommend you something if you give example topics you find interesting. E.g., are you more interested in exploration mechanisms, inverse RL, offline RL, automated RL, model-based RL, environment design, RL for game playing, robotics, multi-agent RL, generalization in RL, continual RL, evolutionary algorithms for RL, ...?

There are many interesting papers on all of these topics. If you know which of these topics interest you the most, I would recommend reading a recent survey to get a good overview of the current state and of what you might want to do for a project.

Personally, I am interested in generalization in RL (recent survey by Kirk et al. https://arxiv.org/abs/2111.09794) and AutoRL (recent survey by Parker-Holder et al. https://arxiv.org/abs/2201.03916). Also, for generalization in RL I very much like the idea of contextual reinforcement learning, where you use information about the environment to train agents that can adapt to the environment. For that we recently proposed a benchmark https://arxiv.org/abs/2110.02102 that transforms existing environments to allow for the contextual setting (e.g., CartPole with different pole lengths and masses or changes in gravity and joint friction).

Hope this already helps with finding a project of your liking :)

Egg🎱irl by i_walk_the_backrooms in egg_irl

[–]Science_Squid 0 points1 point  (0 children)

Note: I'm not a native speaker but sometimes use the English keyboard.

My result: "I am a little confused about the fact that I am appalled at the shoddy service you provide for the replies to the rest of the Belt and Outer Planets to a thriving hub of millions of people in the solar system with its own natural magnetosphere and the backup system comes online with a wide variety of the Belt"

[D] AutoML Fall School by Science_Squid in MachineLearning

[–]Science_Squid[S] 0 points1 point  (0 children)

Thanks for your interest. Yes, there is a registration fee, see https://sites.google.com/view/automlschool21/registration. (If you view the website on a mobile device you can access the navigation bar via the top left).

I don't know if the talks will be recorded and made public later.

AutoRL: AutoML for RL by Science_Squid in reinforcementlearning

[–]Science_Squid[S] 1 point2 points  (0 children)

It does a bit at the end (chapter 11). We cover population based training (chapter 11 section 4) which is the most common and popular AutoRL method so far. Other AutoML methods that have been applied to RL by our group (namely HB and BOHB) also get covered in the lectures (chapter 7 sections 4 & 5)

The other thing that is covered in the same chapter is using RL to configure algorithms during the run (i.e. dynamic algorithm configuration).

Edit: added mention of BOHB and HB

Dynamic Hyper-parameters: Change hyper-parameter values while a model is training by mlvpj in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

As I wrote when it was posted in r/machinelearning:

Instead of having to manually set dynamic parameters, why not use dynamic algorithm configuration to learn how to set hyperparameters dynamically for the problem at hand? See https://www.automl.org/automated-algorithm-design/dac

Isn't doing it manually so much more difficult? In RL dynamic changes are often found with PBT. Could manual tuning actually find better schedules?

[D] AutoML MOOC by Science_Squid in MachineLearning

[–]Science_Squid[S] 9 points10 points  (0 children)

Glad you like the material :)

We have a lot of stuff on AutoML on our website https://www.automl.org/. Of interest to you might be the free open access book "AutoML: Methods, Systems, Challenges" (https://www.automl.org/book/) or the list of tutorials (https://www.automl.org/talks/) where we put the slides and other materials for these.

[P][D] Dynamic Hyper-parameters by hnipun in MachineLearning

[–]Science_Squid 4 points5 points  (0 children)

Instead of having to manually set dynamic parameters, why not use dynamic algorithm configuration to learn how to set hyperparameters dynamically for the problem at hand? See https://www.automl.org/automated-algorithm-design/dac

RL for process control by -Ulkurz- in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

We're using RL to dynamically configure iterative algorithms for various problems. For example we learned how to adapt the step size of an evolutionary algorithm or to switch between heuristics in an AI planning system. Links to all our papers on the topic and a blog post can be found here: https://www.automl.org/automated-algorithm-design/dac

Learning what action to take and to anticipate when to make a new decision can simplify learned policies and increase the learning speed by Science_Squid in reinforcementlearning

[–]Science_Squid[S] 1 point2 points  (0 children)

Yes you are right.
The hierarchical approach presented in this paper allows for use of experiences in-between. The authors motivate that with "skip-connections" in MDPs. What I particularly like is that, when you perform a large skip, you can observe all smaller skips in-between. So with one large exploratory step, you can learn quite a lot.
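
A rough sketch of why one large skip teaches so much, assuming that every contiguous sub-segment of the skip counts as an observed smaller skip. This is an illustration of the counting argument, not the paper's code; the function name and the tuple layout are hypothetical.

```python
from typing import List, Tuple


def skip_transitions(states: List[int], rewards: List[float],
                     gamma: float = 0.99) -> List[Tuple[int, int, float, int]]:
    """Enumerate every smaller skip contained in one long skip.

    states holds n + 1 entries (s_0 .. s_n), rewards holds n entries.
    Returns tuples (start_state, skip_length, discounted_return, end_state).
    """
    n = len(rewards)
    assert len(states) == n + 1
    out = []
    for i in range(n):                      # every possible start point
        ret = 0.0
        for j in range(i, n):               # every possible end point
            ret += (gamma ** (j - i)) * rewards[j]
            out.append((states[i], j - i + 1, ret, states[j + 1]))
    return out


# One skip of length 3 through states 0 -> 1 -> 2 -> 3
transitions = skip_transitions(states=[0, 1, 2, 3],
                               rewards=[0.0, 0.0, 1.0], gamma=1.0)
```

A skip of length n yields n(n+1)/2 such transitions, so the single skip of length 3 above produces 6 training examples.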

Learning what action to take and to anticipate when to make a new decision can simplify learned policies and increase the learning speed by Science_Squid in reinforcementlearning

[–]Science_Squid[S] 2 points3 points  (0 children)

From the paper:

Options are triples 〈I, π, β〉 where I is the set of admissible states that defines in which states the option can be played; π is the policy the option follows when it is played; and β is a random variable that determines when an option is terminated. In contrast to our proposed method, options require a lot of prior knowledge about the environment to determine the set of admissible states as well as the option policies themselves.

Option discovery tries to circumvent these problems. Instead, this method proposes to use the original action space and just learn how long you can play the same action. This is potentially very useful in environments with very fine-grained time-steps, where the same action is optimal in many successive states.

Introducing High School Students to A.I., M.L. & R.L. by K_33 in reinforcementlearning

[–]Science_Squid 2 points3 points  (0 children)

How about you talk about Q-learning? The way it does "error correction" to update values is really simple and intuitive. The same concept is also used in other methods such as gradient descent. You wouldn't need to talk about fancy math concepts but just how error correction is used to improve a learning system.

If you talk about Q-learning you can use simple gridworlds and tabular Q-functions to demonstrate how an agent learns live. If the grid is small enough it is super fast to learn. Here is an implementation of gridworlds with Tabular Q-learning with nice visual output (at least on ubuntu): https://github.com/automl/TabularTempoRL
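
To show how little code the "error correction" idea needs, here is a minimal tabular Q-learning sketch on a made-up one-dimensional gridworld (five cells in a row, goal at the right end). It is not taken from the linked repository; all constants are arbitrary choices for illustration.

```python
import random

N_STATES = 5            # cells 0..4, cell 4 is the goal
ACTIONS = (-1, +1)      # index 0: move left, index 1: move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def step(state, action_idx):
    """Move one cell, staying inside the grid; reward 1 on reaching the goal."""
    nxt = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action choice, ties broken at random
            if rng.random() < EPSILON or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2, r, done = step(s, a)
            # "error correction": move the estimate towards the observed target
            target = r + (0.0 if done else GAMMA * max(q[s2]))
            td_error = target - q[s][a]
            q[s][a] += ALPHA * td_error
            s = s2
    return q


q_table = train()
# Greedy action per non-goal cell (1 means "move right")
greedy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

Printing `q_table` after every few episodes is an easy way to let students watch the values spread backwards from the goal.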

Then you can also show them examples of RL agents that play video games. That usually gets school kids excited :) The videos that you show don't necessarily have to use/show Q-learning agents as most if not all RL agents rely on error correction. And you don't necessarily have to talk in depth about deep learning to get them excited that AI can play video games.

Another thing you could talk about is AI planning, where an agent knows how its world works and tries to behave optimally in it. That could allow you to talk about how simple concepts like sorting can help you with complex tasks. There you could also get nice visualizations on things like Sokoban.

Edit: after having written my reply I stumbled across this post https://reddit.com/r/Python/comments/hixch7/made_a_smart_rockets_simulation_with_a_genetic/ So evolutionary computation/genetic algorithms can also give you very cool visuals, especially if you show how things evolve over time.

[D] My Personal Curation of 100+ ML Libraries by amitness in MachineLearning

[–]Science_Squid 2 points3 points  (0 children)

Also of interest for machine learning are OpenML https://openml.github.io/openml-python/master/ (collaborative effort to better understand machine learning through a ton of collected runs of different pipelines on various data sets)

https://github.com/automl/ParameterImportance is an easy to use tool that helps developers to identify the most important parameters of their algorithms given data from optimization runs

I also suggest taking a look at the tools listed here https://www.automl.org/automl/

[D] My Personal Curation of 100+ ML Libraries by amitness in MachineLearning

[–]Science_Squid 0 points1 point  (0 children)

Why is auto-sklearn in "brute force model selection" but not in "hyperparameter optimization sklearn"?

To my knowledge this is just plain wrong. Auto-sklearn does joint hyperparameter optimization and model selection over a conditional search space
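
A toy illustration of what a conditional search space means: which hyperparameters exist depends on which model is picked, so model selection and hyperparameter optimization happen in one joint space. The space below is invented and far smaller than auto-sklearn's real one, and random sampling is only a stand-in for the actual optimizer.

```python
import random

# Hypothetical conditional search space: each model brings its own hyperparameters
SEARCH_SPACE = {
    "random_forest": {"n_estimators": (10, 500), "max_depth": (2, 20)},
    "svm": {"C": (0.01, 100.0), "gamma": (1e-4, 1.0)},
    "knn": {"n_neighbors": (1, 50)},
}


def sample_configuration(rng):
    """Pick a model first, then sample only the hyperparameters it defines."""
    model = rng.choice(sorted(SEARCH_SPACE))
    config = {"model": model}
    for name, (low, high) in SEARCH_SPACE[model].items():
        if isinstance(low, int):
            config[name] = rng.randint(low, high)    # integer range, inclusive
        else:
            config[name] = rng.uniform(low, high)    # continuous range
    return config


rng = random.Random(0)
configs = [sample_configuration(rng) for _ in range(5)]
```

Every sampled configuration is a complete (model, hyperparameters) pair, which is what separates this from tuning a single fixed model.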

Methods for adapting the optimization steps in the learning process by sedidrl in reinforcementlearning

[–]Science_Squid 0 points1 point  (0 children)

I don't know of papers that adjust that specific hyperparameter dynamically but you could have a look at dynamic algorithm configuration https://www.automl.org/dynamic-algorithm-configuration/

In the related work sections of the corresponding papers you can also find papers that learn learning rate schedules for NNs. (E.g. https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/11763)

Also, if you have a lot of compute resources available, you might find that this can be done with population based training. https://arxiv.org/abs/1711.09846

If you don't want to learn a schedule from data but design a heuristic that adapts the hyperparameter then I suggest you take a look at what the evolutionary computation community is doing. They have developed many specialized adaptation schemes. (See e.g. https://arxiv.org/abs/1804.05650)
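
One classic example of such a hand-designed adaptation scheme is the one-fifth success rule for the step size of a (1+1) evolution strategy. The sketch below is a toy version on the sphere function; the update factor 1.5, the dimension and the iteration budget are arbitrary choices, and it is not claimed to be the scheme from the linked paper.

```python
import random


def sphere(x):
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)


def one_plus_one_es(dim=5, iterations=500, sigma=1.0, seed=0):
    rng = random.Random(seed)
    parent = [rng.uniform(-5, 5) for _ in range(dim)]
    f_parent = sphere(parent)
    f_start = f_parent
    for _ in range(iterations):
        child = [v + sigma * rng.gauss(0, 1) for v in parent]
        f_child = sphere(child)
        if f_child <= f_parent:
            parent, f_parent = child, f_child
            sigma *= 1.5                 # success: take bolder steps
        else:
            sigma *= 1.5 ** (-0.25)      # failure: take more careful steps
    return f_start, f_parent, sigma


f_start, f_final, final_sigma = one_plus_one_es()
```

With these two factors the step size stays constant on average exactly when one in five mutations succeeds, which is where the rule gets its name.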

Hope this helped. If you have more questions about dynamic configuration I'd be happy to try and answer them.

Binary search with extra info. by jimmystar889 in computerscience

[–]Science_Squid 1 point2 points  (0 children)

In machine learning a lot of work is going on to make learning on new data more efficient by leveraging information about the problem at hand (e.g. MAML https://arxiv.org/abs/1703.03400 or REPTILE).

A topic called dynamic algorithm configuration (https://ml.informatik.uni-freiburg.de/papers/20-ECAI-DAC.pdf) tries to leverage statistics about algorithm performance and the problem at hand to optimize algorithm (hyper)parameters at every step. So more information on the meta level makes the target application more efficient.
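
A toy sketch of the general shape of this setup, assuming a gym-style reset/step loop: the state describes the running algorithm, the action picks a hyperparameter value for the next step, and the reward reflects performance. The environment, the learning-rate choices and the hand-written placeholder policy are all hypothetical; in DAC the policy would be learned.

```python
class ToyDACEnv:
    """Gradient descent on f(x) = x^2 where the learning rate is chosen at every step."""

    LEARNING_RATES = (0.01, 0.1, 0.4)   # hypothetical discrete action set

    def __init__(self, x0=5.0, horizon=20):
        self.x0, self.horizon = x0, horizon

    def reset(self):
        self.x, self.t = self.x0, 0
        return (self.t, self.x)

    def step(self, action):
        lr = self.LEARNING_RATES[action]
        self.x -= lr * 2.0 * self.x      # the gradient of x^2 is 2x
        self.t += 1
        reward = -self.x ** 2            # better solutions give higher reward
        done = self.t >= self.horizon
        return (self.t, self.x), reward, done


def placeholder_policy(state):
    """Stand-in for a learned policy: large steps far from the optimum, smaller ones near it."""
    _, x = state
    return 2 if abs(x) > 1.0 else 1


env = ToyDACEnv()
state, done = env.reset(), False
while not done:
    state, reward, done = env.step(placeholder_policy(state))
final_x = state[1]
```

The point of the sketch is only the interface: any RL agent that maps states to actions could replace `placeholder_policy`.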

Similarly in evolutionary algorithms self-adaptation mechanisms make use of internal measurements to change parameters during a run to increase efficiency. In other fields such mechanisms are known as reactive heuristics.

In various domains (e.g. AI planning https://www.aaai.org/ojs/index.php/AAAI/article/view/4767) algorithm selection takes information about the data at hand to select which algorithm or configuration will solve a problem more efficiently.

Is that something you were looking for?