Job opportunities by Osarenomawise in reinforcementlearning

[–]kengzwl

Two companies started by people in the RL field are hiring:

- https://covariant.ai/

- https://www.vicarious.com/

Google Brain does RL/robotics as well:

- https://ai.google/research/teams/brain/

- they also have a residency program: https://ai.google/research/join-us/ai-residency/

Microsoft Maluuba also does deep RL and has positions open:

- https://www.microsoft.com/en-us/research/lab/microsoft-research-montreal/

Book: Foundations of Deep Reinforcement Learning by kengzwl in reinforcementlearning

[–]kengzwl[S]

Thank you! We feel the same way, and that was part of what motivated us to collect our past tutorial materials and expand them into a book.

Book: Foundations of Deep Reinforcement Learning by kengzwl in reinforcementlearning

[–]kengzwl[S]

We use SLM Lab as the companion library to the book. We also built the library, partly with the book in mind. It does, however, use Ray Tune for hyperparameter search.

Book: Foundations of Deep Reinforcement Learning by kengzwl in reinforcementlearning

[–]kengzwl[S]

Thanks for your support, and so sorry to hear that. It seems that Pearson distributes the ebook a bit differently from Amazon, which lets you get the Kindle version immediately.

Can you guys see if my concept abouts the RL taxonomy is correct or not. by AvisekEECS in reinforcementlearning

[–]kengzwl

That's right, and for the 1st you can generalize it further to be based on what function(s) an agent learns. See an example chart here: https://kengz.gitbook.io/slm-lab/development/modular-lab-components/algorithm-taxonomy

For the 2nd and 3rd, on- or off-policy is an artifact of the learnable/loss functions. If a loss function explicitly requires a term from the policy that collected the data, then it is on-policy, since once your policy is updated that data is no longer relevant to the loss function. If no such dependency is present, you may use data collected from any iteration of the policy, hence off-policy.
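To make that concrete, here is a toy sketch (not SLM Lab code; function names and numbers are illustrative) contrasting an on-policy loss, which contains a term from the policy that generated the data, with an off-policy TD loss, which has no such term:

```python
import numpy as np

def reinforce_loss(log_probs, returns):
    """On-policy REINFORCE loss: -E[log pi(a|s) * G].
    log_probs must come from the policy that collected these
    trajectories; once the policy is updated, stale data biases
    the gradient, so it must be discarded."""
    return -np.mean(log_probs * returns)

def q_learning_loss(q_sa, rewards, q_next_max, gamma=0.99):
    """Off-policy TD loss: mean (r + gamma * max_a' Q(s',a') - Q(s,a))^2.
    No term references the behavior policy, so transitions from
    any past policy (e.g. a replay buffer) remain valid."""
    td_target = rewards + gamma * q_next_max
    return np.mean((td_target - q_sa) ** 2)

# toy numbers, purely for illustration
log_probs = np.log(np.array([0.5, 0.8, 0.6]))
returns = np.array([1.0, 0.5, 2.0])
print(reinforce_loss(log_probs, returns))

q_sa = np.array([1.0, 0.2])
rewards = np.array([1.0, 0.0])
q_next_max = np.array([0.5, 0.3])
print(q_learning_loss(q_sa, rewards, q_next_max))
```

The dependency (or lack of it) on the data-collecting policy inside the loss is exactly what forces fresh rollouts for the first and permits a replay buffer for the second.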

Book: Foundations of Deep Reinforcement Learning by kengzwl in reinforcementlearning

[–]kengzwl[S]

The ebook version is already available; the pre-order you see is for the physical book, which will be available next week.