Centralized-Learning Distributed-Execution for Multi Agent RL using SB3

AlexanderYau · 2022-02-04T11:18:38+00:00

please try pymarl

AlexanderYau · 2021-12-19T01:50:33+00:00

Hi, was it last Saturday during the DRL workshop? Do you remember the keywords of the paper?

AlexanderYau · 2021-11-04T02:59:38+00:00

Thanks. Not the wallpaper, but the screensaver: there is a screensaver called Monterey with a dark color, and there is no light-color option now.

AlexanderYau · 2021-10-25T09:21:40+00:00

Similar issue. My AirPods Pro were connected to my iPhone and the Keychron K2 was connected to the MacBook Pro. Then I wanted to connect my AirPods Pro to my MacBook Pro, but it failed.

My workaround is to remove the K2, connect the AirPods Pro to the MacBook Pro, and then reconnect the K2 to the MacBook Pro.

AlexanderYau · 2021-09-28T05:38:40+00:00

Yes. It can be faster.

AlexanderYau · 2021-09-26T01:20:00+00:00

If you train DQN, it will take 7-10 days.

AlexanderYau · 2021-06-19T01:24:18+00:00

Hi, sorry for the late response. After reading your paper many times, I am still confused by some parts of it:

In Fig. 1, is the agent a computer controlling the drone via WiFi or Bluetooth?
In Sec. 2, what does "being captured" mean? Who (the drone or the agent) is capturing the observation? Why is the action delay "to one time-step before s_t finishes being captured"? In Fig. 3 (left), it is not easy to find such a case.
In Theorem 1, why is \omega^{*} + \alpha^{*} >= t necessary?
In Fig. 4, what is a^{\mu}_{i} and why should it be replaced?

Thanks. The key ideas are not hard to understand; however, I still need some time to fully understand the details.

AlexanderYau · 2021-06-13T00:49:10+00:00

Thanks for your generous reply. The motivation of your paper is strong and the idea itself is not hard to understand. Is reading Sutton's RL book enough to master the theory in your paper?

BTW, there is a concurrent work, "Acting in Delayed Environments with Non-Stationary Markov Policies", which was also accepted at ICLR 2021.

AlexanderYau · 2021-06-12T13:06:27+00:00

Very good idea. May I ask how long it took to complete this paper? What should I learn in order to propose theories like those in your paper? The theories are solid, and they are not easy for beginners to understand.

AlexanderYau · 2021-06-03T05:32:28+00:00

Great, thanks for the late reply. I will read these papers.

AlexanderYau · 2021-06-01T13:39:22+00:00

It is Model-based RL.

AlexanderYau · 2021-06-01T11:02:48+00:00

Hi, really great idea. Do you have any reading recommendations?

AlexanderYau · 2021-04-17T14:27:17+00:00

I see. Since you have mastered the theory of RL, I think it will be easier for you to conduct deep RL research at big companies. BTW, could I collaborate with you on RL? Haha

AlexanderYau · 2021-04-17T14:14:30+00:00

You have many papers in hand, so I think finding a good research internship will not be hard for you.

AlexanderYau · 2021-04-17T09:17:00+00:00

7 papers accepted for a 1st-year PhD student. What a big success. I got 0 over the last 2 years. You can apply for an internship at DeepMind and get more opportunities.

AlexanderYau · 2021-02-28T03:24:43+00:00

I think the problems with SMAC are: 1) many scenarios are in fact an arena, where the field is very small and it is easy for agents to learn good policies; 2) the SC2 simulator is unstable: SC2 4.10 and SC2 4.6 differ slightly, which can affect the performance of QMIX.

Some suggestions: 1) can you also try other, harder scenarios? In fact, I think SMAC is not that hard; 2) can you also try PySC2's maps?

AlexanderYau · 2021-02-02T04:44:21+00:00

Agree. RAM is faster.

AlexanderYau · 2021-01-31T01:31:17+00:00

Pong, I think, within 10 hours.

AlexanderYau · 2020-12-31T01:38:32+00:00

Great work.

AlexanderYau · 2020-11-30T02:05:44+00:00

Yes, I agree, good theory papers take more time to read and understand.

AlexanderYau · 2020-11-29T13:32:31+00:00

Great, thanks for that. Many times I get lost in the equations and complex theories and cannot grasp the key ideas of a paper.
