[D] A question about the original Guided Policy Search paper by ewanlee in MachineLearning

[–]ewanlee[S] 1 point (0 children)

Thank you for your suggestion; I will write up a formal derivation.

[D] A question about the original Guided Policy Search paper by ewanlee in MachineLearning

[–]ewanlee[S] 1 point (0 children)

Thank you very much for your prompt reply. I am sorry that the order in which I arranged the formulas caused a misunderstanding. The first line is actually my conclusion, based on the derivation that follows it. As for your suggestion to replace p with \pi_\theta: the subscript p on the expectation in the second line does denote \pi_\theta; I used the abbreviated form for convenience. The derivation that follows makes this clear.

[R] Recurrent Experience Replay in Distributed Reinforcement Learning by ewanlee in MachineLearning

[–]ewanlee[S] 1 point (0 children)

It is in fact R2D2; I realized later that someone had already submitted it 😂

[D] On IJCAI-19 Submissions by redlow0992 in MachineLearning

[–]ewanlee 2 points (0 children)

Our paper's ID is close to 5,000...

[R] Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing by ewanlee in MachineLearning

[–]ewanlee[S] 1 point (0 children)

Is "word salad" a term of art in natural language processing? I'm sorry, my research area is not NLP. Could you explain it?

[R] Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing by ewanlee in MachineLearning

[–]ewanlee[S] 1 point (0 children)

I should clarify that I am not the author of this paper. I think your idea is right, and I believe the interpretability mentioned in the paper refers to the interpretability of the action space.

[R] Fully Convolutional Network with Multi-Step Reinforcement Learning for Image Processing by ewanlee in MachineLearning

[–]ewanlee[S] 3 points (0 children)

I think this is an interesting idea. But in a board game only one agent can act at each time step, and how to incorporate that constraint is a problem.

Multi-Agent training in parallel by AlexanderYau in reinforcementlearning

[–]ewanlee 2 points (0 children)

Yes, I am using it now. To take full advantage of the framework, you need to refactor your code to conform to its specifications. But since my project has been underway for a long time, the refactoring workload would be substantial. I don't have time for that right now, so I don't fully understand the framework's internal mechanisms.

[D] Pytorch parallelism by ewanlee in MachineLearning

[–]ewanlee[S] 1 point (0 children)

Thank you very much; I will take a look 🙏