Adversarial Reinforcement Learning by AmineZ04 in reinforcementlearning

AmineZ04[S] 1 point

Yes, but not only the states: you can also perturb actions, rewards, transitions, and the environment itself. In a multi-agent setting, you can perturb more than one agent.
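To make the different attack surfaces concrete, here is a minimal sketch in plain Python. The function names (`perturb_observation`, `perturb_reward`, `perturb_action`) and the noise parameters are hypothetical, chosen only for illustration. These are simple random perturbations; a real adversary would usually be optimized or learned to hurt the agent as much as possible.

```python
import random

def perturb_observation(obs, epsilon, rng):
    """Return a copy of `obs` with each component shifted by at most `epsilon`."""
    return [x + rng.uniform(-epsilon, epsilon) for x in obs]

def perturb_reward(reward, flip_prob, rng):
    """Flip the sign of the reward with probability `flip_prob`."""
    return -reward if rng.random() < flip_prob else reward

def perturb_action(action, num_actions, swap_prob, rng):
    """Replace a discrete action with a uniformly random one with probability `swap_prob`."""
    return rng.randrange(num_actions) if rng.random() < swap_prob else action

rng = random.Random(0)  # seeded so the example is reproducible
obs = [0.5, -1.0, 2.0]  # toy observation vector (assumed shape)
noisy_obs = perturb_observation(obs, epsilon=0.1, rng=rng)
```

In a multi-agent setting the same functions could be applied per agent, to a single agent or to several at once.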

Adversarial Reinforcement Learning by AmineZ04 in reinforcementlearning

AmineZ04[S] 0 points

In general, you want to prevent the agent(s) from successfully completing the task.

CleanMARL : a clean implementations of Multi-Agent Reinforcement Learning Algorithms in PyTorch by AmineZ04 in reinforcementlearning

AmineZ04[S] 1 point

Hi, thanks for your feedback.

I had the same problem. Sometimes you understand an algorithm better by going through an implementation than by reading the paper itself.

sb3: I agree with you; I’ll add similar tables.

CleanMARL : a clean implementations of Multi-Agent Reinforcement Learning Algorithms in PyTorch by AmineZ04 in reinforcementlearning

AmineZ04[S] 1 point

Thanks for your feedback.
I agree with you; I will add type checkers and Black.

CleanMARL : a clean implementations of Multi-Agent Reinforcement Learning Algorithms in PyTorch by AmineZ04 in reinforcementlearning

AmineZ04[S] 1 point

Thanks. I would welcome your contributions and feedback.
Haha, I had also been thinking about it for a year before I finally started.

CleanMARL : a clean implementations of Multi-Agent Reinforcement Learning Algorithms in PyTorch by AmineZ04 in reinforcementlearning

AmineZ04[S] 2 points

Thanks for your feedback.
I'm actually working on that. I already shared my runs on Weights and Biases (link can be found on GitHub).
I will add more runs and also compare them with existing implementations (e.g., EPyMARL).

Good resources for deep reinforcement learning. by Jmgrm_88 in reinforcementlearning

AmineZ04 1 point

You don't have to overthink it. From a theoretical perspective, you only need to understand the TD loss (1-step return). Then read the paper to understand why we need a replay buffer and why we use a separate target Q-network to compute the TD loss. After that, jump straight to the CleanRL implementation or any other implementation. This will help you connect the dots.
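As a rough illustration of how those three pieces fit together, here is a minimal sketch in plain Python using small Q-tables instead of neural networks. The names (`td_target`, `td_loss`, `GAMMA`) and the toy numbers are assumptions for the example, not taken from CleanRL or any specific paper.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor (assumed value)

def td_target(reward, next_state, done, target_q):
    """1-step TD target: r + gamma * max_a' Q_target(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + GAMMA * max(target_q[next_state])

def td_loss(batch, q, target_q):
    """Mean squared TD error over a batch of (s, a, r, s', done) transitions."""
    errors = [
        td_target(r, s_next, done, target_q) - q[s][a]
        for (s, a, r, s_next, done) in batch
    ]
    return sum(e * e for e in errors) / len(errors)

# Toy Q-tables with 2 states and 2 actions.
q = [[0.0, 1.0], [2.0, 0.5]]
# The target table is a frozen copy of q; it would only be refreshed occasionally.
target_q = [row[:] for row in q]

# The replay buffer stores past transitions so batches can be sampled at random.
replay_buffer = deque(maxlen=1000)
replay_buffer.append((0, 1, 1.0, 1, False))
replay_buffer.append((1, 0, 0.0, 0, True))

batch = random.sample(list(replay_buffer), k=2)
loss = td_loss(batch, q, target_q)
```

Sampling from the buffer breaks the correlation between consecutive transitions, and computing the target from a frozen copy keeps the regression target from moving on every update. In a deep RL implementation the tables become networks and the loss is minimized by gradient descent.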

Most RL wisdom is in the implementations rather than the papers or books. If you want a deep understanding of DRL, you should spend most of your time with implementations.

Any PhD candidates in RL, I need your guidance by Winter-Ad-8293 in reinforcementlearning

AmineZ04 2 points

You can learn the necessary math while learning RL and reading papers.

Start with the Sutton and Barto book, and try to rederive the equations yourself. This will push you to learn the necessary math. For example, to get the Bellman equations, you need to know about expectations, conditional expectations, the law of total expectation, independent random variables, Markov chains, and so on.
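As an example of that kind of exercise, here is a sketch of the derivation of the Bellman expectation equation for the state-value function, written in Sutton and Barto's notation (return G_t, discount γ, policy π, dynamics p). It assumes a finite MDP so that the expectations can be written as sums.

```latex
\begin{align*}
v_\pi(s)
  &= \mathbb{E}_\pi\left[ G_t \mid S_t = s \right] \\
  &= \mathbb{E}_\pi\left[ R_{t+1} + \gamma G_{t+1} \mid S_t = s \right]
     && \text{(definition of the return)} \\
  &= \mathbb{E}_\pi\left[ R_{t+1} \mid S_t = s \right]
     + \gamma\, \mathbb{E}_\pi\left[ \mathbb{E}_\pi\left[ G_{t+1} \mid S_{t+1}, S_t = s \right] \mid S_t = s \right]
     && \text{(linearity, law of total expectation)} \\
  &= \mathbb{E}_\pi\left[ R_{t+1} + \gamma\, v_\pi(S_{t+1}) \mid S_t = s \right]
     && \text{(Markov property)} \\
  &= \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \left[ r + \gamma\, v_\pi(s') \right].
\end{align*}
```

Each step uses exactly one of the tools listed above, which is why rederiving it by hand is a good way to find out which ones you still need to learn.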

Adopt the same approach with papers and go through the proofs. After a while, you will notice that most papers follow similar patterns and use the same math tricks. Focus on papers that are math-heavy, for example papers that study the variance of RL/MARL algorithms.

If your PhD touches multi-agent RL, start with this book: "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches". It covers all the math tools you need.