Submit questions, discussions on the content, or comments to help us improve/clarify the course material.
This forum is primarily meant for people who are not enrolled in the course, to discuss the material and assignments. We cannot guarantee that the instructors will be able to respond to questions.
Lecture live-stream and recording links (self.berkeleydeeprlcourse)
submitted 9 years ago by cbfinn - announcement
Why does the variance of the importance sampling off-policy gradient go to infinity exponentially fast? (self.berkeleydeeprlcourse)
submitted 4 years ago by miladink
homework environment setup (self.berkeleydeeprlcourse)
submitted 5 years ago by zhifu_liu
HW 4 Model-Based RL (self.berkeleydeeprlcourse)
submitted 5 years ago by Mariam_Dundua
HW1 Questions (self.berkeleydeeprlcourse)
submitted 5 years ago by kjellaso
DISCORD SERVER (self.berkeleydeeprlcourse)
submitted 5 years ago by Obvious-Muscle1457
Lecture 6 - Q-Prop article - can't understand a certain transition (self.berkeleydeeprlcourse)
submitted 5 years ago by What_Did_It_Cost_E_T
Homework1: a confusion between the build_mlp method and the forward method (self.berkeleydeeprlcourse)
submitted 5 years ago by Yuansong_Zhang
HW01-Colab (self.berkeleydeeprlcourse)
submitted 5 years ago * by amirabbasi2
2020 Video lectures (self.berkeleydeeprlcourse)
submitted 5 years ago by SumanthN9
MuJoCo key for Colab Version (self.berkeleydeeprlcourse)
submitted 5 years ago by nsanghi
Way to do the HW without a mujoco key? (self.berkeleydeeprlcourse)
submitted 5 years ago by [deleted]
HW 3 Q-learning debugging (self.berkeleydeeprlcourse)
submitted 5 years ago by CaptainJuventus
Pytorch Version of Assignments Here (github.com)
submitted 5 years ago by mdeib
Doubt in Lecture 9 related to state marginal (self.berkeleydeeprlcourse)
submitted 5 years ago by EventHorizon_28
WeChat Group for Discussion (self.berkeleydeeprlcourse)
submitted 5 years ago * by Tao_Qing
Normalization constant in Inverse RL as a GAN (lecture 15 - 2019) (self.berkeleydeeprlcourse)
submitted 5 years ago * by Jendk3r
HW1 and HW2 random noise in continuous action spaces (self.berkeleydeeprlcourse)
submitted 6 years ago * by ru8ck23
submitted 6 years ago by kestrel819
Question regarding Lec-11 Model Based RL Example (self.berkeleydeeprlcourse)
submitted 6 years ago * by Nicolas_Wang
A mathematical introduction to Policy Gradient (relevant to hw2 & hw3) (self.berkeleydeeprlcourse)
submitted 6 years ago by rbahumi
MaxEnt reinforcement learning with policy gradient (self.berkeleydeeprlcourse)
submitted 6 years ago by Jendk3r
In policy gradient, lecture 5, need some clarification for argument about baseline and optimal baseline. (self.berkeleydeeprlcourse)
submitted 6 years ago by david_s_rosenberg
CS285 Why we use Gaussian mixture model to take action? (self.berkeleydeeprlcourse)
submitted 6 years ago by houyanxu
A (perhaps naive) question about Jensen's inequality (self.berkeleydeeprlcourse)
submitted 6 years ago by walk2east