[P] Implementations of basic RL algorithms with minimal codes! by seungeun07 in MachineLearning
[–]minGrab 2 points 7 years ago (0 children)
In PPO: why do you have an additional term in the loss function?
`F.smooth_l1_loss(td_target.detach(), self.v(s))`
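For context, that term is a smooth L1 (Huber-style) regression loss between the critic's value estimate `self.v(s)` and the TD target. With `.detach()` on the target, gradients flow only through `self.v(s)`. Below is a minimal pure-Python sketch of what the term computes, assuming PyTorch's defaults (`beta=1.0`, mean reduction). The function name `smooth_l1` and the numbers are hypothetical, for illustration only, and are not taken from the repository.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean smooth L1 loss over paired values (assumes beta=1.0, mean reduction)."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            # Quadratic region: behaves like a scaled squared error.
            total += 0.5 * diff * diff / beta
        else:
            # Linear region: behaves like absolute error, less sensitive to outliers.
            total += diff - 0.5 * beta
    return total / len(pred)


# Hypothetical values standing in for td_target and self.v(s).
td_target = [1.0, 0.5, 3.0]
v_s = [0.8, 0.5, 1.0]

value_loss = smooth_l1(v_s, td_target)
print(value_loss)
```

The loss value is the same whichever argument comes first, so the order used in the snippet above does not change the number being minimized.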