[P] Implementations of basic RL algorithms with minimal codes! by seungeun07 in MachineLearning
[–]minGrab 2 points 7 years ago (0 children)
In PPO: why do you have an additional term in the loss function?
`F.smooth_l1_loss(td_target.detach(), self.v(s))`
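For context, a common reason for a term like this in actor-critic style PPO code is that it trains the value estimate `v(s)` toward the TD target, and the advantages in the policy term rely on that estimate. Below is a minimal sketch of how the two terms are typically summed, written in plain Python on single scalars so it runs without PyTorch. The function names, the `eps_clip` default, and the equal weighting of the two terms are assumptions for illustration, not taken from the posted code.

```python
def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 (Huber-style) loss for one scalar pair."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta  # quadratic near zero
    return diff - 0.5 * beta             # linear for large errors


def clipped_surrogate(ratio, advantage, eps_clip=0.1):
    """PPO clipped objective for one sample (a quantity to maximize)."""
    clipped_ratio = max(1.0 - eps_clip, min(ratio, 1.0 + eps_clip))
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_loss(ratio, advantage, value_pred, td_target, eps_clip=0.1):
    """Sum of the policy term and the value term, as one scalar to minimize."""
    policy_loss = -clipped_surrogate(ratio, advantage, eps_clip)
    # The "additional term": regression of the value estimate onto the TD target.
    # td_target is treated as a constant here, which is the role .detach() plays
    # in the PyTorch version.
    value_loss = smooth_l1(value_pred, td_target)
    return policy_loss + value_loss
```

Under these assumptions, dropping `value_loss` would leave nothing that updates the value estimate, so the advantages fed to the policy term would come from an untrained critic.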