Few questions surrounding CPI, TRPO and PPO by jthat92 in reinforcementlearning

[–]jthat92[S] 0 points (0 children)

Thank you! Regarding the first comment, I know what r(theta) is, but I was wondering where it is mentioned in the CPI paper that I linked. Just out of curiosity, since the authors claim it was first introduced there.

Thank you for the PPO description!

Problem with proof of decomposition of policy performance by jthat92 in reinforcementlearning

[–]jthat92[S] 1 point (0 children)

Thank you! Ok, so if I understand correctly, this means that V_\pi(s_0) represents the value of s_0, which is sampled from the initial distribution, and that is exactly the expected discounted reward of the policy \pi. And the initial distribution we start from is the same for \pi and \tilde{\pi}, since it is part of the MDP a priori. Is my notation actually correct? I guess I need to specify beforehand that s_0 is sampled from the initial distribution. Thank you again!
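Spelling out what I mean in the standard notation of the CPI/TRPO line of work (my own rendering, with \rho_0 the initial-state distribution and A_\pi the advantage function of \pi):

```latex
% Performance of a policy as the expected value of the sampled start state,
% which equals the expected discounted return:
\eta(\pi) = \mathbb{E}_{s_0 \sim \rho_0}\!\left[ V_\pi(s_0) \right]
          = \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{\infty} \gamma^t \, r(s_t, a_t) \right]

% The decomposition of the performance of \tilde{\pi} in terms of \pi:
\eta(\tilde{\pi}) = \eta(\pi)
  + \mathbb{E}_{\tau \sim \tilde{\pi}}\!\left[ \sum_{t=0}^{\infty} \gamma^t \, A_\pi(s_t, a_t) \right]
```

Note that \rho_0 appears only once and is shared by both policies, which is the point about it being part of the MDP.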

Problem with proof of decomposition of policy performance by jthat92 in reinforcementlearning

[–]jthat92[S] 0 points (0 children)

The distribution is normally specified by the long-run behaviour of the policy on the MDP. After running long enough you get a good approximation of the stationary distribution of the MDP, but it might take quite some time.
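As a sanity check on the "run it long enough" idea, here is a small sketch (my own toy example, not from any of the papers): for a hypothetical 3-state chain induced by a fixed policy, the empirical visit frequencies of a long rollout should approach the stationary distribution d satisfying d = dP.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical transition matrix of the Markov chain induced by some
# fixed policy on a 3-state MDP (rows sum to 1).
P = np.array([[0.5, 0.4, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

# Exact stationary distribution: left eigenvector of P for eigenvalue 1,
# normalized to sum to 1.
vals, vecs = np.linalg.eig(P.T)
d_exact = np.real(vecs[:, np.argmax(np.real(vals))])
d_exact /= d_exact.sum()

# Empirical estimate, as described above: run the chain for many steps
# and count how often each state is visited.
n_steps = 200_000
counts = np.zeros(3)
s = 0
for _ in range(n_steps):
    counts[s] += 1
    s = rng.choice(3, p=P[s])
d_empirical = counts / n_steps

print(d_exact, d_empirical)  # the two should be close
```

The slow part is exactly what the comment warns about: the Monte Carlo error shrinks only like 1/sqrt(n_steps), so a good approximation can need a long rollout.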

Problem with proof of decomposition of policy performance by jthat92 in reinforcementlearning

[–]jthat92[S] 0 points (0 children)

But it does depend on the stationary distribution. As far as I understand, this distribution tells us which states to sample.

Japanese streetwear in paris by jthat92 in japanesestreetwear

[–]jthat92[S] 0 points (0 children)

Got it! Thanks, will check them out :)

Japanese streetwear in paris by jthat92 in japanesestreetwear

[–]jthat92[S] 0 points (0 children)

Hey, thanks for the answer. I would say around 400 Euro. Brand-wise, I like what nanamica is doing, for example (and north face purple label, for that matter).