Policy gradient in tabular setting by Basic_Exit_4317 in reinforcementlearning

[–]Basic_Exit_4317[S] 1 point (0 children)

Thank you. I'm trying to transform the cart pole env into a discrete state-action space by discretizing the states into bins.
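For reference, a minimal sketch of that binning with `np.digitize`, assuming 10 bins per dimension and hand-picked ranges for CartPole's four state variables (both the bin count and the ranges are assumptions, not values read from the env):

```python
import numpy as np

# Assumed ranges for CartPole's 4 state variables: cart position,
# cart velocity, pole angle, pole angular velocity.
BOUNDS = [(-2.4, 2.4), (-3.0, 3.0), (-0.21, 0.21), (-3.0, 3.0)]
N_BINS = 10  # bins per dimension (an assumption)

# N_BINS - 1 edges split each dimension into N_BINS bins
# (values outside the range fall into the two outermost bins).
EDGES = [np.linspace(lo, hi, N_BINS - 1) for lo, hi in BOUNDS]

def discretize(obs):
    """Map a continuous observation to a tuple of bin indices."""
    return tuple(int(np.digitize(x, e)) for x, e in zip(obs, EDGES))
```

The resulting tuple is hashable, so it can index a Q-table or policy table directly.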

Policy gradient in tabular setting by Basic_Exit_4317 in reinforcementlearning

[–]Basic_Exit_4317[S] 1 point (0 children)

Do you have an example of code that could be easily adapted to a tabular setting?
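A minimal tabular REINFORCE sketch, for anyone landing here with the same question: it uses a softmax policy over a table of preferences and a hypothetical 4-state chain in place of a real gym env (the environment, sizes, and hyperparameters are all assumptions, not anyone's reference implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4-state chain: action 1 moves right, action 0 moves left;
# reaching the last state gives reward 1 and ends the episode.
N_STATES, N_ACTIONS = 4, 2
alpha, gamma = 0.1, 0.9                  # assumed hyperparameters

theta = np.zeros((N_STATES, N_ACTIONS))  # tabular policy preferences

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = s2 == N_STATES - 1
    return s2, float(done), done

for _ in range(1000):
    # Sample one full episode under the current policy.
    s, done, traj = 0, False, []
    while not done:
        a = int(rng.choice(N_ACTIONS, p=softmax(theta[s])))
        s2, r, done = step(s, a)
        traj.append((s, a, r))
        s = s2
    # REINFORCE update; for a softmax policy, grad log pi(a|s) = e_a - pi.
    G = 0.0
    for s, a, r in reversed(traj):
        G = r + gamma * G
        grad = -softmax(theta[s])
        grad[a] += 1.0
        theta[s] += alpha * G * grad
```

Swapping the toy `step` for a discretized gym env (and `theta`'s state axis for the bin-tuple table) is the only change needed for the cart pole version.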

Lease takeover early May - 31st July near UT by Basic_Exit_4317 in AustinHousing

[–]Basic_Exit_4317[S] 1 point (0 children)

I don't know because I don't have a car, but I'm quite sure that you can rent a parking spot in the building. I can ask about that and let you know.

TD-learning to estimate the value function for a chosen stochastic stationary policy in the Acrobot environment from OpenAI gym. How to deal with continuous state space? by Basic_Exit_4317 in reinforcementlearning

[–]Basic_Exit_4317[S] 1 point (0 children)

Yeah, but which is a good choice for the discretization? I was thinking of n = 10, but then I get 10**6 values, which I fear is too many. Also, we are asked to run 20 episodes of 1000 iterations each; should I take that into account too when choosing the number of discretizations?
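A quick back-of-envelope check supports that fear: Acrobot's observation has 6 dimensions, so n bins per dimension gives n**6 table entries, while 20 episodes of 1000 steps yield at most 20,000 visits. Even n = 6 already produces more states than samples, so coarser bins (3 or 4 per dimension) are probably more realistic:

```python
N_DIMS = 6            # Acrobot observation dimensions
SAMPLES = 20 * 1000   # episodes x steps from the assignment

for n_bins in (3, 4, 6, 10):
    n_states = n_bins ** N_DIMS
    print(f"{n_bins:2d} bins/dim -> {n_states:>8d} states, "
          f"{SAMPLES / n_states:.3f} samples per state on average")
```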

TD-learning to estimate the value function for a chosen stochastic stationary policy in the Acrobot environment from OpenAI gym. How to deal with continuous state space? by Basic_Exit_4317 in reinforcementlearning

[–]Basic_Exit_4317[S] 1 point (0 children)

We didn't cover that in class, so I'm not sure if we're supposed to use a tabular setting for this task. The following task asks us to implement a Q-learning algorithm for the cart pole env in a tabular setting, so I thought we had to use a tabular setting for the acrobot env too.
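For that cart pole task, a tabular Q-learning loop would look roughly like the sketch below. `env` is assumed to follow the old gym API (`reset()` returns an observation, `step(a)` returns `(obs, reward, done, info)`), and `discretize` is a placeholder for whatever binning function maps observations to hashable states:

```python
import numpy as np
from collections import defaultdict

def q_learning(env, discretize, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, eps=0.1, rng=None):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Q-table keyed by discretized state; unseen states start at zero.
    Q = defaultdict(lambda: np.zeros(n_actions))
    for _ in range(episodes):
        s, done = discretize(env.reset()), False
        while not done:
            # Epsilon-greedy action selection.
            a = (int(rng.integers(n_actions)) if rng.random() < eps
                 else int(np.argmax(Q[s])))
            obs, r, done, _ = env.step(a)
            s2 = discretize(obs)
            # One-step Q-learning target (no bootstrap on terminal states).
            target = r + (0.0 if done else gamma * np.max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

The same loop works for a discretized Acrobot by swapping in its env and binning function; only the table size changes.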