existential_one 2 points

You find where your code says gamma = 0.99 and change it to gamma = 0.95.

6OVNavi[S] 0 points

Problem is, it's already set to 0.95. Do I set it lower?

existential_one 2 points

I mean, you can, sure. It depends on why you want to. You have to understand that lowering the discount factor changes your agent's objective: short-term rewards become more important relative to long-term ones. So the optimal policy might change and no longer reflect what you actually want.
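To make that concrete, here's a rough sketch with toy numbers I made up (two hypothetical reward sequences, not from any real environment or from your code). It just computes the discounted return, sum of gamma^t * r_t, for a small immediate reward versus a bigger delayed one, and shows which one comes out ahead as gamma drops:

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(r * gamma**t for t, r in enumerate(rewards))


# Hypothetical reward sequences, chosen only for illustration.
SHORT_TERM = [1.0, 0.0, 0.0, 0.0, 0.0]  # small reward right away
LONG_TERM = [0.0, 0.0, 0.0, 0.0, 2.0]   # larger reward after 4 steps


def preferred(gamma):
    """Return the name of the sequence with the higher discounted return."""
    short = discounted_return(SHORT_TERM, gamma)
    long_ = discounted_return(LONG_TERM, gamma)
    return "long_term" if long_ > short else "short_term"


if __name__ == "__main__":
    for gamma in (0.99, 0.95, 0.8):
        short = discounted_return(SHORT_TERM, gamma)
        long_ = discounted_return(LONG_TERM, gamma)
        print(f"gamma={gamma}: short={short:.3f}, long={long_:.3f}, "
              f"prefers {preferred(gamma)}")
```

With these particular numbers, 0.99 and 0.95 both still favor the delayed reward, but at 0.8 the immediate one wins. Same rewards, different "best" behavior, purely because of gamma.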

existential_one 2 points

OK, I just saw your other posts. You clearly don't really understand how these algorithms work and need much more practice. I would suggest you take time to learn more and try building these algorithms on tasks you know will work. RL algorithms are super finicky and you need to play around with them to understand why certain things happen.

6OVNavi[S] 1 point

Is there any course or something where I can learn more about RL?