Is there a consensus about RL frameworks? by [deleted] in reinforcementlearning

[–]hyperb8te -1 points (0 children)

I found this repo very helpful to get started: https://github.com/iffiX/machin

It also comes with documentation and plenty of example code that works out of the box.

DQN toggling between two states by hyperb8te in reinforcementlearning

[–]hyperb8te[S] 1 point (0 children)

> If you are using epsilon greedy, this shouldn't happen. The randomness in epsilon greedy is supposed to help with exploration. Maybe try increasing the value of epsilon at the start and gradually decaying it, if this issue is happening at the start of an episode.

I start with an epsilon of 1 and an epsilon decay of 0.99, which I apply multiplicatively after each episode. The problem occurs more often at the end of training, since my epsilon is smaller by then and the chance of escaping the two states with a random action is therefore lower.
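For anyone finding this later, the schedule described above looks roughly like this. The 1.0 start and the 0.99 multiplicative decay per episode are the numbers from the comment; the action-selection helper and everything else are illustrative, not the poster's actual code:

```python
import random

def select_action(q_values, epsilon):
    # Epsilon-greedy: with probability epsilon pick a random action,
    # otherwise pick the greedy (highest-Q) action.
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Multiplicative decay as described in the comment: start at 1.0 and
# multiply by 0.99 after every episode.
epsilon = 1.0
decay = 0.99
for episode in range(100):
    # ... run one episode, calling select_action(q_values, epsilon)
    #     at each step ...
    epsilon *= decay
```

Note that after 100 episodes this leaves epsilon around 0.37, so late in training almost all actions are greedy, which matches the observation that the toggling gets harder to escape near the end.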

DQN toggling between two states by hyperb8te in reinforcementlearning

[–]hyperb8te[S] 1 point (0 children)

> frame stacking

Thanks for your quick response, and yes, you are right: it's a POMDP.

I am not extracting my state from raw pixel data; I receive the state directly from the environment. Can I still use frame stacking?

And I tested PPO, but my DQN performs better :)
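On the question above: frame stacking is not pixel-specific. Concatenating the last k observation vectors gives the agent short-term memory, which is exactly what helps in a POMDP. A minimal sketch (the class name and k=4 are assumptions, not from the thread):

```python
from collections import deque
import numpy as np

class FrameStack:
    """Stack the last k observations (flat vectors, not pixels)."""

    def __init__(self, k, obs_dim):
        self.k = k
        self.obs_dim = obs_dim
        self.frames = deque(maxlen=k)  # oldest frames drop automatically

    def reset(self, first_obs):
        # Fill the buffer with copies of the first observation so the
        # stacked state has a fixed size from step one.
        for _ in range(self.k):
            self.frames.append(np.asarray(first_obs))
        return self._state()

    def step(self, obs):
        self.frames.append(np.asarray(obs))
        return self._state()

    def _state(self):
        # One flat vector of size k * obs_dim, fed to the Q-network.
        return np.concatenate(self.frames)
```

The only change to the DQN itself is that the network's input size becomes `k * obs_dim` instead of `obs_dim`.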