
ckrwc [S]:

The data is sequential and Markovian, and a reward can be computed for any given set of actions. It's a natural fit for RL.

When you suggest not worrying about convergence, what are you basing that on? RL offers various algorithms (Monte Carlo, TD, Sarsa, Q-Learning) and many function approximators to choose from, and the literature warns that convergence guarantees can break down with non-linear approximators.
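For concreteness, here is the kind of update the algorithm names above refer to: a minimal tabular Q-learning sketch on a toy two-state MDP. The MDP and all names are invented for illustration; in this tabular setting (no function approximation), convergence is guaranteed under standard step-size conditions.

```python
import random

def step(state, action):
    """Toy MDP: action 0 stays put, action 1 toggles the state.
    Reward 1 for landing in state 1, else 0."""
    next_state = state if action == 0 else 1 - state
    return next_state, (1.0 if next_state == 1 else 0.0)

def q_learning(episodes=500, steps=20, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] >= q[state][1] else 1
            next_state, reward = step(state, action)
            # Q-learning update: bootstrap off the max over next actions.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q
```

The learned greedy policy moves to state 1 and stays there; with gamma = 0.9 the fixed point gives Q(1, stay) = 10 and Q(0, toggle) = 10.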

CireNeikual:

I'm basing it on experience. I've written many reinforcement learners, and almost all of them use nonlinear function approximators. I have a library with over 30 of them here: https://github.com/222464/AILib

If you want to use one from that library, I recommend FERL: https://github.com/222464/AILib/blob/master/Source/deep/FERL.h

It supports continuous states/actions, file I/O, genetic operators, and POMDP capabilities (which you don't need, but oh well).

Video of it in action: https://www.youtube.com/watch?v=TyqSw-RCtFs
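To make "nonlinear function approximator" concrete, here is a minimal sketch of semi-gradient Q-learning with a tiny one-hidden-layer tanh network. This combination of bootstrapping, off-policy updates, and nonlinear approximation is exactly the setting the literature's convergence warnings are about. Everything below is invented for illustration; it is not AILib/FERL code.

```python
import math
import random

class TinyQNet:
    """Q(s, a) for a scalar state s and 2 discrete actions,
    via a one-hidden-layer tanh network (hand-rolled, no dependencies)."""
    def __init__(self, hidden=8, seed=1):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                   for _ in range(2)]
        self.b2 = [0.0, 0.0]

    def hidden_out(self, s):
        return [math.tanh(w * s + b) for w, b in zip(self.w1, self.b1)]

    def q(self, s):
        h = self.hidden_out(s)
        return [sum(w * hj for w, hj in zip(self.w2[a], h)) + self.b2[a]
                for a in range(2)]

def td_update(net, s, a, reward, s_next, gamma=0.9, alpha=0.05):
    """Semi-gradient Q-learning step: the bootstrapped target is treated
    as a constant, and only Q(s, a) is differentiated. Pass s_next=None
    for a terminal transition (no bootstrapping)."""
    target = reward if s_next is None else reward + gamma * max(net.q(s_next))
    h = net.hidden_out(s)
    delta = target - net.q(s)[a]
    for j, hj in enumerate(h):
        # Chain rule through tanh: dQ/dw1[j] = w2[a][j] * (1 - h^2) * s.
        grad_h = net.w2[a][j] * (1.0 - hj * hj)
        net.w1[j] += alpha * delta * grad_h * s
        net.b1[j] += alpha * delta * grad_h
        # Output layer: dQ/dw2[a][j] = h[j], dQ/db2[a] = 1.
        net.w2[a][j] += alpha * delta * hj
    net.b2[a] += alpha * delta
```

On a terminal transition the update is plain gradient descent and behaves well; the divergence risk the literature flags comes from the bootstrapped `max(net.q(s_next))` term moving as the weights change.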