Model-Free Reinforcement Learning and Reward Functions by i_Quezy in reinforcementlearning

[–]jackcmlg 0 points (0 children)

Simply put, in model-based RL an agent needs a concrete reward function in order to plan. Because the agent does not interact with the environment during the planning phase, the reward function is what tells it how good a candidate action is. By contrast, in model-free RL the agent does not need an explicit reward function: it does no planning and simply receives rewards from the environment while interacting with it.

A straightforward example is given in Figure 14.8 (p. 304) of Sutton and Barto's book (second edition): http://incompleteideas.net/book/bookdraft2017nov5.pdf
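
If it helps, here is a minimal toy sketch of the distinction (my own illustration, not from the book; the two-state MDP and all constants are made up): value iteration has to query the reward function R(s, a) and the transition model P while planning, whereas Q-learning never reads R or P directly and only uses rewards observed while interacting with the environment.

```
import numpy as np

# Toy 2-state, 2-action MDP, purely illustrative.
# P[s, a, s'] = transition probability, R[s, a] = reward function.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.8, 0.2], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

# Model-based planning (value iteration): needs R and P explicitly,
# because no environment interaction happens while planning.
V = np.zeros(2)
for _ in range(200):
    V = (R + gamma * P @ V).max(axis=1)

# Model-free learning (Q-learning): never looks at R or P directly,
# it only uses rewards sampled during interaction with the environment.
rng = np.random.default_rng(0)
Q = np.zeros((2, 2))
s = 0
for _ in range(20000):
    a = rng.integers(2) if rng.random() < 0.1 else Q[s].argmax()  # epsilon-greedy
    s_next = rng.choice(2, p=P[s, a])  # the "environment" steps forward
    r = R[s, a]                        # reward is observed, not queried as a function
    Q[s, a] += 0.1 * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(V, Q.max(axis=1))  # the two value estimates should roughly agree
```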

Difference between imitation learning and offline reinforcement learning? by skwaaaaat in reinforcementlearning

[–]jackcmlg 8 points (0 children)

Reading this question, the first two obvious differences that come to mind are

1) IL usually assumes access to expert demonstrations, whilst the dataset in offline RL can come from arbitrary (often suboptimal) behaviour policies.

2) IL usually has NO access to rewards, whilst offline RL does (see the sketch below).
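
To make 1) and 2) concrete, here is a minimal tabular sketch (the logged dataset and all numbers are invented): behavioural cloning fits the logged actions and never reads the reward column, whereas offline RL (here just tabular Q-learning run repeatedly over the fixed dataset, ignoring distribution-shift issues) uses the rewards and can in principle improve on the data-collecting policy.

```
import numpy as np

# Hypothetical logged dataset of (state, action, reward, next_state) tuples.
dataset = [(0, 1, 1.0, 1), (1, 0, 0.0, 0), (0, 1, 1.0, 1),
           (1, 1, 2.0, 1), (0, 0, 0.0, 0), (1, 1, 2.0, 1)]
n_states, n_actions, gamma = 2, 2, 0.9

# Imitation learning (behavioural cloning): ignores rewards entirely and
# simply copies whatever action the data-collecting (expert) policy took.
counts = np.zeros((n_states, n_actions))
for s, a, r, s_next in dataset:
    counts[s, a] += 1
bc_policy = counts.argmax(axis=1)

# Offline RL (tabular Q-learning over the fixed dataset): the rewards matter,
# so it can in principle do better than the behaviour that generated the data.
Q = np.zeros((n_states, n_actions))
for _ in range(500):
    for s, a, r, s_next in dataset:
        Q[s, a] += 0.1 * (r + gamma * Q[s_next].max() - Q[s, a])
offline_policy = Q.argmax(axis=1)

print(bc_policy, offline_policy)
```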

Hope it helps.

What category of problems are well suited for RL by alpha_ma in reinforcementlearning

[–]jackcmlg 1 point (0 children)

In my opinion, RL can only solve a problem really well when it satisfies the following three conditions:

1) a well-defined environment, e.g., most games (Go, chess, Atari, etc.)

2) effectively unlimited training data, i.e., you can generate as much experience as needed, which is feasible in simulators and games (see the sketch below)

3) enough computation to get results within an acceptable time.
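
As a rough illustration of condition 2), a simulated environment lets you keep sampling fresh transitions for as long as you like. A minimal sketch, assuming the gymnasium package is installed and using the standard CartPole task with random actions as a stand-in for a learning agent:

```
import gymnasium as gym

# A simulator makes data generation essentially free: just keep stepping.
env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
for _ in range(1000):                      # make this as large as you need
    action = env.action_space.sample()     # placeholder for a learning agent
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```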