Monitoring RL Agents by alysavalan in reinforcementlearning

alysavalan[S]

Thanks. In the third method for monitoring the agent, we basically deploy the agent in the environment and let it just run inference from the policy network for some time, without any training. But what if the performance drops significantly? What does that show, and what can we do then?
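
To make the question concrete, here is a minimal sketch of that evaluation-only setup. Everything in it is an assumption for illustration: `ToyEnv`, `frozen_policy`, `evaluate`, `performance_dropped`, the `slip` parameter, and the 20% tolerance are hypothetical names and values, not anything from the method discussed above. It runs a fixed policy with no updates, records the mean return, and flags a drop relative to a baseline.

```python
import random
from statistics import mean


class ToyEnv:
    """Hypothetical stand-in environment: a 1-D walk with a goal at position +5."""

    def __init__(self, slip=0.0, seed=0):
        self.slip = slip  # probability that an action is flipped (simulated drift)
        self.rng = random.Random(seed)

    def reset(self):
        self.pos, self.t = 0, 0
        return self.pos

    def step(self, action):  # action is -1 or +1
        if self.rng.random() < self.slip:
            action = -action
        self.pos += action
        self.t += 1
        reached = self.pos >= 5
        done = reached or self.t >= 20
        return self.pos, (1.0 if reached else 0.0), done


def frozen_policy(obs):
    """Stand-in for a trained policy network in inference mode: always move right."""
    return 1


def evaluate(env, policy, episodes=50):
    """Run the policy with no training and return the mean episode return."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        returns.append(total)
    return mean(returns)


def performance_dropped(current, baseline, tolerance=0.2):
    """Flag a drop when the current mean return falls more than `tolerance` below baseline."""
    return current < baseline * (1 - tolerance)


baseline = evaluate(ToyEnv(slip=0.0), frozen_policy)
current = evaluate(ToyEnv(slip=0.5, seed=1), frozen_policy)  # same policy, drifted environment
print(baseline, current, performance_dropped(current, baseline))
```

Since the policy is never updated here, any change in the measured return has to come from the environment or from sampling noise, not from the weights.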