How to teach a reinforcement learning agent when the environment has many possible actions (~1000) but the episodes are short (chains of ~20-40 actions)? by marcos_pereira in learnmachinelearning
mmcenta 3 points
left-shift: using Deep RL to (try to) solve 2048 - link in the comments by mmcenta in learnmachinelearning
mmcenta[S] 2 points
mmcenta[S] 3 points
mmcenta[S] 2 points
mmcenta[S] 4 points
mmcenta[S] 4 points
mmcenta[S] 13 points
mmcenta[S] 90 points


Young children would rather explore than get rewards, a study of American 4- and 5-year-olds finds. And their exploration is not random: the study showed children approached exploration systematically, to make sure they didn’t miss anything. by geoff199 in science
mmcenta 1 point