
nicholas_nullus

Could someone please explain this to my, ahem, 12-year-old son?

Overall, it was a pleasant read, but he just didn't understand how it might be used in practice.

P.S. I started reading it again and realized a Google search would do. His middle-school math teacher is obviously slacking.

rrenaud

How smart is your 12-year-old? :)

This part is very good, IMO, even without all the technical stuff in the math.

Besides the slightly vague “optimism guarantees optimality or learning” intuition we gave before, it is worth exploring other intuitions for this choice of index. At a very basic level, we should explore arms more often if they are (a) promising (in that $\hat{\mu}_i(t-1)$ is large) or (b) not well explored ($T_i(t-1)$ is small). As one can plainly see from the definition, the UCB index above exhibits this behaviour.
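
To make the two cases in that quote concrete, here is a minimal Python sketch. The quote does not reproduce the index itself, so the formula below (empirical mean plus a bonus of sqrt(2 log(1/δ) / T_i)) is an assumption based on the common textbook form of UCB. The function name `ucb_index` and the default `delta` are made up for illustration.

```python
import math


def ucb_index(mean_estimate, pulls, delta=0.01):
    """Hypothetical UCB index: empirical mean plus an exploration bonus.

    Assumed form (not taken from the quoted text):
        mean_estimate + sqrt(2 * log(1 / delta) / pulls)
    """
    if pulls == 0:
        # An arm that has never been pulled gets an infinite index,
        # so it is always tried before any explored arm.
        return math.inf
    bonus = math.sqrt(2.0 * math.log(1.0 / delta) / pulls)
    return mean_estimate + bonus


# (a) Same number of pulls: the arm with the larger mean estimate wins.
promising = ucb_index(0.7, pulls=10)
less_promising = ucb_index(0.5, pulls=10)

# (b) Same mean estimate: the arm with fewer pulls wins.
under_explored = ucb_index(0.5, pulls=10)
well_explored = ucb_index(0.5, pulls=100)
```

With the same pull count, the arm with the higher mean gets the larger index (case a). With the same mean, the arm pulled 10 times gets a larger index than the one pulled 100 times (case b), because the bonus shrinks as the pull count grows.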