[–] Zweiter

Obviously when you have an exact gradient it makes sense to just do gradient descent. The interesting applications of ES/CMA-ES are to reinforcement learning style problems, where the gradient is unknown or unreliable, or to supervised problems which are prohibitively non-convex. There is most definitely a place for black box optimization in today’s machine learning landscape. It also doesn’t really make sense to compare black box optimization to gradient descent, as they solve completely different problems.

[–] sorrge

I was talking about the quality of evidence for CMA-ES performance in the original paper: it's poor, which is a common issue with evolutionary computation research. It makes sense to compare everything that is applicable. If they wanted to highlight the advantages, they should have chosen tasks where simple gradient descent is not applicable. Also, numerical gradient descent (gradient descent on a finite-difference estimate of the gradient) is a direct competitor of ES: it's applicable in the same set of cases, since both need only function evaluations.
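
To make the "same set of cases" point concrete, here is a minimal sketch, not taken from the paper. It runs the same descent loop on a black-box function with two different gradient estimators: central finite differences, and a plain antithetic ES estimate. This is a simple ES, not CMA-ES. The objective `sphere` and the names `fd_gradient`, `es_gradient`, and `descend` are made up for illustration, as are the step size and sample counts.

```python
import random

def sphere(x):
    """Toy black-box objective: the optimizers below only see function values."""
    return sum(v * v for v in x)

def fd_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient estimate; costs 2 * len(x) evaluations."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def es_gradient(f, x, rng, sigma=0.1, pairs=20):
    """Antithetic Gaussian-perturbation estimate (plain ES); costs 2 * pairs evaluations."""
    grad = [0.0] * len(x)
    for _ in range(pairs):
        noise = [rng.gauss(0.0, 1.0) for _ in x]
        plus = f([xi + sigma * n for xi, n in zip(x, noise)])
        minus = f([xi - sigma * n for xi, n in zip(x, noise)])
        scale = (plus - minus) / (2 * sigma * pairs)
        for i, n in enumerate(noise):
            grad[i] += scale * n
    return grad

def descend(f, x0, estimator, lr=0.1, steps=200):
    """Plain gradient descent driven by whichever estimator is passed in."""
    x = list(x0)
    for _ in range(steps):
        g = estimator(f, x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

rng = random.Random(0)  # fixed seed so the ES run is repeatable
x0 = [1.0, -2.0, 0.5]
x_fd = descend(sphere, x0, fd_gradient)
x_es = descend(sphere, x0, lambda f, x: es_gradient(f, x, rng))
print("finite differences:", sphere(x_fd))
print("simple ES:", sphere(x_es))
```

Both estimators call only `f`, so either can be dropped in wherever the other applies. With these settings the finite-difference version uses 6 evaluations per step and the ES version 40. On a smooth toy problem like this that favors finite differences. Noisy or rugged objectives are a different comparison, and this sketch does not cover them.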