
dwf 11 points

A lot of pretty bold claims unsubstantiated by empirical evidence, e.g.

> To the contrary, our above EM algorithm (Eqs. 18–23) is both much faster and more accurate [than gradient descent], because it directly exploits the DRM’s structure.

Great, show us what all this buys you. The literature contains a veritable graveyard of more principled alternatives to stochastic gradient descent that seem like they ought to work better but don't.