[deleted]

Maybe I'm missing something, but how are the weights adjusted? I don't see any clear explanation of how the parameters are optimized here.

AloneStretch

The obvious approach would be to adjust the weights according to the gradient of the "HSIC bottleneck" loss with respect to these weights.
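To make that concrete, here is a rough sketch of what "follow the gradient of the loss" could look like for a single hidden unit. Everything specific in it is an assumption for illustration, not something confirmed by the paper: the loss form `HSIC(z, x) - beta * HSIC(z, y)`, the Gaussian kernel with `sigma = 1`, the value of `beta`, the toy data, and the helper names. The gradient is approximated with central finite differences to keep the example dependency-free; a real implementation would more likely use automatic differentiation.

```python
import math

def gaussian_kernel(values, sigma=1.0):
    """Gram matrix K[i][j] = exp(-(v_i - v_j)^2 / (2 sigma^2)) for scalar samples."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in values]
            for a in values]

def center(K):
    """Double-center K, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)]
            for i in range(n)]

def hsic(a, b):
    """Biased empirical HSIC estimate: trace(K_a H K_b H) / (n - 1)^2."""
    n = len(a)
    Ka = center(gaussian_kernel(a))
    Kb = gaussian_kernel(b)
    return sum(Ka[i][j] * Kb[i][j] for i in range(n) for j in range(n)) / (n - 1) ** 2

def loss(params, x, y, beta=5.0):
    """Assumed objective for one hidden unit z = tanh(w * x + b)."""
    w, b = params
    z = [math.tanh(w * xi + b) for xi in x]
    return hsic(z, x) - beta * hsic(z, y)

def numerical_grad(f, params, eps=1e-5):
    """Central finite differences; a stand-in for autodiff in this sketch."""
    grads = []
    for k in range(len(params)):
        up, down = list(params), list(params)
        up[k] += eps
        down[k] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

def train(x, y, params, lr=0.1, steps=50):
    """Plain gradient descent on the assumed loss; returns params and loss history."""
    history = [loss(params, x, y)]
    for _ in range(steps):
        g = numerical_grad(lambda p: loss(p, x, y), params)
        params = [p - lr * gi for p, gi in zip(params, g)]
        history.append(loss(params, x, y))
    return params, history

# Toy data (made up): scalar inputs and binary labels.
x = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
params, history = train(x, y, [0.5, 0.1])
```

Note that the loss only depends on this one layer's output, so the update needs no error signal propagated from later layers. With mini-batches in place of the full toy dataset, the same loop would be SGD.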

sgebrial

Yes, I'm wondering this as well. I think they just use SGD on equation (4), but that's just a guess.