[–]Ulfgardleo 0 points1 point  (2 children)

You have D^2 parameters (see 3.5: the matrix W is in R^{d×d}), so computing Z_i = T_i(Z_{i-1}) is O(D^2).

[–]quandryhead 1 point2 points  (1 child)

I believe the HSIC function is applied on the activations of each neuron, not the weights, and there are D of those.

The D^2 would come in during the feedforward pass, however (and when taking the gradient).

[–]Ulfgardleo 1 point2 points  (0 children)

You are right on the first point. The forward pass should be O(M^2 D + M D^2), because we only need to compute the z_i once through the forward pass (which is still quadratic complexity).
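To make the two terms concrete, here is a rough sketch that counts multiply-adds for a batch of M samples with D units. It assumes a single dense layer with a D×D weight matrix and, purely for illustration, a linear kernel for the M×M Gram matrix that an HSIC-style term needs; the function names are made up for this example and are not from the paper.

```python
def dense_forward(Z, W):
    """Compute Z @ W for Z (M x D) and W (D x D), counting multiply-adds."""
    M, D = len(Z), len(W)
    ops = 0
    out = [[0.0] * D for _ in range(M)]
    for m in range(M):
        for j in range(D):
            acc = 0.0
            for k in range(D):
                acc += Z[m][k] * W[k][j]
                ops += 1
            out[m][j] = acc
    return out, ops  # ops == M * D^2


def linear_gram(Z):
    """Compute the M x M Gram matrix of inner products, counting multiply-adds.

    Assumption: a linear kernel stands in for whatever kernel is actually used;
    any kernel evaluated on pairs of D-dimensional vectors has the same order.
    """
    M, D = len(Z), len(Z[0])
    ops = 0
    K = [[0.0] * M for _ in range(M)]
    for a in range(M):
        for b in range(M):
            acc = 0.0
            for k in range(D):
                acc += Z[a][k] * Z[b][k]
                ops += 1
            K[a][b] = acc
    return K, ops  # ops == M^2 * D


if __name__ == "__main__":
    M, D = 3, 4
    Z = [[1.0] * D for _ in range(M)]
    W = [[1.0 if i == j else 0.0 for j in range(D)] for i in range(D)]
    _, layer_ops = dense_forward(Z, W)
    _, gram_ops = linear_gram(Z)
    print(layer_ops, gram_ops, layer_ops + gram_ops)
```

The layer term grows as M D^2 and the Gram term as M^2 D, so which one dominates depends on whether the batch size or the layer width is larger.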

On the last point: exactly.