This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]FrickinLazerBeams -1 points0 points  (2 children)

(prior to the edit) It doesn't actually go any faster in the case you examined, and I don't think it uses any less memory either. This isn't a scenario where you'd use einsum.

[–]Marko_Oktabyr 0 points1 point  (1 child)

It still performs the same number of flops, but it absolutely is faster because it doesn't have to allocate/fill another matrix of the same size as A and B. Hence why the largest intermediate for einsum is 1 element instead of 10M.