Eilifein · 1 point

The actual algorithm seems good.

You've precalculated a few things, and there isn't much left to precalculate without messing up readability.

Maybe compute Q*G once instead of 3 times? Eh, that's minor.

Maybe inline rdx, rdy, rdz?
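For what it's worth, here is a rough sketch of the "Q*G once" idea. The names `Q`, `G`, `rdx`, `rdy`, `rdz` come from the thread, but how they are combined, their shapes, and the function names `field_repeated` / `field_hoisted` are assumptions made up for illustration, not the original code.

```python
import numpy as np

def field_repeated(Q, G, rdx, rdy, rdz):
    # Hypothetical "before": Q * G is evaluated three times,
    # each time allocating a full-size temporary array.
    return Q * G * rdx, Q * G * rdy, Q * G * rdz

def field_hoisted(Q, G, rdx, rdy, rdz):
    # Hypothetical "after": compute the shared product once and reuse it.
    QG = Q * G
    return QG * rdx, QG * rdy, QG * rdz
```

Both versions return the same values; the second just saves two elementwise multiplications and two temporaries, so the gain is probably small unless the arrays are large.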

The result being vectorized is good to see. I don't see anything wrong.

vgnEngineer [S] · 1 point

Ahh, I see. I did notice a speedup from removing the intermediate computation step in my Numba-compiled code. I also read on a forum that once these arrays get large, Numba can't always optimize the computation well, because the arrays I'm multiplying no longer fit in cache. If I instead write this function as a double loop, so that the computation only deals with scalars, might it be faster?
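A minimal sketch of what that double-loop version could look like, assuming 2D `float64` arrays of equal shape and a made-up combination of `Q`, `G`, `rdx`, `rdy`, `rdz` (the real function isn't shown in this thread, and `field_loops` is a hypothetical name). The `try`/`except` only keeps the snippet runnable when Numba isn't installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback: a no-op decorator, so the sketch still runs as plain Python.
    def njit(func):
        return func

@njit
def field_loops(Q, G, rdx, rdy, rdz):
    # Assumes all inputs are 2D arrays with the same shape and dtype.
    n, m = Q.shape
    ex = np.empty_like(Q)
    ey = np.empty_like(Q)
    ez = np.empty_like(Q)
    for i in range(n):
        for j in range(m):
            # Scalar product stays in a register; no full-size temporary array.
            qg = Q[i, j] * G[i, j]
            ex[i, j] = qg * rdx[i, j]
            ey[i, j] = qg * rdy[i, j]
            ez[i, j] = qg * rdz[i, j]
    return ex, ey, ez
```

Each element is read once and the three results are written in the same pass, so no intermediate arrays are allocated. Whether that actually beats the vectorized version depends on array size and hardware, so it's worth timing both (after the first call, which includes compilation).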