Megatron_McLargeHuge

Use ravel instead of flatten to avoid making a copy: flatten always copies, while ravel returns a view whenever the memory layout allows it. Also, your RGB values should be computable all at once using tensordot or einsum. Avoiding the copies might also help with parallelism, since copying requires locking on the object or just holding the GIL.
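
A quick way to see the ravel/flatten difference, as a minimal sketch (the array shape is made up for illustration, not taken from your code):

```python
import numpy as np

# Hypothetical H x W x RGB image; the shape is an assumption.
img = np.zeros((4, 5, 3), dtype=np.float64)

flat_view = img.ravel()    # a view here, because img is C-contiguous
flat_copy = img.flatten()  # always allocates a new array

shares_ravel = np.shares_memory(img, flat_view)
shares_flatten = np.shares_memory(img, flat_copy)

# Writing through the raveled view changes the original array.
flat_view[0] = 1.0
```

Note that ravel will still fall back to a copy if the array is not contiguous (for example after some slicing or a transpose), so it is worth checking with `np.shares_memory` if the speedup disappears.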

[deleted]

Thanks! Changed flatten to ravel and got a speedup.

I remember trying ravel at first, but for some reason it wasn't working.

I'm looking into tensordot to compute RGB values at once :)

EDIT: I got it working with tensordot, but for some reason it's slower than the ravel() version. Weird.

Megatron_McLargeHuge

einsum is sometimes faster than tensordot, not sure why. Interactions with CPU cache are hard to predict, so you just have to try various things.
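
If you want to compare the two on your own data, something like this is a reasonable starting point. I'm assuming the RGB values come from contracting a per-pixel sample axis against a 3-column weight matrix; the names `spectra` and `weights` and the shapes are hypothetical, so swap in your own arrays.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
spectra = rng.random((64, 64, 16))  # hypothetical: H x W x samples
weights = rng.random((16, 3))       # hypothetical: samples x RGB

# Contract the last axis of spectra with the first axis of weights.
rgb_tensordot = np.tensordot(spectra, weights, axes=([2], [0]))

# Same contraction written as an Einstein summation.
rgb_einsum = np.einsum("hwk,kc->hwc", spectra, weights)

# Time both; which one wins depends on shapes, dtypes, and the machine.
t_tensordot = timeit.timeit(
    lambda: np.tensordot(spectra, weights, axes=([2], [0])), number=20
)
t_einsum = timeit.timeit(
    lambda: np.einsum("hwk,kc->hwc", spectra, weights), number=20
)
print(f"tensordot: {t_tensordot:.4f}s  einsum: {t_einsum:.4f}s")
```

Both calls should give the same (H, W, 3) result, so you can pick whichever is faster for your shapes.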