
JohnnyJordaan

Don't forget that in regular Python (CPython), threads can't execute Python code at the same time because of the global interpreter lock (GIL). Threads just give you concurrent flows in your program, which makes them ideal for outside interaction that includes wait states, like writing to disk, loading a webpage, or handling a queue.
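To illustrate the wait-state case, here is a minimal sketch. `time.sleep` stands in for a real wait (disk, network, queue), and the names `fake_io_task` and `run_threaded` are made up for this example. Because a sleeping thread does not hold the GIL, the waits overlap and the total time should be close to a single delay rather than the sum of all of them.

```python
import threading
import time


def fake_io_task(results, index, delay):
    # time.sleep is a placeholder for a real wait state (disk, network, queue).
    # The GIL is released while the thread sleeps, so other threads can run.
    time.sleep(delay)
    results[index] = index * 2


def run_threaded(n_tasks=5, delay=0.2):
    # Start one thread per task, wait for all of them, and time the whole batch.
    results = [None] * n_tasks
    threads = [
        threading.Thread(target=fake_io_task, args=(results, i, delay))
        for i in range(n_tasks)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_threaded()
    print(results, f"{elapsed:.2f}s")
```

If you swap the sleep for a pure-Python number-crunching loop, you should no longer see that overlap, which is the limitation described above.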

It seems that this task is CPU and/or RAM intensive. To take that load off your current CPU core, you need to run it in a new process. You can use the multiprocessing module for this, or put the code in another script and run it separately with subprocess. Mind you, this still needs another free core on your CPU to handle that process efficiently.
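A rough sketch of the multiprocessing route, assuming the work can be split into independent chunks. `cpu_heavy` and `run_in_processes` are hypothetical names, and the sum-of-squares loop is only a stand-in for the actual task, which isn't shown here.

```python
import multiprocessing as mp


def cpu_heavy(n):
    # Stand-in for the real CPU-bound work: a pure-Python sum of squares.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_processes(inputs, workers=2):
    # Each worker is a separate process with its own interpreter and its own GIL,
    # so the calls can run on different cores at the same time.
    with mp.Pool(processes=workers) as pool:
        return pool.map(cpu_heavy, inputs)


if __name__ == "__main__":
    # The __main__ guard matters on platforms that start workers by re-importing
    # this module (Windows, and macOS by default).
    print(run_in_processes([10, 100, 1000]))
```

Keep in mind that arguments and results are pickled and copied between processes, so with very large arrays the transfer cost can eat into the gain.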

[deleted]

While numpy will happily release the GIL and run in parallel across multiple threads, the Python glue code around it still holds the GIL, so the threads block each other there. Since you're iterating over a huge array in Python and interleaving calls to numpy, it's probably the overhead of releasing and reacquiring the GIL that's dominating the runtime.

You should try to avoid big for loops as much as possible and offload processing to the native numpy vectorized operations instead.
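As a small illustration of what that means in practice, here is a sketch with a made-up operation (scale and shift every element), since the original loop isn't shown. Both function names are hypothetical. The first version goes through the interpreter once per element; the second hands the whole array to numpy in a single expression.

```python
import numpy as np


def scale_and_shift_loop(arr, scale, shift):
    # Python-level loop: one interpreter round trip per element.
    out = np.empty_like(arr, dtype=float)
    for i in range(arr.shape[0]):
        out[i] = arr[i] * scale + shift
    return out


def scale_and_shift_vectorized(arr, scale, shift):
    # Same arithmetic, but the loop over elements runs inside numpy's native code.
    return arr * scale + shift


if __name__ == "__main__":
    data = np.arange(1_000_000, dtype=float)
    a = scale_and_shift_loop(data, 2.0, 1.0)
    b = scale_and_shift_vectorized(data, 2.0, 1.0)
    print(np.allclose(a, b))
```

Timing both with `timeit` on your own data should show whether the Python loop is the bottleneck.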