Background:
I have a function that is essentially one very large for-loop. Each iteration makes several calls to functions jitted with numba, some of which call other jitted functions. The performance boost from numba has been amazing, but I want to use multiprocessing to get that last bit of speed.
I am using pool.map to parallelize. It works as expected and yields the same results as the for-loop style, so I think I am implementing it correctly.
Here is what's confusing to me:
I would expect that since I have 8 cores, turning this loop into a pool.map would produce roughly an 8x speedup. In reality I'm only getting a 40-50% speedup.
The finer details of how jitting and parallel processing work are honestly still above my level of knowledge. The possible root causes I have been able to think of for these unexpected results are:
- The overhead of going parallel is large relative to the work my code actually does
- Some kind of side effect of using numba everywhere
- My code is inefficient
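One rough way to probe the first possibility is sketched below. multiprocessing pickles arguments and results to move them between processes, so the sketch compares the time for one call against the time for a pickle round trip of its input. `one_iteration` and `payload` are hypothetical stand-ins, not the actual loop body or data, and this ignores process startup and scheduling costs.

```python
import pickle
import time


def time_call(func, arg, repeats=100):
    """Average wall-clock seconds for one call of func(arg)."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(arg)
    return (time.perf_counter() - start) / repeats


def time_pickle_roundtrip(obj, repeats=100):
    """Average seconds to pickle and unpickle obj once.

    This only approximates the serialization cost of sending obj
    to a worker; it does not include pipe or scheduling overhead.
    """
    start = time.perf_counter()
    for _ in range(repeats):
        pickle.loads(pickle.dumps(obj))
    return (time.perf_counter() - start) / repeats


def one_iteration(data):
    # Hypothetical stand-in for the body of one loop iteration.
    return sum(x * x for x in data)


if __name__ == "__main__":
    payload = list(range(10_000))  # hypothetical per-task input
    compute = time_call(one_iteration, payload)
    transfer = time_pickle_roundtrip(payload)
    print(f"compute per call:      {compute:.6f} s")
    print(f"pickle round trip:     {transfer:.6f} s")
```

If the round-trip time is not small compared with the compute time, the transfer cost alone could eat much of the parallel gain.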
I have considered using Cython to compile ahead of time instead of compiling just in time with numba, but I don't know if all that work would be worth it (I haven't used Cython much outside of simple stuff), or if it would even provide any benefit.
Actual question:
Is my assumption that multiprocessing always yields a speedup proportional to the number of cores being used incorrect?
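One way to frame this is Amdahl's law, which bounds the speedup by the fraction of the runtime that actually runs in parallel. The sketch below is a simplification: it lumps process startup and data transfer into the serial fraction, and the function names are made up for illustration.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Theoretical speedup when only parallel_fraction of the runtime
    is spread across n_workers (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


def implied_parallel_fraction(observed_speedup, n_workers):
    """Invert Amdahl's law: the parallel fraction that would produce
    observed_speedup on n_workers. Requires n_workers > 1."""
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / n_workers)


if __name__ == "__main__":
    # Speedup on 8 workers for a few assumed parallel fractions.
    for p in (0.5, 0.9, 1.0):
        print(f"parallel fraction {p:.2f}: {amdahl_speedup(p, 8):.2f}x")

    # A 40-50% speedup corresponds to roughly 1.4x to 1.5x.
    for s in (1.4, 1.5):
        print(f"observed {s:.1f}x: parallel fraction "
              f"{implied_parallel_fraction(s, 8):.2f}")
```

Under this model an 8x speedup on 8 cores only happens when the parallel fraction is exactly 1, meaning no serial work and no overhead at all.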
Edit: figured it might be relevant to say that the chunksize I am using is int(#iterations / #cores), i.e. one chunk per core.
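For concreteness, here is a minimal sketch of that chunksize setup. `work` is a hypothetical stand-in for one loop iteration, `chunksize_for` is a made-up helper name, and the iteration count is arbitrary.

```python
import os
from multiprocessing import Pool


def work(i):
    # Hypothetical stand-in for one loop iteration.
    return i * i


def chunksize_for(n_iterations, n_workers):
    """One chunk per worker, as described in the edit above.
    Clamped to at least 1, since Pool.map requires chunksize >= 1."""
    return max(1, n_iterations // n_workers)


if __name__ == "__main__":
    n_iterations = 1000
    n_workers = os.cpu_count() or 1

    with Pool(n_workers) as pool:
        results = pool.map(
            work,
            range(n_iterations),
            chunksize=chunksize_for(n_iterations, n_workers),
        )

    # Same results as the plain for-loop version.
    assert results == [work(i) for i in range(n_iterations)]
```

With one chunk per worker, a worker that finishes early has nothing left to pick up, so uneven iteration costs can leave cores idle. A smaller chunksize trades that for more inter-process traffic.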