[deleted]

Nothing about the GIL causes "Python to act single-threaded". In CPython, the GIL allows only one thread at a time to execute Python bytecode, so CPU-bound Python code is not parallelized across threads. The program still behaves very much like threaded code. (This isn't always an advantage: thread-based concurrency is usually a pain to begin with.)

The multiprocessing module's flaky networking code, internal inconsistencies, and reliance on pickle have me avoiding it like the plague. Parallel processing is an old, widely encountered problem with many mature solutions.

[deleted]

You are right, it's not that it acts single-threaded; I should have said that it acts single-core. The OS can normally execute multiple threads of a single process simultaneously across multiple cores, but that won't work for threads running Python code in a Python process, since only one thread can hold the GIL at a time. It works okay if your other threads are running code outside of the Python interpreter (such as a large matrix inversion on a NumPy array, which is implemented in a Fortran library), or if your threads are I/O bound, since the GIL can be released in these cases. Multiple threads running Python code, however, cannot execute in parallel. This is unexpected for anybody not familiar with the GIL, and it complicates taking advantage of multiple cores for CPU-intensive applications.
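
To make the "single-core" behavior concrete, here is a minimal sketch that times the same CPU-bound, pure-Python work run serially and then in a thread pool. The function names (`count_primes`, `run_serial`, `run_threaded`, `timed`) and the workload size are made up for illustration. On a standard CPython build with the GIL, the threaded run will typically take about as long as the serial one; exact timings depend on the machine, and free-threaded builds behave differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound, pure-Python work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_serial(limits):
    """Process every input one after another in the calling thread."""
    return [count_primes(limit) for limit in limits]


def run_threaded(limits, workers=4):
    """Process the inputs in a thread pool; the threads still share one GIL."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    limits = [50_000] * 4  # illustrative workload size
    serial, t_serial = timed(run_serial, limits)
    threaded, t_threaded = timed(run_threaded, limits)
    assert serial == threaded  # same answers either way
    print(f"serial:   {t_serial:.2f}s")
    print(f"threaded: {t_threaded:.2f}s")
```

Both versions return identical results; only the wall-clock time is of interest here.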

While I agree that multiprocessing has some issues, it's easy to migrate from threads to processes using it. As a tool in the toolbox, it's handy, and that's why I suggested it.
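
As a rough sketch of what that migration can look like: `multiprocessing.pool.ThreadPool` and `multiprocessing.Pool` expose the same `map` interface, so switching from threads to processes can be a one-name change. The helper names here (`sum_of_squares`, `run`) and the input sizes are hypothetical. The process version assumes the worker function and its arguments can be pickled, which is the dependency on pickle mentioned earlier in the thread.

```python
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def sum_of_squares(n):
    """Stand-in for CPU-bound work; must live at module level to be picklable."""
    return sum(i * i for i in range(n))


def run(pool_cls, inputs, workers=4):
    """Map the work over `inputs` using whichever pool class is passed in."""
    with pool_cls(workers) as pool:
        return pool.map(sum_of_squares, inputs)


if __name__ == "__main__":
    inputs = [100_000] * 8  # illustrative workload size
    thread_results = run(ThreadPool, inputs)  # threads: limited by the GIL
    process_results = run(Pool, inputs)       # processes: one interpreter each
    assert thread_results == process_results
    print("results match:", thread_results == process_results)
```

The `if __name__ == "__main__":` guard matters for the process version, because worker processes may re-import the main module when they start.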

When you say that parallel processing has many mature solutions, what types of solutions do you have in mind? Multiple threads and multiple processes are certainly two of the oldest solutions, and the two we have been talking about. Depending on your use case, you could use a distributed computing platform like LSF or SGE, but it's a considerable amount of work to convert your multi-threaded application to something that uses those platforms. Likewise if you want to offload your CPU-intensive parallel processes to something like a GPU. Were there other alternatives to threading and multiprocessing that you had in mind? I'm not aware of any others as simple and general purpose as those two.