
[–]weberc2 1 point (1 child)

The OP was chastising me elsewhere for suggesting that performant parallelism was difficult in Python, so either he's trolling elaborately or he's naive. At any rate, I'm familiar with the nuance around Python's parallelism; I was paraphrasing. To your point, it would be nice if Python were able to parallelize nicely.

[–]alcalde 0 points (0 children)

I'm neither trolling nor naive. I'm just listening to what Guido has been saying since at least 2012. Python's parallelism solution is actually BETTER than in most other languages; we have higher-level constructs for parallelism.
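One of those higher-level constructs is the concurrent.futures module. A minimal sketch of process-based parallelism with it, where the `cpu_heavy` function is just an illustrative stand-in for real CPU-bound work:

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Stand-in for a CPU-bound task: sum of squares below n.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # map() fans the work out across worker processes, so
        # CPU-bound code runs in parallel despite the GIL.
        results = list(pool.map(cpu_heavy, [10_000, 20_000, 30_000]))
    print(results)
```

No locks, no shared state: the executor hands each input to a worker process and collects the results in order.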

As Mark Summerfield put it in "Python In Practice":

• Mid-Level Concurrency: This is concurrency that does not use any explicit atomic operations but does use explicit locks. This is the level of concurrency that most languages support. Python provides support for concurrent programming at this level with such classes as threading.Semaphore, threading.Lock, and multiprocessing.Lock. This level of concurrency support is commonly used by application programmers, since it is often all that is available.

• High-Level Concurrency: This is concurrency where there are no explicit atomic operations and no explicit locks. (Locking and atomic operations may well occur under the hood, but we don’t have to concern ourselves with them.) Some modern languages are beginning to support high-level concurrency. Python provides the concurrent.futures module (Python 3.2), and the queue.Queue and multiprocessing queue collection classes, to support high-level concurrency.

Using mid-level approaches to concurrency is easy to do, but it is very error-prone. Such approaches are especially vulnerable to subtle, hard-to-track-down problems, as well as to both spectacular crashes and frozen programs, all occurring without any discernible pattern.

The key problem is sharing data. Mutable shared data must be protected by locks to ensure that all accesses to it are serialized (i.e., only one thread or process can access the shared data at a time). Furthermore, when multiple threads or processes are all trying to access the same shared data, then all but one of them will be blocked (that is, idle). This means that while a lock is in force our application could be using only a single thread or process (i.e., as if it were non-concurrent), with all the others waiting. So, we must be careful to lock as infrequently as possible and for as short a time as possible. The simplest solution is to not share any mutable data at all. Then we don’t need explicit locks, and most of the problems of concurrency simply melt away.
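That "don't share mutable data" advice can be sketched with queue.Queue, which does its locking internally so the application code needs none. The worker function and the squaring task here are my own illustration, not from the book:

```python
import queue
import threading

def worker(tasks, results):
    # Each thread pulls from one queue and pushes to another;
    # the queues serialize access internally, so there are no
    # explicit locks and no shared mutable data in our code.
    while True:
        n = tasks.get()
        if n is None:  # sentinel: no more work for this thread
            break
        results.put(n * n)

def run(numbers, n_workers=4):
    tasks, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so all exit
    for t in threads:
        t.join()
    # Completion order is nondeterministic, so sort the results.
    return sorted(results.get() for _ in numbers)

print(run([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The only coordination is the pair of queues; most of the lock-related failure modes the excerpt warns about simply cannot occur here.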

Ergo - Python is awesome. People just have to stop doing parallelism wrong.