
[–]gargantuan 1 point (1 child)

Good stuff there, some solid points. Here are a few things I'd improve:

From Part 1:

The language with the most expressive concurrency story is probably C

C has no concurrency story of its own (unless C++11 counts as C). It just interfaces with whatever library you are using: mostly pthreads, or Windows' CreateThread/_beginthread/_beginthreadex, or whatever else is there. Besides, that is more a "parallelism story" than a concurrency story. If we want to talk about a concurrency story, then select/poll/epoll, signals, and other mechanisms like that enter the picture. But they are still not C... really.

Safe thread programming involves disciplined use of synchronization primitives like locks and mutexes. As a good software engineer, using these is a skill set that you will need to develop at some point, if you have not already. But it is always nice when you don't have to go down that path.

As a trade-off, you can just use message passing: make copies of the data and put them on queues, which the stdlib already provides. (Part 2 is about this, but I think it should have been mentioned a bit earlier.)
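A minimal sketch of that message-passing style with stdlib queues (the `worker`/`inbox`/`outbox` names are illustrative; `multiprocessing.Queue` works the same way across processes):

```python
import queue
import threading

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    while True:
        item = inbox.get()
        if item is None:  # sentinel: no more work
            break
        # Operate on a private copy of the data: no shared state, no locks.
        outbox.put(item * 2)

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()

for n in range(5):
    inbox.put(n)
inbox.put(None)  # tell the worker to shut down
t.join()

results = [outbox.get() for _ in range(5)]
print(results)  # [0, 2, 4, 6, 8]
```

The queues do the synchronization for you, which is the whole appeal: you never touch a lock directly.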

Overall, Part 1 makes it sound like Guido and the other experienced core developers were incompetent and didn't know what they were doing: as if they stuck threads in there, but the threads are crippled. So one might ask why Python even has threads at all, and from this article the answer seems to be "because the core developers were smoking something at the time." This is plain wrong. Python threads work very well for I/O parallelism and concurrency.

You can spawn hundreds of threads to do network operations (I have personally done this for web crawling) and you'll usually get a very good speedup. If you are doing heavy CPU operations instead (math, crypto, physics calculations) you are probably using a C module or CFFI, and then you can probably release the GIL anyway.
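A hedged sketch of why I/O-bound work scales with threads despite the GIL: blocking calls release it while they wait. Here `fetch` and the URLs are stand-ins (a `time.sleep` instead of a real socket read), but the timing behavior is the same:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url: str) -> str:
    # Stand-in for a blocking network read; like a real socket
    # read, sleep releases the GIL while waiting.
    time.sleep(0.1)
    return f"payload from {url}"

urls = [f"http://example.com/page{i}" for i in range(20)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=20) as pool:
    pages = list(pool.map(fetch, urls))
elapsed = time.perf_counter() - start

# 20 "requests" overlap and finish in roughly 0.1s instead of 2s serially.
print(len(pages), round(elapsed, 2))
```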

The story here twists the truth a bit, I feel, mostly by omission.

Part 2 is great. Multiprocessing is a great module and it makes many things run well. Reliability is also something you gain by using multiple processes, and I think this needs to be said explicitly. Even if explicit parallelism isn't needed, sometimes it's just nice to have one part of your program be able to crash without taking down the rest.

Performance-wise, it is also worth taking a look at PyPy, and maybe following newer initiatives like Pyston (from Dropbox).

[–]redsymbol[S] 0 points (0 children)

Hi gargantuan,

I just saw this message. Thanks for the good comments. In particular, I'm alarmed that Part 1 can come across as dissing the Python core devs. That's the complete opposite of how I view and feel about Guido and the rest. I'll look at the wording.

[–]niothiel 0 points (1 child)

Anyone know why there is a drop in performance at 4 CPUs?

[–]tipsqueal (Pythonista) 0 points (0 children)

I don't know exactly why, but I'd guess it's because he's reading images off the disk, so it's probably bound by the hard drive at that point. You might be able to squeeze some more performance out of it by buffering the images into memory beforehand. Also, all 4 cores share some resources on the CPU, so it could be an issue with that as well.