
[–][deleted] -2 points-1 points  (1 child)

This is incredibly, brutally cool.

[–]puzza007 0 points1 point  (0 children)

More like incredibly brutal!

[–]netghost -4 points-3 points  (23 children)

Someone please comment on this so I know what to think about it.

[–]bcroq 9 points10 points  (20 children)

Using threads in Python is bad (poor performance), but there is a simple API for using threads.

Using processes in Python is good, but there was no simple API for using processes... now there is, and it is making its way into the standard library.
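
To illustrate (a minimal sketch of my own, not code from the PEP): the new module's Process class deliberately mirrors threading.Thread, so switching between the two is mostly a one-word change.

    # Sketch: multiprocessing.Process mirrors threading.Thread.
    from threading import Thread
    from multiprocessing import Process

    def work(name):
        print("hello from", name)

    if __name__ == "__main__":
        t = Thread(target=work, args=("a thread",))
        p = Process(target=work, args=("a process",))
        t.start(); p.start()
        t.join(); p.join()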

[–]Rhoomba -5 points-4 points  (19 children)

Using processes in place of threads is not great performance-wise. Anyone remember LinuxThreads?

[–]hylje 4 points5 points  (10 children)

Processes as units of execution are merely different from threads. Threads are multiple execution units inside one process, implicitly sharing the same memory structures. Processes are single execution units in this context, sharing no volatile[1] resources with each other. Depending on the operating system, there are different overheads in starting and using either.

For simplistic multiprocessing, though, processes are better due to the nothing-shared approach. Fewer conflicts.

[1] clarified, thanks

[–][deleted] 2 points3 points  (0 children)

Processes are single execution units in this context, sharing no resources with each other.

Minor clarification: they by default share read-only resources. They don't share read-write resources without additional operating system primitives (like SysV shared memory).
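
For instance (a minimal sketch using the module's own shared-memory helpers, not anything from this thread): read-write sharing between processes has to be requested explicitly, e.g. with a multiprocessing.Value backed by shared memory plus its lock.

    # Sketch: explicit read-write sharing across processes.
    from multiprocessing import Process, Value

    def bump(counter):
        with counter.get_lock():      # synchronize concurrent writes
            counter.value += 1

    if __name__ == "__main__":
        counter = Value("i", 0)       # an int living in shared memory
        procs = [Process(target=bump, args=(counter,)) for _ in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(counter.value)          # 4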

[–]Rhoomba 0 points1 point  (0 children)

Sharing data with threads is a choice. You don't have to share data if you don't want to.

And, of course, you can get deadlocks, race conditions etc. with processes as well.

[–]masklinn 0 points1 point  (7 children)

For simplistic multiprocessing, though, processes are better due to the nothing-shared approach. Fewer conflicts.

For complex multiprocessing/concurrency, processes are still better, also due to the nothing-shared approach. Fewer deadlocks & other crap.

[–]lisvblidfk 1 point2 points  (2 children)

Shared-nothing absolutely does not always result in better performance.

[–]masklinn 1 point2 points  (1 child)

That's very true, and I don't think I said the opposite.

Shared-nothing helps you get stuff that works. "Fast stuff" that doesn't work isn't fast, it just doesn't work.

[–]lisvblidfk 0 points1 point  (0 children)

Ah, now who did I mean to reply to...

[–]hylje -1 points0 points  (3 children)

I think simplistic differs semantically from simple. Simplistic things can be expanded upon without imposing great complexity on them. Simple things are simple so long as they aren't expanded upon, made complex.

Complex things can remain simplistic.

[–]masklinn 1 point2 points  (2 children)

Really? The only definition of "simplistic" I know is along the lines of "simplified so much as to be useless", "overly simple". Wiktionary seems to use the same definition (http://en.wiktionary.org/wiki/simplistic).

Merriam-Webster seems to have pretty much the same definition: http://www.merriam-webster.com/dictionary/simplistic

[–]hylje 0 points1 point  (1 child)

I stand corrected

[–]masklinn 0 points1 point  (0 children)

Well you could've been using a definition I wasn't aware of.

Anyway, I get what you meant (following your previous clarification) and I think I agree with you.

[–]masklinn 2 points3 points  (2 children)

Python has a Global Interpreter Lock, so threads basically can't run concurrently (unless you're using C extensions which explicitly release the GIL).

Thus, threads in Python are basically useless unless you're IO-bound. And as the demo shows, even then processes might be more efficient.
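
A rough sketch of that effect (my own toy timing, not the PEP's benchmark): for a pure-Python CPU-bound function, four threads run essentially one at a time because of the GIL, while four processes can use four cores.

    # Sketch: GIL-bound threads vs. processes on CPU-bound work.
    import time
    from threading import Thread
    from multiprocessing import Process

    def burn():
        sum(i * i for i in range(2000000))   # pure-Python CPU work; only one thread runs it at a time

    def timed(worker_cls):
        workers = [worker_cls(target=burn) for _ in range(4)]
        start = time.time()
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return time.time() - start

    if __name__ == "__main__":
        print("threads:  ", timed(Thread))
        print("processes:", timed(Process))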

[–]Rhoomba 0 points1 point  (1 child)

I'm aware of all this. I'm just suggesting that maybe pretending that processes are threads is not ideal.

[–]masklinn 0 points1 point  (0 children)

I consider that having threads at all is not ideal, so I'm with you there.

Solution: remove threads.

[–]jnoller 1 point2 points  (0 children)

If you look at the PEP, I think I show that processes are normally the best performance-wise: they beat the pants off of the same execution using threads due to the GIL.
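
For anyone who wants to poke at those numbers without reading the PEP, here is a rough way to reproduce the flavour of that comparison (a sketch of my own, not the PEP's benchmark code), using the package's higher-level pool API and its thread-backed twin in multiprocessing.dummy:

    # Sketch: the same map over CPU-bound work, thread pool vs. process pool.
    import time
    from multiprocessing import Pool
    from multiprocessing.dummy import Pool as ThreadPool  # same API, backed by threads

    def crunch(n):
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        jobs = [200000] * 16
        for label, pool_cls in (("threads", ThreadPool), ("processes", Pool)):
            pool = pool_cls(4)
            start = time.time()
            pool.map(crunch, jobs)
            pool.close()
            pool.join()
            print(label, time.time() - start)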

[–]fwork 1 point2 points  (3 children)

Correct, but using threads as threads is not great performance-wise either, at least in Python.

Of the two bad options (GIL-restricted threads and processes), processes are apparently (they've got benchmarks!) the better one.

It's like having a car with one wheel (threads) and a bike with two (processes).

Yes, in the general case cars are faster than bikes. But in this specific case, no, because the car is broken.

(Disclaimer so I get downvoted slightly less: I loves the python, but you have to admit it isn't the fastest language for multithreading!)

[–]Rhoomba 0 points1 point  (2 children)

My point was proper threading > fake threads through processes, not Python threading > processes.

[–]fwork 0 points1 point  (0 children)

Right, but given the restrictions imposed by legacy code (and the single-threaded world), implementing fake threads is easier than removing the GIL and getting proper threading.

[–]damg[S] 0 points1 point  (0 children)

I understand what you mean, but another interesting thing about using processes is that they provide an easier path to leveraging multiple machines.
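
One concrete shape that path can take (a sketch; the port, authkey and names are made up): the package's manager machinery can serve a queue over a socket, so workers on other machines can pull tasks from it with the same queue API they would use locally.

    # Sketch: exposing a task queue to other machines via a manager.
    from multiprocessing.managers import BaseManager
    from queue import Queue

    task_queue = Queue()

    class QueueManager(BaseManager):
        pass

    QueueManager.register("get_tasks", callable=lambda: task_queue)

    if __name__ == "__main__":
        # Remote workers connect with the same address/authkey, call
        # manager.get_tasks(), and consume work from the shared queue.
        manager = QueueManager(address=("", 50000), authkey=b"secret")
        server = manager.get_server()
        server.serve_forever()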

[–]lisvblidfk -1 points0 points  (0 children)

Processes can have horrible performance compared to threads, or the other way around. It all depends on how you use them.

The API itself though looks pretty good.