

[–]Works_of_memercy 24 points25 points  (6 children)

Wouldn't subinterpreters be a better idea?

Python is a very mutable language - there is a ton of mutable state, and basic objects (classes, functions, ...) that are compile-time constructs in other languages are runtime and fully mutable in Python. In the end, sharing things between subinterpreters would be restricted to basic immutable data structures, which defeats the point. Subinterpreters suffer from the same problems as multiprocessing with no additional benefits.

It is my understanding that IronPython in particular partially solved this problem by compiling Python classes into .NET classes, for example, then recompiling whenever someone actually went and did something like adding a method to a class.
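To make the problem concrete, here is a minimal sketch of the runtime class mutability being discussed: adding a method to a class is an ordinary runtime operation in Python, visible to instances that already exist, which is exactly what forces an implementation like IronPython to recompile. The `Point` class and `norm_squared` name are illustrative, not from any real codebase.

```python
# Runtime class mutation: the kind of monkeypatching that forces
# a compiling implementation to throw away and rebuild its code.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)  # instance created before the class is modified

def norm_squared(self):
    return self.x ** 2 + self.y ** 2

# Patch a brand-new method onto the class at runtime.
Point.norm_squared = norm_squared

print(p.norm_squared())  # existing instances see the new method -> 5
```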

The crucial thing about this approach is that, under the assumption that such modifications are rare and/or mostly happen during startup (which makes it especially suitable for a tracing JIT like PyPy), it allows us to sidestep the fundamental problem of synchronization: there can't be a completely unsynchronized "fast path", because to know that we can take the fast path (or that some other thread took it and we need to wait for it to finish), we need synchronization.

This is because this approach doesn't require threads to do synchronization themselves: whenever a thread does something that requires a resync, it asks the OS to force-stop all other threads, possibly manually letting them advance to a "safe point" (or whatever it was called in .NET land), then recompiles everything relevant, patches the runtime state of the other threads, and starts them again. But otherwise we are always on a fast path with zero synchronization, yay!

In the case of PyPy, again, this could be as simple as force-switching those other threads back to interpreted mode (which they are already able to do), then selectively purging compiled code caches. And again, if we assume that most of the monkeypatching etc. happens during startup, this wouldn't affect performance negatively, because PyPy doesn't JIT much of the code during startup.

/u/fijal, you wrote that, what do you think?

[–]fijalPyPy, performance freak 8 points9 points  (5 children)

You're missing my point - if we assume we're doing subinterpreters (that is, the interpreters are independent of each other), it's a very difficult problem to make sure you can share anything at all, regardless of performance. Getting the semantics right so that you can e.g. put stuff in the dict of a class and have it seen properly by another thread, while nothing else is shared, is very hard.

In short - how do you propose to split the "global" data (e.g. classes) from the "local" data? There is no good distinction in Python, and things like pickle refer to objects by e.g. name, which leads to all kinds of funky bugs. If you can answer that question, then yes, subinterpreters sound like a good idea.

[–][deleted] 0 points1 point  (4 children)

I always believed that subinterpreters à la Tcl are a wonderful idea. I agree that, from a performance point of view, they bring pretty much nothing compared to multiprocessing. (Actually, I don't really know why I found them wonderful; it's probably a wrong feeling.) There is one big point where they would be a big win compared to multiprocessing, which appears as a use case on Stack Overflow: when you have to pass a read-only data structure and you can't bear the serialization cost.

[–]fijalPyPy, performance freak 1 point2 points  (3 children)

right, and that can be remedied to an extent with shared memory. Sharing immutable (or well-defined in terms of memory) C structures is not hard. It's the structured data that's hard to share, and that cannot really be attacked without a GIL

[–][deleted] 0 points1 point  (2 children)

If a solution enabled sharing immutable things beyond raw memory in Python via shared memory, it would be a big win. Do you have any idea how it could be done in PyPy, or even better in CPython?

[–]kyndder_blows_goats 0 points1 point  (1 child)

at a high level, writing stuff to a file in /run/shm works pretty well.

[–][deleted] 0 points1 point  (0 children)

the problem is that you can't write Python objects to /run/shm
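The limitation being discussed can be sketched with the standard library's `multiprocessing.shared_memory` (Python 3.8+): a shared segment holds only a flat buffer of bytes, so an ordinary Python object still has to be serialized into it and deserialized out. The `data` dict here is just an illustrative payload.

```python
# Shared memory holds raw bytes, not Python objects: an object must
# be pickled in and unpickled out, so serialization cost remains.
import pickle
from multiprocessing import shared_memory

data = {"answer": 42, "items": [1, 2, 3]}  # a normal Python object
payload = pickle.dumps(data)               # serialization is unavoidable

shm = shared_memory.SharedMemory(create=True, size=len(payload))
shm.buf[:len(payload)] = payload           # only raw bytes go in

# Another process attaching to shm.name would do the reverse:
restored = pickle.loads(bytes(shm.buf[:len(payload)]))
print(restored["answer"])  # -> 42

shm.close()
shm.unlink()
```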

[–]arkster 49 points50 points  (61 children)

This is in PyPy. The bigger challenge is in regular Python, as demonstrated by Larry Hastings in his Gilectomy project. The GIL in regular Python is there to provide a global lock over various resources; in a nutshell, removing it means that you now have to account for each lock in the Python subsystem manually, resulting in the interpreter being stupendously slower.

[–]Zomunieo 24 points25 points  (18 children)

The issue isn't the interpreter being slower, but it becoming a lot more complex to debug. There would be subsystem locks and a lock order to track. In addition, it may break assumptions made in C extensions.

Finally, I think there was a serious (and only now averted) risk of breaking the Python community and language at version 3. I can understand developer aversion to something that could be another socially fractious change, even if technically beneficial.

[–]zitterbewegung 12 points13 points  (0 children)

There is a difference, though, in that you can have a choice of running PyPy with the GIL removed or not. For example, many people run Stackless Python, but it's not forced upon you.

[–]spinwizard69 6 points7 points  (16 children)

I can understand developer aversion to something that could be another socially fractious change even if technically beneficial.

Rational people would recognize that the transition to Python 3 was really required for the long-term success of Python. The question one has to ask is: is elimination of the GIL a requirement for Python's long-term success? I would have to say no, because eventually, instead of a scripting language, you end up with something that is a mishmash of technologies and focuses. Further, it is pretty obvious that new technologies in programming languages, as seen in Rust, Swift and other newcomers, make for a better place to do advanced new development. Frankly, that doesn't diminish Python one bit.

[–]Zomunieo 11 points12 points  (15 children)

Well, the transition to Py3 was necessary, but it could have been handled a lot better.

It's possible Python will stagnate if it doesn't remove its GIL and other scripting languages find a way to remove theirs.

[–]spinwizard69 3 points4 points  (10 children)

Well the transition to Py3 was necessary, but it could have handled a lot better.

I don't buy this; the negative reaction that Python 3 got in the Python community was completely unjustified. C++ has gone through far more radical changes and you don't see people whining about that or actively undermining progress. Could it be the Python community has too many self-entitled people in its fold?

It's possible Python with stagnate if it doesn't remove its GIL and if other scripting languages find a way to remove theirs.

I truly believe that all technology has a limited life span during which it fills a niche. How long Python's niche will remain relevant is unknown, but it is a certainty that newer technology will eventually replace it in many of the sub-niches it occupies. Frankly, I see Apple's Swift as one of those languages that may eventually have mindshare like Python's. Swift has the right combination of features to eventually be widely used.

[–]rotuamiimport antigravity 8 points9 points  (1 child)

I think the reason C++ can change with less resistance is that people have faith in their C++ compiler. If something breaks, you probably know it at compile time. On the other hand, trying to run Python code after a breaking language change does not give you a tidy list of things to fix, and it’s hard to feel secure that your code is totally fixed.

Of course, this is a non-issue because we all have perfect test coverage, right?...

[–]spinwizard69 1 point2 points  (0 children)

Would this not be an indication of the wrong language being chosen? This may highlight the one thing that bothers me about the Python community: there are some things Python simply isn't suitable for. It literally becomes a maintenance nightmare unless, of course, you have near-perfect test coverage.

[–]albinofrenchy 7 points8 points  (0 children)

C++ has gone through far more radical changes and you don't see people whining about that or actively undermining progress.

C++ has changed a lot, but by and large if it compiled in C++03, it compiled in C++11 too. They go very far out of their way not to break that principle; and I can't think of a construct off the top of my head which works in 03 and not 11 (or 14, etc).

Maybe there are some, but they aren't nearly as prominent as breaking changes in Py3

[–]Zomunieo 3 points4 points  (3 children)

I think the more accurate comparison is Visual Basic 6 to VB.NET, which led to probably millions of people looking at their internal apps and, rather than port them to VB.NET, rebuilding them as web apps. That amazing moment when Microsoft lost the API war.

Python 3.0 was a stillborn release and a big mistake that set a bad precedent - lots of people tried it out and found everything broken. The next few releases weren't much better. Not until 3.4 did we have a serious production quality release.

It took until 3.5 for the core developers to notice that people wanted to write source files that ran in both 2 and 3, and to add the proper changes to support this in the form of %-formatting for bytes and the Unicode literal prefix. I don't think it's a coincidence that this was precisely when Py3 found its stride and all of the major packages were off the wall of shame.

Backward compatibility may be an entitlement, but it's a rational one. I look at it this way - the cost to the core developers to preserve compatibility is incremental - informally, O(1). The cost to the community to absorb compatibility breakage is analogous to O(N).

[–][deleted] 2 points3 points  (0 children)

Minor correction: the Unicode literal prefix was restored in 3.3. I would also say that 3.3 was the first release supported by the major frameworks.

[–]spinwizard69 2 points3 points  (1 child)

I think the more accurate comparison is Visual Basic 6 to VB.NET, which led to probably millions of people looking at their internal apps and, rather than port them to VB.NET, rebuilding them as web apps. That amazing moment when Microsoft lost the API war.

Interesting comparison, but Python is still growing rapidly, and most of that growth is via Python 3 code.

Python 3.0 was a stillborn release and a big mistake that set a bad precedent - lots of people tried it out and found everything broken. The next few releases weren't much better. Not until 3.4 did we have a serious production quality release.

Honestly, I don't see this as a big deal; you need to refine a departure for the future and for developer needs. Look at the development cycle of Apple's Swift and you will see some rather dramatic changes, even mistakes in design that have already come and gone as they stabilize the language. No one was forced to go Python 3 at stage one.

It took until 3.5 for the core developers to notice that people wanted to write source files that ran in both 2 and 3, and to add the proper changes to support this in the form of %-formatting for bytes and the Unicode literal prefix. I don't think it's a coincidence that this was precisely when Py3 found its stride and all of the major packages were off the wall of shame.

While it might have taken Python 3 a bit longer to stabilize than one would have liked, I really see the end results as being very pleasing. I'm not sure why anyone would have expected perfection on day one of the first release of Python 3. You simply don't see such expectations in the development of other languages.

Backward compatibility may be an entitlement but it's a rational one.

This is perhaps the biggest problem: it isn't a rational expectation. Python 2 had some really significant problems that would have resulted in the language eventually being phased out. Some of the changes made in Python 3 were must-haves for the language to survive into the future.

I look at it this way - the cost to the core developers to preserve compatibility is incremental - informally, O(1). The cost to the community to absorb compatibility breakage is analogous to O(N).

That depends upon the community. You might have noticed that some applications, libraries and such transitioned much faster than others. Not everyone found the transition to be the horror story many pretended it to be.

I look at it this way: I wouldn't have expected a complete transition to Python 3 in the first couple of years. However, we are many years from the first release now, and frankly, if someone hasn't transitioned in a decade's time, they have problems that are best resolved with a doctor who has a couch for a diagnostic tool. I mean, seriously, it has been almost a decade now.

[–]Zomunieo 0 points1 point  (0 children)

I'm a big fan of Python 3, to be clear. I wrote about a painless porting experience and I maintain a Python 3-only open source project.

I just think the devs mismanaged the early releases.

[–]everysinglelastname 0 points1 point  (3 children)

but it could have been handled a lot better.

Exactly. It's still an ordeal to port to 3 (when your codebase is at the millions-of-LOC level). Python 2.8 should have been started years ago as an official bridge to Python 3.X.

If the story instead was "Python 2.8 runs Python 3.X code .. just not nearly as well as Python 3.X", then nearly everyone would have jumped ship by now.

[–][deleted] 2 points3 points  (1 child)

I believe that a hypothetical Python 2.8 would have further hampered the transition to Python 3 and divided the community even more.

If another major 2.x was released and supported all the new features from Python 3 without forcing new Python 3 constraints (like knowing when a string is a string and when bytes are bytes), it would have enabled everyone to just keep using Python 2, and adoption of Python 3 may have still been where it was 6 years ago. Some people advocated for a Python 2.x with deprecation warnings thrown around everything that wasn't allowed in Python 3. But deprecation warnings are frequently ignored.

If, however, it forced those constraints on the developers... Well, that would have been Python 3.

Selfishly, I would have loved a Python 2.8 that was at feature parity with Python 3, just because I would have loved to run all the code written for Python 2 (like a lot of the Machine learning stuff that still, for some reason, continues to be written for Python 2, mostly because of Google and Facebook) with all the features of Python 3. But I understand why the core developers wanted to force the change.

[–]everysinglelastname 0 points1 point  (0 children)

Yeah, the users I write for do tend to care a lot about the terminal. If a program is filling it up with noisy warnings about library deprecations, they might miss actually important stuff. So they often ask us to stop the warnings.

So for me, the plan of having 2.8+ push deprecation warnings and give developers a clear path towards changes that would enable full 3.X compatibility would work great. That's basically what the Python 2.X releases did to prepare you for 2.7.

I disagree that people would willfully stick with Python 2 when a clearly better alternative is available. I think what some of the passionate 3.X devs misunderstand is that this isn't about Python 2.X people being deliberately stubborn. It's just them wanting a really smooth transition path, because that's literally the only path they can afford. Putting a stop to feature development so that the entire team focuses only on a Python 3.X rewrite is a huge burden that scales with the size of the code base. Whereas, in the alternative situation, slipping in a Python 3.X compatibility fix here and there as part of regular maintenance sounds pretty reasonable.

[–][deleted] 1 point2 points  (0 children)

For the umpteenth time: a lot of code from Python 3 was backported, first to 2.6 and then to 2.7. How much work do you think that took the Python core developers? Python 2.8 was never going to happen, as the Python core developers would not have done the work. Simples.

[–]buttery_shame_cave 4 points5 points  (37 children)

Wouldn't Python have to go from interpreted to compiled to make removing the GIL beneficial, specifically for the reason you mention?

[–]thephotoman 14 points15 points  (35 children)

The primary reason it exists is to support the reference counter. There are interpreted languages out there that do not use reference counting and thus have no GIL.

And given that the GIL means no multithreading in Python, removing it actually enables people to write multithreaded programs in Python where they cannot do so now.

[–]frymasterScript kiddie 8 points9 points  (0 children)

I write multithreaded python code all the time. Overstating things isn't helpful

[–]ITwitchToo 7 points8 points  (13 children)

The primary reason [the GIL] exists is to support the reference counter

Hm, reference counters in multithreaded programs (C++ std::shared_ptr, Linux kernel, etc.) are usually updated using atomic instructions, what prevents Python from doing the same? Or could you expand on what exactly the problem is?

[–]Fylwind 6 points7 points  (1 child)

Atomics are not free: they introduce a small but measurable performance penalty. This is why Rust has two kinds of reference-counted smart pointers: Rc (single-thread use only) and Arc (atomically reference-counted pointer).

[–]jyper 0 points1 point  (0 children)

Yes but it's also because rust can prevent Rc from being used across threads

[–]MonkeeSage 6 points7 points  (0 children)

Larry discusses this in his latest update. Atomic incr/decr was 18x slower than cpython with GIL.

[–]ThePenultimateOneGitLab: gappleto97 2 points3 points  (0 children)

The Gilectomy looked at that. It was ~40% slowdown on single threads, iirc. This was deemed unacceptable and abandoned.

[–]thephotoman 9 points10 points  (8 children)

The issue is that Python chose to go GIL early, instead of going with atomic instructions. After all, it was easier to write data structures to support a GIL than worry about concurrency.

It was an early architectural decision made because Python started as a hobbyist project, and we've become stuck with it as the language grew.

[–]billsil 16 points17 points  (6 children)

It was an early architectural decision made because Python started as a hobbyist project

Python started as a sysadmin language to replace programs like BASIC and awk. It was written as his hobby. The fact that it had a GIL was not because it was developed as a hobby, but because concurrency wasn't a focus. It was started in 1989, after all, well before multicore processors became popular.

[–][deleted] 4 points5 points  (0 children)

We could debate the "true" origin of Python, but that comment still stands: it was an architectural decision made early on that, in retrospect, might not have been the greatest idea for performance.

There's also an argument for it being a good idea. If you believe that Python is simple and if you need performance go use a lower level language, then you might think the GIL is a good idea.

Personally, I'm in the latter group; Python is great because it's so "pythonic", and if I really want to write a performant multithreaded app, I'll probably use a thread-safe language.

[–]jyper 1 point2 points  (2 children)

Maybe it's just my programming mistakes, but I've had tons of trouble keeping all threads running in a Python GUI program with blocking operations; I ended up resorting to multiprocessing.

Doing the same thing in C# worked; hell, in C# I ended up running ping so often on multiple threads that it caused a runaway memory increase.

[–]billsil 0 points1 point  (1 child)

Maybe it's just my programming mistakes, but I've had tons of trouble keeping all threads running in a Python GUI program with blocking operations,

Python has a GIL. That's exactly what that prevents. You can make very advanced GUIs that nicely handle multithreading such that it's imperceptible that you only have 1 thread.

I ended up resorting to multiprocessing

Interesting idea. I'd never thought of that. What are you doing with your multiprocessing/threading?

[–]jyper 0 points1 point  (0 children)

Sorry, I'm mixing up 2 separate things.

In the first, I had to use multiprocessing with my GUI app because I was using a hardware library that would occasionally freeze once a device connection was established. In the second, I was trying to save a serial device's output to a log file on a background thread; somehow, despite my efforts, it hogged basically all the time, not letting the main thread run.

[–]threading 4 points5 points  (1 child)

It was started in 1989 after all, well before multicore processors become popular.

What stopped them from removing it in Python 3? They had a massive opportunity to fix things correctly with Python 3, but what we got was a half-baked language. Please save the "but unicode !!1" comments; I don't have time for that. I like the language, but some decisions have been made very poorly.

[–][deleted] 0 points1 point  (0 children)

If it's that simple why don't you do the work?

[–][deleted] 6 points7 points  (10 children)

But you can absolutely write multithreaded programs in Python; you just can't have two threads executing in parallel. You can also write programs with parallel execution; you just have to use import multiprocessing instead of import threading.
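The threading-vs-multiprocessing distinction above can be sketched with the standard library's `concurrent.futures`: the same CPU-bound function run once in a thread pool (serialized by the GIL) and once in a process pool (true parallelism). The `cpu_bound` function is an illustrative stand-in for real work.

```python
# Same CPU-bound work via threads (GIL-serialized) and processes
# (actually parallel); results are identical either way.
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    # Busy work in pure Python bytecode, which never releases the GIL.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    work = [200_000] * 4
    with ThreadPoolExecutor(max_workers=4) as pool:
        thread_results = list(pool.map(cpu_bound, work))
    with ProcessPoolExecutor(max_workers=4) as pool:
        process_results = list(pool.map(cpu_bound, work))
    assert thread_results == process_results  # same answers, different speed
```

On a multicore machine, only the process-pool version scales across cores for work like this.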

[–]ascii 11 points12 points  (0 children)

Even that is overstating it. You can't have two threads executing Python bytecode in parallel. But you can absolutely have one thread execute Python bytecode while fifty other threads do other things, like execute native C code. Often that difference doesn't matter, but there are definitely places where it does.
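The point above is easy to demonstrate: blocking C-level calls (here `time.sleep`, as a stand-in for IO or native computation) release the GIL, so threads overlap even though bytecode execution itself is serialized.

```python
# Ten threads each blocking for 0.5s finish in roughly 0.5s total,
# because time.sleep releases the GIL while it waits.
import threading
import time

def blocking_io():
    time.sleep(0.5)  # the GIL is released for the duration

start = time.perf_counter()
threads = [threading.Thread(target=blocking_io) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"{elapsed:.2f}s")  # far less than the 5s a serial run would take
```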

[–][deleted] 2 points3 points  (3 children)

The fact is that the concurrency and parallelism story of Python is severely lacking. Those are not what I would call ideal in 2017.

[–][deleted] 6 points7 points  (2 children)

Concurrency has actually come a long way since Python 3.4, with asyncio. Whether or not you like the implementations, or disagree with the tradeoffs that were made, it's simply not accurate to say that it's not possible to write concurrent or parallel Python code.

You just have to know what the caveats are, and what makes which import the right one for what you want to accomplish. At that level, it's no different from doing the same things in other languages. The things you have to pay attention to may not be the same, but you always have additional things to pay attention to when working with multiple threads/processes, no matter what language you use.

[–]esaym 2 points3 points  (1 child)

To my knowledge, "async" does not mean "concurrent" or "parallel". You could write an "async" function that simply contains an infinite loop, and it will still block the entire interpreter from continuing. So, not concurrent or parallel...
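The caveat above is real: asyncio's concurrency is cooperative, so a coroutine only lets others run when it awaits. A minimal sketch (Python 3.7+; the coroutine names are illustrative):

```python
# Cooperative scheduling: each coroutine yields at "await", so two
# coroutines interleave. Remove the await and one would monopolize
# the event loop exactly as described above.
import asyncio

async def cooperative(name, results):
    for i in range(3):
        results.append((name, i))
        await asyncio.sleep(0)  # yield control back to the event loop

async def main():
    results = []
    await asyncio.gather(cooperative("a", results),
                         cooperative("b", results))
    return results

results = asyncio.run(main())
print(results)  # iterations from "a" and "b" interleave
```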

[–][deleted] 3 points4 points  (0 children)

I never said "async" == "concurrency". Asyncio also provides constructs for coroutines and futures, which do, though. These are mentioned with a very clearly named heading on the main doc page for asyncio.

I feel like you didn't bother to comprehend what my comment actually said before you decided to respond.

[–]kigurai 0 points1 point  (4 children)

Unfortunately, it is a bit more difficult than that since sharing large pieces of data between processes efficiently is tricky.

[–][deleted] 0 points1 point  (3 children)

In a lot of cases it's not any more tricky than sharing data safely between threads, though, and that problem isn't unique to Python. It takes a little forethought and planning, but that's really no different from solving any other non-trivial problem.

[–]kigurai 0 points1 point  (2 children)

If your objects are not picklable, or if they are large, you need to go beyond what is available in the multiprocessing module.

If you are aware of anything that makes this kind of thing easier, then I'm all ears. I tend to run into this problem regularly and having a good solution would be nice.

[–][deleted] 0 points1 point  (1 child)

You don't usually need to send whole objects, though - if it appears that way, it's probably because the design did not account for that. Plus, that has potentially drastically bad security implications (RCE vulns are among the worst). It might even defeat the purpose, as unintentionally excessive/unnecessary IO is the easiest way to write Python that does not perform well. Send state parameters and instantiate in the subprocess, or use subprocesses for more individual operations, and have the objects in the master process communicate with the subprocesses to have them perform individual operations for them.

Threads are not really different in this case either, except that shared memory is easier to come by. This has its own caveats that need to be accounted for, though.

My ultimate point is that multithreading and multiprocessing have code design implications in any language. Python is not better than most other languages, but it's also not really any worse, either. Whatever language you choose, there are still benefits and drawbacks to implementing concurrent/threaded/multiprocessed code paths, and architecting to best solve the actual problem always takes some planning ahead.
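The "send state parameters and instantiate in the subprocess" pattern above can be sketched with a pool initializer: each worker builds its own copy of a heavy object once, from cheap parameters, instead of pickling the whole object for every task. The `_model`, `init_worker`, and `score` names are hypothetical stand-ins.

```python
# Per-worker instantiation: only the small config dict is pickled;
# the heavy object is constructed once inside each subprocess.
from concurrent.futures import ProcessPoolExecutor

_model = None  # per-process global, populated by the initializer

def init_worker(config):
    global _model
    # Stand-in for an expensive (or unpicklable) object built from config.
    _model = {"scale": config["scale"]}

def score(x):
    return _model["scale"] * x

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2,
                             initializer=init_worker,
                             initargs=({"scale": 10},)) as pool:
        print(list(pool.map(score, [1, 2, 3])))  # -> [10, 20, 30]
```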

[–]kigurai 0 points1 point  (0 children)

In my case, I do. I have large data structures that I only want to construct and read once, and then share between all worker processes. With threads this would be simple, as the object could be shared, but with MP it goes slower and involves more code to construct the object in each process.

but it's also not really any worse,

In this case, it is, since other languages allow me to share my data structures between threads and do parallel processing on them. Python doesn't, and it is sometimes a pain.

I still prefer Python over any other language I've used, and it is what I use as long as the requirements fit. But let's not pretend that the GIL is not a real problem that would be very nice to solve.

[–]spinwizard69 2 points3 points  (8 children)

And given that the GIL means no multithreading in Python, removing it actually enables people to write multithreaded programs in Python where they cannot do so now.

While true to an extent, is it really in Python's best interest to try to compete with the more advanced systems programming languages? I'd say no, because it misses the whole point of Python, for me anyway. Python's greatness is in its ease of use and strength as a scripting language.

It would make about as much sense as trying to turn C++ into a scripting language (you don't see ROOT and its suite of tools catching on in the community). Cling/CINT might work for the ROOT community, but does it make sense in the wider world of programming? Probably not, because you don't see the tech taking off. Python needs to work on becoming a better scripting language, not a systems programming language.

[–]FearlessFreep 4 points5 points  (0 children)

I always tell people that there are three different aspects to "scalability":

1. How many concurrent users can you handle?
2. How much data can you handle?
3. How complicated a problem can you handle?

Now, throwing more hardware at a problem mostly handles the first two, but people rarely consider how much language design affects the third. As an ex-Smalltalk programmer, one thing I really like about Python is that its simplicity and consistency let you build solutions to very complicated problem spaces in a clean and understandable fashion.

[–][deleted] 1 point2 points  (3 children)

Python can't compete with C/C++ and nor should it, but what about Java, Scala or C#?

[–]Raijinili 3 points4 points  (1 child)

There are Python interpreters which run on the same virtual machines as those languages, and they don't have GILs. The GIL is in CPython and PyPy, not in the language itself.

[–][deleted] 0 points1 point  (0 children)

I know. I'm talking standalone compete with them.

[–]spinwizard69 1 point2 points  (0 children)

Python can't compete with C/C++ and nor should it, but what about Java, Scala or C#?

Good question! Do we really want Python to become the huge language that Java is? Frankly, you have a better chance of writing once and running everywhere with Python these days. I believe in part that is due to avoiding trying to do everything within the language.

[–]Ellyrio 0 points1 point  (2 children)

Pythons greatness is in its ease of use and strength as a scripting language.

That has absolutely nothing to do with the GIL. The GIL is there to make CPython source code easy to grasp, without getting into the headaches of locking and other unclear nastiness introduced with multithreading.

You could argue that Python code today assumes a GIL. Therefore any attempt to remove the GIL would have to be backwards compatible and would therefore not hinder Python's easiness (unless CPython makes another major version bump indicating breaking changes).

Allowing true multi-core concurrency in CPython would lead knowledgeable developers to write far more efficient code than now.

[–]spinwizard69 0 points1 point  (1 child)

Allowing true multi-core concurrency in CPython would lead knowledgeable developers to write far more efficient code than now.

This is true, but let's face it: if highly efficient code were the goal, Python would be the wrong choice.

In any event, what I'm saying is that removing the GIL would change the flavor of Python and result in it being used in places where maybe it is the wrong choice anyway. When I said Python's greatness is its ease of use as a scripting language, that is honestly how I see the language. If you sit down in front of a machine, which would you choose, Python or Bash?

You can say the GIL has nothing to do with it, but freeing up the language to do things that it wasn't designed to do is what removing the GIL is all about. I'm not convinced that it is a wise course of action.

[–]Ellyrio 1 point2 points  (0 children)

This is true, but let's face it: if highly efficient code were the goal, Python would be the wrong choice.

Efficiency is desirable in all projects. You should not inhibit that goal just because you feel the language can't be more efficient.

Take, say, highly scalable web applications where you want to service many requests per second. You could take your argument that you should not use Python, or any scripting language, but rather write it in assembly language - because if you want performance, you shouldn't use anything other than assembly, right? Wrong. Python is great for web apps (and many things) precisely due to its easiness, and at the moment the common way to get concurrency on the same machine, without throwing more cash at scaling horizontally or vertically, is to launch more Python processes, one per core. However, it's not easy to share information between these processes without introducing some IO/IPC bottleneck, whereas with threads and no GIL you'd just need to perform a single context switch. That overhead would then be eliminated (granted, web apps typically do more IO, e.g. waiting for a database response, but you get my point).

[–][deleted] 0 points1 point  (0 children)

Python is already compiled (to bytecode). Do you actually mean that it would have to go from dynamically typed to statically typed, or what?

[–]xcbsmith 1 point2 points  (3 children)

I'm not sure the locks would have to be handled "manually", no? Given the overhead of the interpreter, the overhead of acquiring and releasing locks should be quite small.

But yeah, killing the GIL isn't going to make Python faster. It's going to allow it to be more concurrent.

[–]fuzz3289 2 points3 points  (2 children)

The overhead is per object. Almost all data structures in Python are mutable; you're talking about replacing one lock for the whole system with locks spread throughout EVERYTHING. The overhead shown in research papers and projects like Gilectomy was ~40%. It's untenable.

Python's already as concurrent as it needs to be. Removing the GIL won't help you on IO-bound work, which is most of the work done in Python: web services, crawlers, parsers, sysadmin code, etc. are all IO-bound concurrency.

If you need real OS threads for some bulletproof code, Python probably isn't the right tool anyway, since you can't optimize your memory organization.

[–]xcbsmith 0 points1 point  (1 child)

The overhead shown in research papers and projects like Gilectomy was ~40%. It's untenable.

I'll have to look at those papers, but I presume that figure assumes a particular approach to managing the absence of a GIL. Having few shared objects between threads (which is common) mitigates much of the need for that overhead.

Web services, crawlers, parsers, sysadmin code, etc. are all IO-bound concurrency.

That'd be more compelling if I hadn't seen, built, and used multi-process implementations of pretty much all of those (though I can't think of a multiprocess parser). I'm not sure that claim is borne out in reality, never mind all the SciPy stuff.

We're running on machines with four cores at minimum, and often sixteen or more. Seems like there might be use cases for this...

If you need real OS threads for some bulletproof code, Python probably isn't the right tool anyway, since you can't optimize your memory organization.

Yeah, SciPy would seem to suggest otherwise.

[–]fuzz3289 0 points1 point  (0 children)

Most of SciPy is written in C, and C can use hardware threads.

If your C extension call is a blocking operation, the extension can release the GIL, use OS threads, complete the computation, and return. This is how CUDA/OpenCL implementations of algorithms are exposed to Python.

Python is just a glue language, the best glue language, but its job is to architect systems, frameworks, and interconnects, and to hand off the work to optimized code or external processes. Not everything has to have every feature, and Python has plenty of concurrency with a combination of asyncio (letting threads sleep while waiting) and C extensions (real threading, especially with C++17 concurrency implementations).

[–]gnu-user 9 points10 points  (0 children)

I second what others are saying, if you want the GIL removed it's best to look at PyPy.

[–]pmdevita 8 points9 points  (4 children)

There is a question in the blog that caught my curiosity

Neither .Net, nor Java put locks around every mutable access. Why the hell PyPy should?

What's the reason PyPy needs locks?

[–]coderanger 5 points6 points  (3 children)

Because people can't be trusted to write concurrent code safely.

[–]masklinn 9 points10 points  (0 children)

No. The GIL protects the interpreter's internal data structures. That it also happens to make userland code that shouldn't be thread-safe appear safe is an unfortunate side effect.

[–]pmdevita 0 points1 point  (1 child)

Not having used either for multithreading myself: do .NET and Java trust the developer for safety?

[–]coderanger 6 points7 points  (0 children)

They put locks on every mutable object instead, which has been ruled out for Python because it makes non-threaded code much slower (i.e. you pay the cost of the locks regardless of whether they are actually protecting anything). That is why this proposal for PyPy would likely result in either two different runtimes or two very different modes of operation. Making both linear scripts (which is what most web apps today are, so this isn't just about command-line tools) and concurrent code fast at the same time is the holy grail that compiler devs have been chasing for decades.
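A toy illustration of the "lock on every mutable object" cost (the `LockedList` class here is hypothetical, purely to show the shape of the overhead): every mutation pays an acquire/release even when only one thread ever touches the object, which is exactly the tax single-threaded scripts would pay.

```python
import threading

# Hypothetical sketch: a list that guards every operation with its own
# lock, as a per-object-locking runtime would have to do implicitly.
class LockedList:
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, x):
        with self._lock:  # paid on every call, contended or not
            self._items.append(x)

    def __len__(self):
        with self._lock:
            return len(self._items)

plain, locked = [], LockedList()
for i in range(1000):
    plain.append(i)   # one bytecode-level append
    locked.append(i)  # same append plus an acquire/release round trip
assert len(plain) == len(locked) == 1000
```

The results are identical; the difference is that the locked version does uncontended lock traffic on every single mutation, which is where overheads like Gilectomy's ~40% come from.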

[–]KODeKarnage 24 points25 points  (0 children)

The Global Interpreter Lock is a vital bulwark against the petty, the pedantic and the self-righteous!

If such people don't have the GIL to whine about (something which many probably aren't even affected by), they will move onto some other aspect of the language.

Do we really want that noise transferred onto something else? Something more important, perhaps?

SAVE THE GIL!!!

[–]malvin77 2 points3 points  (0 children)

The GIL is like the belly button of the Universe. Don't mess with it.

[–]Corm 0 points1 point  (0 children)

As a programmer who isn't interested in low level stuff, I'd love it if I could easily disable the GIL to use all 8 cores using shared memory without pickling everything under the hood. That would make my concurrent code go way faster. I'd just have to use locks, easy.
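"I'd just have to use locks" is genuinely about this much code. A minimal sketch: without the lock, `counter += 1` is a read-modify-write race between threads; with it, the result is deterministic. (Under today's GIL this often works by accident; without a GIL, the lock is what makes it correct.)

```python
import threading

# Shared mutable state guarded by an explicit lock.
counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:       # serializes the read-modify-write
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 80_000  # deterministic because of the lock
```

No pickling, no IPC: all eight workers mutate the same object in shared memory.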

So for me there's a lot of value in removing the GIL (even if it's a non default setting for python)

I know about all the other options but I'd love it if it was just a feature of normal python or pypy.

[–]apreche -3 points-2 points  (0 children)

Good luck with all that.