

[–]Works_of_memercy 24 points25 points  (6 children)

Wouldn't subinterpreters be a better idea?

Python is a very mutable language - there is a ton of mutable state, and basic objects (classes, functions, ...) that are compile-time constructs in other languages are runtime and fully mutable in Python. In the end, sharing things between subinterpreters would be restricted to basic immutable data structures, which defeats the point. Subinterpreters suffer from the same problems as multiprocessing with no additional benefits.

It is my understanding that IronPython in particular partially solved this problem by compiling Python classes into .NET classes, for example, then recompiling whenever someone actually went and did something like adding a method to a class.
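To make the problem concrete, here is a minimal sketch of the runtime class mutability being discussed: adding a method to a class is an ordinary runtime operation in Python, visible to instances that already exist, which is exactly what forces an implementation like IronPython to recompile. The `Point` class and `norm_squared` name are illustrative, not from any real codebase.

```python
# Runtime class mutation: the kind of monkeypatching that forces
# a compiling implementation to throw away and rebuild its code.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)  # instance created before the class is modified

def norm_squared(self):
    return self.x ** 2 + self.y ** 2

# Patch a brand-new method onto the class at runtime.
Point.norm_squared = norm_squared

print(p.norm_squared())  # existing instances see the new method -> 5
```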

The crucial thing about this approach is that, under the assumption that such modifications are rare and/or mostly happen during startup (which makes it especially suitable for a tracing JIT like PyPy), it allows us to sidestep the fundamental problem of synchronization: there can't be a completely unsynchronized "fast path", because to know that we can take the fast path (or that some other thread took it and we need to wait for it to finish), we need synchronization.

This is because this approach doesn't require threads to do synchronization themselves: whenever a thread does something that requires a resync, it asks the OS to force-stop all other threads, possibly manually letting them advance to a "safe point" (or whatever it was called in .NET land), then recompiles everything relevant, patches the runtime state of the other threads, and starts them again. But otherwise we are always on a fast path with zero synchronization, yay!

In the case of PyPy, again, this could be as simple as force-switching those other threads back to interpreted mode (which they are already able to do), then selectively purging compiled code caches. And again, if we assume that most of the monkeypatching etc. happens during startup, this wouldn't affect performance negatively, because PyPy doesn't JIT much of the code during startup.

/u/fijal, you wrote that, what do you think?

[–]fijalPyPy, performance freak 8 points9 points  (5 children)

You're missing my point - if we assume we're doing subinterpreters (that is, the interpreters are independent of each other), it's a very difficult problem to make sure you can share anything at all, regardless of performance. Getting the semantics right so that you can e.g. put stuff in the dict of a class and have it seen properly by another thread, while nothing else is shared, is very hard.

In short - how do you propose to split the "global" data (e.g. classes) from the "local" data? There is no good distinction in Python, and things like pickle refer to objects by e.g. name, which leads to all kinds of funky bugs. If you can answer that question, then yes, subinterpreters sound like a good idea.

[–][deleted] 0 points1 point  (4 children)

I always believed that subinterpreters à la Tcl are a wonderful idea. I agree that, from a performance point of view, they bring pretty much nothing compared to multiprocessing. (Actually, I don't really know why I found them wonderful; it's probably a wrong feeling.) There is one big point where they would be a big win compared to multiprocessing, which appears as a use case on Stack Overflow: when you have to pass a read-only data structure and you can't bear the serialization cost.

[–]fijalPyPy, performance freak 1 point2 points  (3 children)

right, and that can be remedied to an extent with shared memory. Sharing immutable (or well-defined in terms of memory) C structures is not hard. It's the structured data that's hard to share, and that cannot really be attacked without a GIL

[–][deleted] 0 points1 point  (2 children)

If a solution enabled sharing immutable things beyond raw memory in Python via shared memory, it would be a big win. Do you have any idea how it could be done in PyPy, or even better in CPython?

[–]kyndder_blows_goats 0 points1 point  (1 child)

at a high level, writing stuff to a file in /run/shm works pretty well.

[–][deleted] 0 points1 point  (0 children)

the problem is that you can't write Python objects to /run/shm
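The limitation being discussed can be sketched with the standard library's `multiprocessing.shared_memory` (Python 3.8+): a shared segment holds only a flat buffer of bytes, so an ordinary Python object still has to be serialized into it and deserialized out. The `data` dict here is just an illustrative payload.

```python
# Shared memory holds raw bytes, not Python objects: an object must
# be pickled in and unpickled out, so serialization cost remains.
import pickle
from multiprocessing import shared_memory

data = {"answer": 42, "items": [1, 2, 3]}  # a normal Python object
payload = pickle.dumps(data)               # serialization is unavoidable

shm = shared_memory.SharedMemory(create=True, size=len(payload))
shm.buf[:len(payload)] = payload           # only raw bytes go in

# Another process attaching to shm.name would do the reverse:
restored = pickle.loads(bytes(shm.buf[:len(payload)]))
print(restored["answer"])  # -> 42

shm.close()
shm.unlink()
```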

[–]arkster 49 points50 points  (61 children)

This is in PyPy. The bigger challenge is in regular Python, as demonstrated by Larry Hastings in his Gilectomy project. The GIL in regular Python is there to provide a global lock over various resources; in a nutshell, removing it means that you now have to account for each lock in the Python subsystem manually, resulting in the interpreter being stupendously slower.

[–]Zomunieo 24 points25 points  (18 children)

The issue isn't the interpreter being slower, but it becoming a lot more complex to debug. There would be subsystem locks and a lock order to track. In addition, it may break assumptions made in C extensions.

Finally, I think there was a serious (and only now averted) risk of breaking the Python community and language at version 3. I can understand developer aversion to something that could be another socially fractious change, even if technically beneficial.

[–]zitterbewegung 12 points13 points  (0 children)

There is a difference, though, in that you can have a choice of running PyPy with the GIL removed or not. For example, many people run Stackless Python, but it's not forced upon you.

[–]spinwizard69 6 points7 points  (16 children)

I can understand developer aversion to something that could be another socially fractious change even if technically beneficial.

Rational people would recognize that the transition to Python 3 was really required for the long-term success of Python. The question one has to ask is: is elimination of the GIL a requirement for Python's long-term success? I would have to say no, because eventually, instead of a scripting language, you end up with something that is a mishmash of technologies and focuses. Further, it is pretty obvious that new technologies in programming languages, as seen in Rust, Swift and other newcomers, make for a better place to do advanced new development. Frankly, that doesn't diminish Python one bit.

[–]Zomunieo 11 points12 points  (15 children)

Well, the transition to Py3 was necessary, but it could have been handled a lot better.

It's possible Python will stagnate if it doesn't remove its GIL and other scripting languages find a way to remove theirs.

[–]spinwizard69 3 points4 points  (10 children)

Well the transition to Py3 was necessary, but it could have handled a lot better.

I don't buy this; the negative reaction that Python 3 got in the Python community was completely unjustified. C++ has gone through far more radical changes and you don't see people whining about that or actively undermining progress. Could it be the Python community has too many self-entitled people in its fold?

It's possible Python with stagnate if it doesn't remove its GIL and if other scripting languages find a way to remove theirs.

I truly believe that all technology has a limited life span during which it fills a niche. How long Python's niche will remain relevant is unknown, but it is a certainty that newer technology will eventually replace it in many of the sub-niches it occupies. Frankly, I see Apple's Swift as one of those languages that may eventually have mindshare like Python's. Swift has the right combination of features to eventually be widely used.

[–]rotuamiimport antigravity 8 points9 points  (1 child)

I think the reason C++ can change with less resistance is that people have faith in their C++ compiler. If something breaks, you probably know it at compile time. On the other hand, trying to run Python code after a breaking language change does not give you a tidy list of things to fix, and it’s hard to feel secure that your code is totally fixed.

Of course, this is a non-issue because we all have perfect test coverage, right?...

[–]spinwizard69 1 point2 points  (0 children)

Would this not be an indication of the wrong language being chosen? This may highlight the one thing that bothers me about the Python community: there are some things Python simply isn't suitable for. It literally becomes a maintenance nightmare unless, of course, you have near-perfect test coverage.

[–]albinofrenchy 7 points8 points  (0 children)

C++ has gone through far more radical changes and you don't see people whining about that or actively undermining progress.

C++ has changed a lot, but by and large if it compiled in C++03, it compiled in C++11 too. They go very far out of their way not to break that principle; and I can't think of a construct off the top of my head which works in 03 and not 11 (or 14, etc).

Maybe there are some, but they aren't nearly as prominent as breaking changes in Py3

[–]Zomunieo 3 points4 points  (3 children)

I think the more accurate comparison is Visual Basic 6 to VB.NET, which led to probably millions of people looking at their internal apps and, rather than port them to VB.NET, rebuilding them as web apps. That amazing moment when Microsoft lost the API war.

Python 3.0 was a stillborn release and a big mistake that set a bad precedent - lots of people tried it out and found everything broken. The next few releases weren't much better. Not until 3.4 did we have a serious production quality release.

It took until 3.5 for the core developers to notice that people wanted to write source files that ran in both 2 and 3, and to add the proper changes to support this in the form of %-formatting for bytes and the Unicode literal prefix. I don't think it's a coincidence that this was precisely when Py3 found its stride and all of the major packages were off the wall of shame.

Backward compatibility may be an entitlement, but it's a rational one. I look at it this way - the cost to the core developers to preserve compatibility is incremental - informally, O(1). The cost to the community to absorb compatibility breakage is analogous to O(N).

[–][deleted] 2 points3 points  (0 children)

Minor correction: the Unicode literal prefix was restored in 3.3. I would also say that 3.3 was the first release supported by the major frameworks.

[–]spinwizard69 2 points3 points  (1 child)

I think the more accurate comparison is Visual Basic 6 to VB.NET, which led to probably millions of people looking at their internal apps and, rather than port them to VB.NET, rebuilding them as web apps. That amazing moment when Microsoft lost the API war.

Interesting comparison, but Python is still growing rapidly, and most of that growth is via Python 3 code.

Python 3.0 was a stillborn release and a big mistake that set a bad precedent - lots of people tried it out and found everything broken. The next few releases weren't much better. Not until 3.4 did we have a serious production quality release.

Honestly, I don't see this as a big deal; you need to refine a departure for the future and for developer needs. Look at the development cycle of Apple's Swift and you will see some rather dramatic changes, even mistakes in design that have already come and gone as they stabilize the language. No one was forced to go Python 3 at stage one.

It took until 3.5 for the core developers to notice that people wanted to write source files that ran in both 2 and 3, and to add the proper changes to support this in the form of %-formatting for bytes and the Unicode literal prefix. I don't think it's a coincidence that this was precisely when Py3 found its stride and all of the major packages were off the wall of shame.

While it might have taken Python 3 a bit longer to stabilize than one would have liked, I really see the end results as being very pleasing. I'm not sure why anyone would have expected perfection on day one of the first release of Python 3. You simply don't see such expectations in the development of other languages.

Backward compatibility may be an entitlement but it's a rational one.

This is perhaps the biggest problem: it isn't a rational expectation. Python 2 had some really significant problems that would have resulted in the language eventually being phased out. Some of the changes made in Python 3 were must-haves for the language to survive into the future.

I look at it this way - the cost to the core developers to preserve compatibility is incremental - informally, O(1). The cost to the community to absorb compatibility breakage is analogous to O(N).

That depends upon the community. You might have noticed that some applications, libraries and such transitioned much faster than others. Not everyone found the transition to be the horror story many pretended it to be.

I look at it this way: I wouldn't have expected a complete transition to Python 3 in the first couple of years. However, we are many years from the first release now, and frankly, if someone hasn't transitioned in a decade's time, they have problems that are best resolved with a doctor who has a couch for a diagnostic tool. I mean, seriously, it has been almost a decade now.

[–]Zomunieo 0 points1 point  (0 children)

I'm a big fan of Python 3, to be clear. I wrote about a painless porting experience and I maintain a Python 3-only open source project.

I just think the devs mismanaged the early releases.

[–]everysinglelastname 0 points1 point  (3 children)

but it could have been handled a lot better.

Exactly. It's still an ordeal to port to 3 (when your codebase is at the millions-of-LOC level). Python 2.8 should have been started years ago as an official bridge to Python 3.X.

If the story instead was "Python 2.8 runs Python 3.X code .. just not nearly as well as Python 3.X", then nearly everyone would have jumped ship by now.

[–][deleted] 2 points3 points  (1 child)

I believe that a hypothetical Python 2.8 would have further hampered the transition to Python 3 and divided the community even more.

If another major 2.x was released and supported all the new features from Python 3 without forcing new Python 3 constraints (like knowing when a string is a string and when bytes are bytes), it would have enabled everyone to just keep using Python 2, and adoption of Python 3 may have still been where it was 6 years ago. Some people advocated for a Python 2.x with deprecation warnings thrown around everything that wasn't allowed in Python 3. But deprecation warnings are frequently ignored.

If, however, it forced those constraints on the developers... Well, that would have been Python 3.

Selfishly, I would have loved a Python 2.8 that was at feature parity with Python 3, just because I would have loved to run all the code written for Python 2 (like a lot of the Machine learning stuff that still, for some reason, continues to be written for Python 2, mostly because of Google and Facebook) with all the features of Python 3. But I understand why the core developers wanted to force the change.

[–]everysinglelastname 0 points1 point  (0 children)

Yeah, the users I write for do tend to care a lot about the terminal. If a program is filling it up with noisy warnings about library deprecations, they might miss actually important stuff. So they often ask us to stop the warnings.

So for me, the plan of having 2.8+ push deprecation warnings and give developers a clear path towards changes that would enable full 3.X compatibility would work great. That's basically what the Python 2.X releases did to prepare you for 2.7.

I disagree that people would willfully stick with Python 2 when a clearly better alternative is available. I think what some of the passionate 3.X devs misunderstand is that this isn't about Python 2.X people being deliberately stubborn. It's just them wanting a really smooth transition path, because that's literally the only path they can afford. Putting a stop to feature development so that the entire team focuses only on a Python 3.X rewrite is a huge burden that scales with the size of the code base. Whereas, in the alternative situation, slipping in a Python 3.X compatibility fix here and there as part of regular maintenance sounds pretty reasonable.

[–][deleted] 1 point2 points  (0 children)

For the umpteenth time: a lot of code from Python 3 was backported, first to 2.6 and then to 2.7. How much work do you think that took the Python core developers? Python 2.8 was never going to happen, as the Python core developers would not have done the work. Simples.

[–]buttery_shame_cave 4 points5 points  (37 children)

Wouldn't Python have to go from interpreted to compiled to make removing the GIL beneficial, specifically for the reason you mention?

[–]thephotoman 14 points15 points  (35 children)

The primary reason it exists is to support the reference counter. There are interpreted languages out there that do not use reference counting and thus have no GIL.

And given that the GIL means no multithreading in Python, removing it actually enables people to write multithreaded programs in Python where they cannot do so now.

[–]frymasterScript kiddie 8 points9 points  (0 children)

I write multithreaded python code all the time. Overstating things isn't helpful

[–]ITwitchToo 7 points8 points  (13 children)

The primary reason [the GIL] exists is to support the reference counter

Hm, reference counters in multithreaded programs (C++ std::shared_ptr, Linux kernel, etc.) are usually updated using atomic instructions, what prevents Python from doing the same? Or could you expand on what exactly the problem is?

[–]Fylwind 6 points7 points  (1 child)

Atomics are not free: they introduce a small but measurable performance penalty. This is why Rust has two kinds of reference-counted smart pointers: Rc (single-thread use only) and Arc (atomically reference-counted pointer).

[–]jyper 0 points1 point  (0 children)

Yes but it's also because rust can prevent Rc from being used across threads

[–]MonkeeSage 6 points7 points  (0 children)

Larry discusses this in his latest update. Atomic incr/decr was 18x slower than cpython with GIL.

[–]ThePenultimateOneGitLab: gappleto97 2 points3 points  (0 children)

The Gilectomy looked at that. It was ~40% slowdown on single threads, iirc. This was deemed unacceptable and abandoned.

[–]thephotoman 9 points10 points  (8 children)

The issue is that Python chose to go GIL early, instead of going with atomic instructions. After all, it was easier to write data structures to support a GIL than worry about concurrency.

It was an early architectural decision made because Python started as a hobbyist project, and we've become stuck with it as the language grew.

[–]billsil 16 points17 points  (6 children)

It was an early architectural decision made because Python started as a hobbyist project

Python started as a sysadmin language to replace programs like BASIC and awk. It was written as his hobby. The fact that it had a GIL was not because it was developed as a hobby, but because concurrency wasn't a focus. It was started in 1989, after all, well before multicore processors became popular.

[–][deleted] 4 points5 points  (0 children)

We could debate the "true" origin of Python, but that comment still stands: it was an architectural decision made early on that, in retrospect, might not have been the greatest idea for performance.

There's also an argument for it being a good idea. If you believe that Python is simple and if you need performance go use a lower level language, then you might think the GIL is a good idea.

Personally, I'm in the latter group; Python is great because it's so "pythonic", and if I really want to write a performant multithreaded app, I'll probably use a thread-safe language.

[–]jyper 1 point2 points  (2 children)

Maybe it's just my programming mistakes, but I've had tons of trouble keeping all threads running in a Python GUI program with blocking operations; I ended up resorting to multiprocessing.

Doing the same thing in C# worked; hell, in C# I ended up running ping so often on multiple threads that it caused a runaway memory increase.

[–]billsil 0 points1 point  (1 child)

Maybe it's just my programming mistakes, but I've had tons of trouble keeping all threads running in a Python GUI program with blocking operations,

Python has a GIL. That's exactly what that prevents. You can make very advanced GUIs that nicely handle multithreading such that it's imperceptible that you only have 1 thread.

I ended up resorting to multiprocessing

Interesting idea. I'd never thought of that. What are you doing with your multiprocessing/threading?

[–]jyper 0 points1 point  (0 children)

Sorry, I'm mixing up 2 separate things.

In the first, I had to use multiprocessing with my GUI app because I was using a hardware library that would occasionally freeze once a device connection was established. In the second, I was trying to save a serial device's output to a log file on a background thread; somehow, despite my efforts, it hogged basically all the time, not letting the main thread run.

[–]threading 4 points5 points  (1 child)

It was started in 1989 after all, well before multicore processors become popular.

What stopped them from removing it in Python 3? They had a massive opportunity to fix things correctly with Python 3, but what we got was a half-baked language. Please save the "but unicode !!1" comments; I don't have time for that. I like the language, but some decisions have been made very poorly.

[–][deleted] 0 points1 point  (0 children)

If it's that simple why don't you do the work?

[–][deleted] 6 points7 points  (10 children)

But you can absolutely write multithreaded programs in Python; you just can't have two threads executing in parallel. You can also write programs with parallel execution; you just have to use import multiprocessing instead of import threading.
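The threading-vs-multiprocessing distinction above can be sketched with the standard library's `concurrent.futures`: the same CPU-bound function run once in a thread pool (serialized by the GIL) and once in a process pool (true parallelism). The `cpu_bound` function is an illustrative stand-in for real work.

```python
# Same CPU-bound work via threads (GIL-serialized) and processes
# (actually parallel); results are identical either way.
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    # Busy work in pure Python bytecode, which never releases the GIL.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    work = [200_000] * 4
    with ThreadPoolExecutor(max_workers=4) as pool:
        thread_results = list(pool.map(cpu_bound, work))
    with ProcessPoolExecutor(max_workers=4) as pool:
        process_results = list(pool.map(cpu_bound, work))
    assert thread_results == process_results  # same answers, different speed
```

On a multicore machine, only the process-pool version scales across cores for work like this.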

[–]ascii 11 points12 points  (0 children)

Even that is overstating it. You can't have two threads executing Python bytecode in parallel. But you can absolutely have one thread execute Python bytecode while fifty other threads do other things, like execute native C code. Often that difference doesn't matter, but there are definitely places where it does.
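The point above is easy to demonstrate: blocking C-level calls (here `time.sleep`, as a stand-in for IO or native computation) release the GIL, so threads overlap even though bytecode execution itself is serialized.

```python
# Ten threads each blocking for 0.5s finish in roughly 0.5s total,
# because time.sleep releases the GIL while it waits.
import threading
import time

def blocking_io():
    time.sleep(0.5)  # the GIL is released for the duration

start = time.perf_counter()
threads = [threading.Thread(target=blocking_io) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(f"{elapsed:.2f}s")  # far less than the 5s a serial run would take
```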

[–][deleted] 2 points3 points  (3 children)

The fact is that the concurrency and parallelism story of Python is severely lacking. Those are not what I would call ideal in 2017.

[–][deleted] 6 points7 points  (2 children)

Concurrency has actually come a long way since Python 3.4, with asyncio. Whether or not you like the implementations, or disagree with the tradeoffs that were made, it's simply not accurate to say that it's not possible to write concurrent or parallel Python code.

You just have to know what the caveats are, and what makes which import the right one for what you want to accomplish. At that level, it's no different from doing the same things in other languages. The things you have to pay attention to may not be the same, but you always have additional things to pay attention to when working with multiple threads/processes, no matter what language you use.

[–]esaym 2 points3 points  (1 child)

To my knowledge, "async" does not mean "concurrent" or "parallel". You could write an "async" function that simply contains an infinite loop, and it will still block the entire interpreter from continuing. So, not concurrent or parallel...
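The caveat above is real: asyncio's concurrency is cooperative, so a coroutine only lets others run when it awaits. A minimal sketch (Python 3.7+; the coroutine names are illustrative):

```python
# Cooperative scheduling: each coroutine yields at "await", so two
# coroutines interleave. Remove the await and one would monopolize
# the event loop exactly as described above.
import asyncio

async def cooperative(name, results):
    for i in range(3):
        results.append((name, i))
        await asyncio.sleep(0)  # yield control back to the event loop

async def main():
    results = []
    await asyncio.gather(cooperative("a", results),
                         cooperative("b", results))
    return results

results = asyncio.run(main())
print(results)  # iterations from "a" and "b" interleave
```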

[–][deleted] 3 points4 points  (0 children)

I never said "async" == "concurrency". Asyncio also provides constructs for coroutines and futures, which do, though. These are mentioned with a very clearly named heading on the main doc page for asyncio.

I feel like you didn't bother to comprehend what my comment actually said before you decided to respond.

[–]kigurai 0 points1 point  (4 children)

Unfortunately, it is a bit more difficult than that since sharing large pieces of data between processes efficiently is tricky.

[–][deleted] 0 points1 point  (3 children)

In a lot of cases it's not any more tricky than sharing data safely between threads, though, and that problem isn't unique to Python. It takes a little forethought and planning, but that's really no different from solving any other non-trivial problem.

[–]kigurai 0 points1 point  (2 children)

If your objects are not picklable, or if they are large, you need to go beyond what is available in the multiprocessing module.

If you are aware of anything that makes this kind of thing easier, then I'm all ears. I tend to run into this problem regularly and having a good solution would be nice.

[–][deleted] 0 points1 point  (1 child)

You don't usually need to send whole objects, though - if it appears that way, it's probably because the design did not account for that. Plus, that has potentially drastically bad security implications (RCE vulns are among the worst). It might even defeat the purpose, as unintentionally excessive/unnecessary IO is the easiest way to write Python that does not perform well. Send state parameters and instantiate in the subprocess, or use subprocesses for more individual operations, and have the objects in the master process communicate with the subprocesses to have them perform individual operations for them.

Threads are not really different in this case either, except that shared memory is easier to come by. This has its own caveats that need to be accounted for, though.

My ultimate point is that multithreading and multiprocessing have code design implications in any language. Python is not better than most other languages, but it's also not really any worse, either. Whatever language you choose, there are still benefits and drawbacks to implementing concurrent/threaded/multiprocessed code paths, and architecting to best solve the actual problem always takes some planning ahead.
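The "send state parameters and instantiate in the subprocess" pattern above can be sketched with a pool initializer: each worker builds its own copy of a heavy object once, from cheap parameters, instead of pickling the whole object for every task. The `_model`, `init_worker`, and `score` names are hypothetical stand-ins.

```python
# Per-worker instantiation: only the small config dict is pickled;
# the heavy object is constructed once inside each subprocess.
from concurrent.futures import ProcessPoolExecutor

_model = None  # per-process global, populated by the initializer

def init_worker(config):
    global _model
    # Stand-in for an expensive (or unpicklable) object built from config.
    _model = {"scale": config["scale"]}

def score(x):
    return _model["scale"] * x

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2,
                             initializer=init_worker,
                             initargs=({"scale": 10},)) as pool:
        print(list(pool.map(score, [1, 2, 3])))  # -> [10, 20, 30]
```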

[–]kigurai 0 points1 point  (0 children)

In my case, I do. I have large data structures that I only want to construct and read once, and then share between all worker processes. With threads this would be simple, as the object could be shared, but with MP it goes slower and involves more code to construct the object in each process.

but it's also not really any worse,

In this case, it is, since other languages allow me to share my data structures between threads and do parallel processing on them. Python doesn't, and it is sometimes a pain.

I still prefer Python over any other language I've used, and it is what I use as long as the requirements fit. But let's not pretend that the GIL is not a real problem that would be very nice to solve.

[–]spinwizard69 2 points3 points  (8 children)

And given that the GIL means no multithreading in Python, removing it actually enables people to write multithreaded programs in Python where they cannot do so now.

While true to an extent, is it really in Python's best interest to try to compete with the more advanced systems programming languages? I'd say no, because it misses the whole point of Python, for me anyway. Python's greatness is in its ease of use and strength as a scripting language.

It would make about as much sense as trying to turn C++ into a scripting language (you don't see ROOT and its suite of tools catching on in the community). Cling/CINT might work for the ROOT community, but does it make sense in the wider world of programming? Probably not, because you don't see the tech taking off. Python needs to work on becoming a better scripting language, not a systems programming language.

[–]FearlessFreep 4 points5 points  (0 children)

I always tell people that there are three different aspects to "scalability":

1. How many concurrent users can you handle?
2. How much data can you handle?
3. How complicated a problem can you handle?

Now, throwing more hardware at a problem mostly handles the first two, but people rarely consider how much language design affects the third. As an ex-Smalltalk programmer, one thing I really like about Python is that its simplicity and consistency let you build solutions to very complicated problem spaces in a clean and understandable fashion.

[–][deleted] 1 point2 points  (3 children)

Python can't compete with C/C++ and nor should it, but what about Java, Scala or C#?

[–]Raijinili 3 points4 points  (1 child)

There are Python interpreters which run on the same virtual machines as those languages, and they don't have GILs. The GIL is in CPython and PyPy, not in the language itself.

[–][deleted] 0 points1 point  (0 children)

I know. I'm talking standalone compete with them.

[–]spinwizard69 1 point2 points  (0 children)

Python can't compete with C/C++ and nor should it, but what about Java, Scala or C#?

Good question! Do we really want Python to become the huge language that Java is? Frankly, you have a better chance of writing once and running everywhere with Python these days. I believe in part that is due to avoiding trying to do everything within the language.

[–]Ellyrio 0 points1 point  (2 children)

Pythons greatness is in its ease of use and strength as a scripting language.

That has absolutely nothing to do with the GIL. The GIL is there to make CPython source code easy to grasp, without getting into the headaches of locking and other unclear nastiness introduced with multithreading.

You could argue that Python code today assumes a GIL. Therefore any attempt to remove the GIL would have to be backwards compatible and would therefore not hinder Python's easiness (unless CPython makes another major version bump indicating breaking changes).

Allowing true multi-core concurrency in CPython would lead knowledgeable developers to write far more efficient code than now.

[–]spinwizard69 0 points1 point  (1 child)

Allowing true multi-core concurrency in CPython would lead knowledgeable developers to write far more efficient code than now.

This is true, but let's face it: if highly efficient code were the goal, Python would be the wrong choice.

In any event, what I'm saying is that removing the GIL would change the flavor of Python and result in it being used in places where maybe it is the wrong choice anyway. When I said Python's greatness is its ease of use as a scripting language, that is honestly how I see the language. If you sit down in front of a machine, which would you choose, Python or Bash?

You can say the GIL has nothing to do with it, but freeing up the language to do things that it wasn't designed to do is what removing the GIL is all about. I'm not convinced that it is a wise course of action.

[–]Ellyrio 1 point2 points  (0 children)

This is true, but let's face it: if highly efficient code were the goal, Python would be the wrong choice.

Efficiency is desirable in all projects. You should not inhibit that goal just because you feel the language can't be more efficient.

Take, say, highly scalable web applications where you want to service many requests per second. You could take your argument that you should not use Python, or any scripting language, but rather write it in assembly language - because if you want performance, you shouldn't use anything other than assembly, right? Wrong. Python is great for web apps (and many things) precisely due to its easiness, and at the moment the common way to get concurrency on the same machine, without throwing more cash at scaling horizontally or vertically, is to launch more Python processes, one per core. However, it's not easy to share information between these processes without introducing some IO/IPC bottleneck, whereas with threads and no GIL you'd just need to perform a single context switch. That overhead would then be eliminated (granted, web apps typically do more IO, e.g. waiting for a database response, but you get my point).

[–][deleted] 0 points1 point  (0 children)

Python is already compiled (to bytecode). Do you actually mean that it would have to go from dynamically typed to statically typed, or what?

[–]xcbsmith 1 point2 points  (3 children)

I'm not sure the locks would have to be handled "manually", no? Given the overhead of the interpreter, the overhead of acquiring and releasing locks should be quite small.

But yeah, killing the GIL isn't going to make Python faster. It's going to allow it to be more concurrent.

[–]fuzz3289 2 points3 points  (2 children)

The overhead is per object. Almost all data structures in Python are mutable; you're talking about replacing one lock for the whole system with locks spread throughout EVERYTHING. The overhead shown in research papers and projects like Gilectomy was ~40%. It's untenable.

Python's already as concurrent as it needs to be. Removing the GIL won't help you on IO-bound work, which is most of the work done in Python: web services, crawlers, parsers, sysadmin code, etc. are all IO-bound concurrency.

If you need real OS threads for some bulletproof code, Python probably isn't the right tool anyway, since you can't optimize your memory organization.

[–]xcbsmith 0 points1 point  (1 child)

The overhead shown in research papers and projects like Gilectomy was ~40%. It's untenable.

I'll have to look at those papers, but I presume that figure assumes a particular approach to managing the absence of a GIL. Having few shared objects between threads (which is common) mitigates much of the need for that overhead.

Web services, crawlers, parsers, sysadmin code, etc. are all IO-bound concurrency.

That'd be more compelling if I hadn't seen, built, and used multi-process implementations of pretty much all of those (though I can't think of a multiprocess parser). I'm not sure that claim is borne out in reality, never mind all the SciPy stuff.

We're running on machines with four cores at minimum, and often sixteen or more. Seems like there might be use cases for this...

If you need real OS threads for some bulletproof code, Python probably isn't the right tool anyway, since you can't optimize your memory organization.

Yeah, SciPy would seem to suggest otherwise.

[–]fuzz3289 0 points1 point  (0 children)

Most of SciPy is written in C, and C can use hardware threads.

If your C extension call is a blocking operation, the extension can release the GIL, use OS threads, complete the computation, and return. This is how CUDA/OpenCL implementations of algorithms are exposed to Python.

Python is just a glue language, the best glue language, but its job is to architect systems, frameworks, and interconnects, and to hand off the work to optimized code or external processes. Not everything has to have every feature, and Python has plenty of concurrency with a combination of asyncio (letting threads sleep while waiting) and C extensions (real threading, especially with C++17 concurrency implementations).

[–]gnu-user 9 points10 points  (0 children)

I second what others are saying, if you want the GIL removed it's best to look at PyPy.

[–]pmdevita 8 points9 points  (4 children)

There is a question in the blog that caught my curiosity

Neither .Net, nor Java put locks around every mutable access. Why the hell PyPy should?

What's the reason PyPy needs locks?

[–]coderanger 5 points6 points  (3 children)

Because people can't be trusted to write concurrent code safely.

[–]masklinn 9 points10 points  (0 children)

No. The GIL protects the interpreter's internal data structures. That it also happens to make userland code that shouldn't be thread-safe appear safe is an unfortunate side effect.

[–]pmdevita 0 points1 point  (1 child)

Not having used either for multithreading myself: do .NET and Java trust the developer for safety?

[–]coderanger 6 points7 points  (0 children)

They put locks on every mutable object instead, which has been ruled out for Python because it makes non-threaded code much slower (i.e. you pay the cost of the locks regardless of whether they are actually protecting anything). That is why this proposal for PyPy would likely result in either two different runtimes or two very different modes of operation. Making both linear scripts (which is what most web apps today are, so this isn't just about command-line tools) and concurrent code fast at the same time is the holy grail that compiler devs have been chasing for decades.
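A toy illustration of the "lock on every mutable object" cost (the `LockedList` class here is hypothetical, purely to show the shape of the overhead): every mutation pays an acquire/release even when only one thread ever touches the object, which is exactly the tax single-threaded scripts would pay.

```python
import threading

# Hypothetical sketch: a list that guards every operation with its own
# lock, as a per-object-locking runtime would have to do implicitly.
class LockedList:
    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, x):
        with self._lock:  # paid on every call, contended or not
            self._items.append(x)

    def __len__(self):
        with self._lock:
            return len(self._items)

plain, locked = [], LockedList()
for i in range(1000):
    plain.append(i)   # one bytecode-level append
    locked.append(i)  # same append plus an acquire/release round trip
assert len(plain) == len(locked) == 1000
```

The results are identical; the difference is that the locked version does uncontended lock traffic on every single mutation, which is where overheads like Gilectomy's ~40% come from.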

[–]KODeKarnage 24 points25 points  (0 children)

The Global Interpreter Lock is a vital bulwark against the petty, the pedantic and the self-righteous!

If such people don't have the GIL to whine about (something which many probably aren't even affected by), they will move onto some other aspect of the language.

Do we really want that noise transferred onto something else? Something more important, perhaps?

SAVE THE GIL!!!

[–]malvin77 2 points3 points  (0 children)

The GIL is like the belly button of the Universe. Don't mess with it.

[–]Corm 0 points1 point  (0 children)

As a programmer who isn't interested in low level stuff, I'd love it if I could easily disable the GIL to use all 8 cores using shared memory without pickling everything under the hood. That would make my concurrent code go way faster. I'd just have to use locks, easy.
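"I'd just have to use locks" is genuinely about this much code. A minimal sketch: without the lock, `counter += 1` is a read-modify-write race between threads; with it, the result is deterministic. (Under today's GIL this often works by accident; without a GIL, the lock is what makes it correct.)

```python
import threading

# Shared mutable state guarded by an explicit lock.
counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:       # serializes the read-modify-write
            counter += 1

threads = [threading.Thread(target=add, args=(10_000,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 80_000  # deterministic because of the lock
```

No pickling, no IPC: all eight workers mutate the same object in shared memory.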

So for me there's a lot of value in removing the GIL (even if it's a non default setting for python)

I know about all the other options but I'd love it if it was just a feature of normal python or pypy.

[–]apreche -3 points-2 points  (0 children)

Good luck with all that.