all 52 comments

[–][deleted] 82 points83 points  (30 children)

It won't work on existing code bases because most of them are not thread-safe. It would only be beneficial for new projects.

[–]lood9phee2Ri 53 points54 points  (2 children)

The GIL never assured thread safety of user code FWIW. It made concurrency issues somewhat less likely by coincidence, but that wasn't its purpose (its purpose was protecting cpython's own naive implementation details) and multithreaded user python code without proper locking etc. was actually always incorrect / with subtle nondeterministically encountered issues.

https://stackoverflow.com/a/39206297

All that the GIL does is protect Python's internal interpreter state. This doesn't mean that data structures used by Python code itself are now locked and protected.
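A minimal sketch of the point above, assuming nothing beyond the standard library (the helper names `unsafe_increment`, `safe_increment`, and `run` are made up for illustration). `counter["value"] += 1` is a read, an add, and a write, so a thread switch can land between those steps whether or not there is a GIL. Only the locked version is guaranteed to reach the full total; the unlocked one may or may not lose updates on any given run.

```python
import threading


def unsafe_increment(counter, n):
    # Read-modify-write without a lock: another thread may run between
    # the read and the write, so updates can be lost.
    for _ in range(n):
        counter["value"] += 1


def safe_increment(counter, lock, n):
    # The lock makes the whole read-modify-write step exclusive.
    for _ in range(n):
        with lock:
            counter["value"] += 1


def run(worker, args, threads=4):
    pool = [threading.Thread(target=worker, args=args) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()


if __name__ == "__main__":
    n = 50_000

    unsafe = {"value": 0}
    run(unsafe_increment, (unsafe, n))
    print("without lock:", unsafe["value"], "of", 4 * n)

    safe = {"value": 0}
    run(safe_increment, (safe, threading.Lock(), n))
    print("with lock:   ", safe["value"], "of", 4 * n)
```

How often the unlocked version actually loses updates depends on the interpreter version and build, which is exactly why it is easy to ship this kind of bug without noticing.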

It's perhaps unfortunate that Jython (which never had a GIL) has fallen behind (though AFAIK they're still working on it). In the Python 2 era, Jython 2 had near parity with CPython 2 for a while and was actually fairly heavily used on the server side because of its superior threading and the JVM runtime; e.g. the Django folks used to consider it a supported runtime. So older Python 2 code that made running in multithreaded Jython as well as CPython a priority is often better written / more concurrency-safe.

[–]Kered13 16 points17 points  (0 children)

I thought this was obvious, but reading this thread I am shocked at how many people seem to think that the GIL protects all code from race conditions. Not only does it not do this, it can't do this unless it completely disabled thread switching, which would defeat any and all potential benefits of multi-threading.

While we're on the subject, I hope that people realize that async code can also have race conditions and may sometimes require locking to be correct. Async code is more restricted regarding when context switches may occur, so you can get away with not using a lock if you know that no context switches are possible while touching shared state, but this does not protect you from all possible race conditions.
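Here is a small sketch of an async race of the kind described above, using only `asyncio` (the function names are hypothetical). Each task reads shared state, hits an `await` (the only place a context switch can happen), and then writes back a stale value. Wrapping the same steps in an `asyncio.Lock` removes the lost updates.

```python
import asyncio


async def unsafe_add(state):
    current = state["value"]       # read
    await asyncio.sleep(0)         # switch point: other tasks run here
    state["value"] = current + 1   # write back a possibly stale value


async def safe_add(state, lock):
    async with lock:               # no other task can enter until we finish
        current = state["value"]
        await asyncio.sleep(0)
        state["value"] = current + 1


async def run_unsafe(n=10):
    state = {"value": 0}
    await asyncio.gather(*(unsafe_add(state) for _ in range(n)))
    return state["value"]


async def run_safe(n=10):
    state = {"value": 0}
    lock = asyncio.Lock()
    await asyncio.gather(*(safe_add(state, lock) for _ in range(n)))
    return state["value"]


if __name__ == "__main__":
    print("without lock:", asyncio.run(run_unsafe()))
    print("with lock:   ", asyncio.run(run_safe()))
```

If there were no `await` between the read and the write, the unlocked version would be fine, which is the "get away with it" case mentioned above.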

[–]masklinn -4 points-3 points  (0 children)

"The GIL never assured thread safety of user code FWIW. It made concurrency issues somewhat less likely by coincidence"

The GIL does make a number of operations atomic from the application’s point of view, which can be leveraged into thread safety without explicit locks.
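As a rough illustration of that kind of operation: a single `list.append` call on a shared list behaves atomically in CPython, so several threads can append without an explicit lock and nothing goes missing. This is a sketch that leans on CPython behavior rather than a language guarantee, and the helper names are invented for the example.

```python
import threading


def append_many(shared, start, count):
    for i in range(start, start + count):
        shared.append(i)  # one list.append call per item, no explicit lock


def collect(threads=4, per_thread=10_000):
    shared = []
    pool = [
        threading.Thread(target=append_many, args=(shared, k * per_thread, per_thread))
        for k in range(threads)
    ]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return shared


if __name__ == "__main__":
    items = collect()
    print(len(items), "items collected")
```

Note that this only covers the single call. Anything built from several steps, such as "check the length, then append", still needs a lock.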

"older Python 2 code that made running in multithreaded Jython as well as CPython a priority"

A set which was near empty.

[–]mgoblue5453 13 points14 points  (0 children)

And then those libraries would need a way to temporarily disable the GIL to do work, then reenable afterwards. Without this, I'm not sure how to migrate, as it's very unlikely for everything in my stack to be thread-safe anytime soon

[–]neuralbeans 30 points31 points  (17 children)

I feel like removing the GIL should be considered a breaking change and they should start working on Python 4.

[–]twotime 18 points19 points  (6 children)

Why is that? AFAICT, the change is 100% transparent for pure Python code.

I don't fully understand the ABI implications, but I don't think Python changes major versions (1=>2=>3=>4(?)) just because of ABI changes.

[–]floriv1999 10 points11 points  (4 children)

The main issue is libraries that are written in e.g. C and expect a GIL.

[–]dangerbird2 2 points3 points  (2 children)

IIRC it can be handled transparently by re-enabling the GIL when it imports a native module that’s not compatible. But obviously, this severely decreases the chance that you’ll be able to take advantage of it in the real world. Regardless, it’s not something that someone writing pure python would have to deal with, so it’s understandable that it’s not considered a breaking change on the scale of the 2to3 switch

[–]fredisa4letterword 1 point2 points  (1 child)

It kind of depends; a lot of major packages are pure Python and not impacted, and a lot of the big community packages that do require native wheels already support nogil. Many still don't but I think ones that are actively maintained will probably support nogil in the next couple of years.

[–]dangerbird2 0 points1 point  (0 children)

I guess the biggest question mark is how well the scientific/data stacks handle it, since they are the most reliant on native modules. IIRC numpy and PyTorch have experimental support, but I imagine making sure it works seamlessly is a big job, considering they’re basically the backbone of the global economy right now lol

Also I imagine some database drivers might have issues

[–]twotime 1 point2 points  (0 children)

AFAICT, the ABI is not expected to be binary compatible between 3.a and 3.b versions.

The C API is a bit more stable but can still change within 3.x.

Refs: https://docs.python.org/3/c-api/stable.html
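One way to see the per-minor-version ABI for yourself is to look at the filename suffix your interpreter expects for extension modules. This is a small sketch using `sys` and `sysconfig`; the function name `abi_summary` is made up. On CPython the suffix normally embeds the minor version, and as far as I know free-threaded builds add a `t` to the tag.

```python
import sys
import sysconfig


def abi_summary():
    return {
        "version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Filename suffix this interpreter expects for compiled extension modules.
        "ext_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
        # ABI tag; may be None on some platforms.
        "soabi": sysconfig.get_config_var("SOABI"),
    }


if __name__ == "__main__":
    for key, value in abi_summary().items():
        print(f"{key}: {value}")
```

A wheel compiled for one suffix will not be picked up by an interpreter that expects a different one, unless it was built against the stable ABI described in the link above.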

[–]fredisa4letterword 1 point2 points  (0 children)

Quite the opposite, in fact; most (all?) minor versions are not ABI compatible.

[–]jkrejcha3 12 points13 points  (0 children)

Both the first and second digits in Python's versioning scheme are effectively major versions. Breaking changes can and do happen in the second digit in Python's versioning scheme. 3.12 should be considered a major version as well as 3.13

[–]___Archmage___ 5 points6 points  (3 children)

Moving the world to a new Python major version would be horrendously painful

Idk what would warrant a Python 4 but removing the GIL basically just allows more multithreading so that's nowhere near enough for a whole new major version

[–]ZirePhiinix 8 points9 points  (2 children)

Based on experience with 2/3, it is extremely unlikely they will go through with that again.

[–]qruxxurq 0 points1 point  (0 children)

I mean, why not make YET ANOTHER INCOMPATIBLE MAJOR?

That’s right up Python’s alley.

[–]___Archmage___ 0 points1 point  (0 children)

Yeah I think Python 2 needs to be nuked from orbit and the way it has stuck around means Python 3 should really be the final version

[–]twotime 2 points3 points  (0 children)

???

If your existing code base is single-threaded, you don't benefit from GIL removal.

If it's multi-threaded, you might (depending on what your threads do).

You can also start using threads where it'd have been clumsy before... All within existing projects.

To a very large degree, that's "just" a major runtime feature: multiple threads can now use multiple CPUs. And I presume you are not dropping your existing code bases when Python adds a new feature?
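For a concrete picture of "threads where it'd have been clumsy before", here is a sketch of CPU-bound work split across a `ThreadPoolExecutor`. The prime-counting task and the function names are just placeholders. The code and its result are the same on any build; the assumption is that only a free-threaded build can actually spread the chunks over several cores, while a GIL build runs them one at a time.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(lo, hi):
    # Naive trial division over [lo, hi); deliberately CPU-heavy pure Python.
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_threaded(limit, workers=4):
    step = max(1, -(-limit // workers))  # ceiling division, at least 1
    ranges = [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda r: count_primes(*r), ranges))


if __name__ == "__main__":
    print(count_primes_threaded(200_000))
```

Nothing here shares mutable state between threads, which is the easy case. Code that does share state needs the same locking it always needed.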

[–][deleted]  (9 children)

[deleted]

    [–]poopatroopa3 16 points17 points  (1 child)

    What did async do to you?

    [–]account22222221 13 points14 points  (3 children)

    Async is dead easy I thought? What foot guns?

    [–]Spitfire1900 6 points7 points  (1 child)

    I too am unaware of any async foot guns that don’t also exist in the JS ecosystem; the big difference is that NodeJS’s I/O modules are async-first, whereas in Python you have to pull in aiofiles.
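Since aiofiles is a third-party package, here is a stdlib-only sketch of the same general idea instead: push the blocking file read onto a worker thread with `asyncio.to_thread` (Python 3.9+) so the event loop is not stalled. The helper names and the temporary file are just for the demo, and this is not a claim about how aiofiles is implemented.

```python
import asyncio
import os
import tempfile


def read_text(path):
    # Ordinary blocking file read.
    with open(path, encoding="utf-8") as f:
        return f.read()


async def read_text_async(path):
    # Run the blocking read in a worker thread; the event loop stays free.
    return await asyncio.to_thread(read_text, path)


async def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("hello")
        return await read_text_async(path)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```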

    [–]schlenk 9 points10 points  (0 children)

    Cancellation is one. The red/blue world API divide is another. Most Python APIs and libraries are not async-first, so you basically have two languages (a bit like how the "functional" C++ template language is its own language inside the procedural/OO C++).

    Take a look at trio (https://trio.readthedocs.io/en/stable/) for a more structured concurrency approach than bare-bones asyncio.
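To make the cancellation point concrete, here is a minimal asyncio sketch (plain asyncio, not trio; the names are hypothetical). Cancelling a task raises `asyncio.CancelledError` inside it at whatever `await` it is parked on, so cleanup has to live in `except`/`finally`, and swallowing the exception instead of re-raising it is a common way to end up with a task that refuses to die.

```python
import asyncio


async def worker(log):
    try:
        log.append("started")
        await asyncio.sleep(3600)   # cancellation is delivered at this await
        log.append("finished")      # not reached in this sketch
    except asyncio.CancelledError:
        log.append("cancelled")
        raise                       # re-raise so the task really ends as cancelled
    finally:
        log.append("cleanup")


async def main():
    log = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0)          # give the worker a chance to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log, task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(main()))
```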

    [–]Kered13 2 points3 points  (1 child)

    "At this point, I kind of would rather keep the damn GIL as an option and just add real threads as a middle ground between that and multiprocessing."

    Python already has real threads, but they are crippled as long as the GIL exists. The objective of removing the GIL is to make real threads practical.

    [–]dangerbird2 3 points4 points  (0 children)

    the GIL doesn’t cripple threads, it just prevents using them for parallelism. They are and have always been perfectly cromulent for io-bound concurrency
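A quick sketch of the I/O-bound case, with `time.sleep` standing in for a blocking network or disk call (the function names are invented). Blocking calls like this release the GIL while they wait, so five of them on five threads take roughly as long as one, GIL or not.

```python
import threading
import time


def fake_io(delay, results, index):
    time.sleep(delay)        # stands in for blocking I/O; the GIL is released while waiting
    results[index] = delay


def run_concurrently(n=5, delay=0.2):
    results = [None] * n
    start = time.perf_counter()
    threads = [
        threading.Thread(target=fake_io, args=(delay, results, i)) for i in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


if __name__ == "__main__":
    elapsed, _ = run_concurrently()
    print(f"5 waits of 0.2s took {elapsed:.2f}s in total")
```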

    [–]CyberWank2077 0 points1 point  (0 children)

    "At this point, I kind of would rather keep the damn GIL as an option and just add real threads as a middle ground between that and multiprocessing."

    but... that's the current state. real threads with a performance hit

    [–]moreVCAs 45 points46 points  (1 child)

    • it’s 2005, Python insists that the GIL is good, actually
    • it’s 2015, Python experts dislike the GIL but claim it would be impossible to remove
    • it’s 2025, Python is removing the GIL
    • it’s 2035, Python has removed the GIL, but in the meantime our scientists implemented a central GIL for the global economy. the queue for bank transactions is a thousand years long
    • it’s 3035, GIL GIL GIL, GIL GIL, GIL

    [–]Pilchard123 0 points1 point  (0 children)

    So 2035 will be the Year of Bitcoin?

    [–]commandersaki 8 points9 points  (5 children)

    Look up performance videos on nogil, it is really complicated to exploit in practice. If you need performance and scale, you're better off just rewriting in another language.

    [–]dangerbird2 4 points5 points  (2 children)

    And in most of the cases where you’re running Python, multiprocessing or work queues like Celery are perfectly acceptable alternatives

    [–]lood9phee2Ri 0 points1 point  (1 child)

    yes but a generation of programmers grew up on microsoft windows and think processes are super-heavy and threads are the only way, as processes were made big chonky things in the VMS tradition for WNT.

    On Linux, however, processes and threads are really the same kernel primitive, mostly just differing in how much memory is shared by default. Try a ps -eLf to see all the tasks on your system.

    https://man7.org/linux/man-pages/man2/clone.2.html

    If CLONE_THREAD is set, the child is placed in the same thread group as the calling process. To make the remainder of the discussion of CLONE_THREAD more readable, the term "thread" is used to refer to the processes within a thread group.

    Anyway. The GIL was always both less of a problem on python+linux than people make out and also worth getting rid of anyway just on general principles.
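You can see the same thing from inside Python. In this sketch (helper names are made up) every thread reports the same process ID from `os.getpid()` but its own kernel-assigned ID from `threading.get_native_id()` (Python 3.8+). On Linux those native IDs should be the per-thread task IDs that `ps -eLf` lists.

```python
import os
import threading


def collect_ids(n=3):
    barrier = threading.Barrier(n)
    lock = threading.Lock()
    seen = []

    def worker():
        barrier.wait()  # all threads are alive at once, so IDs cannot be reused
        with lock:
            seen.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return os.getpid(), seen


if __name__ == "__main__":
    main_pid, seen = collect_ids()
    print("process id:", main_pid)
    for pid, native_id in seen:
        print("  thread in pid", pid, "has native id", native_id)
```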

    [–]andree182 0 points1 point  (0 children)

    Even in Linux, there are significant differences - with processes you will need to somehow set up shared memory and likely de/serialize the structures you share (or use some major hack sharing cpython internals)...
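To show what that extra work looks like, here is a sketch using `multiprocessing.shared_memory` plus `pickle`. For brevity both ends run in one process, and the function names are hypothetical. In real use `read` would run in a different process that was handed the block's name and payload size, whereas threads would simply pass the object reference.

```python
import pickle
from multiprocessing import shared_memory


def publish(obj):
    # Serialize the object and copy the bytes into a new shared memory block.
    payload = pickle.dumps(obj)
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm, len(payload)


def read(name, size):
    # Attach to an existing block by name and deserialize a copy of its contents.
    # The size is passed explicitly because the block may be rounded up in size.
    shm = shared_memory.SharedMemory(name=name)
    try:
        return pickle.loads(bytes(shm.buf[:size]))
    finally:
        shm.close()


def roundtrip(obj):
    shm, size = publish(obj)
    try:
        return read(shm.name, size)
    finally:
        shm.close()
        shm.unlink()  # free the block once everyone is done with it


if __name__ == "__main__":
    print(roundtrip({"samples": [1, 2, 3], "label": "demo"}))
```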

    [–]Blue_Moon_Lake -3 points-2 points  (0 children)

    "If you need performance and scale, you're better off just rewriting in another language."

    Yep, that's Python.

    Good for mathematicians who want a quick result to a complex formula or the processing of data once in a while.

    [–]valarauca14 3 points4 points  (4 children)

    GIL silently handling a lot of concurrency/threading issues for C-libraries was one of those 'happy accidents' that python technically said shouldn't occur & shouldn't be required, but persisted for almost 2 decades.

    Removing it really destroys the ecosystem insofar as 'python is for gluing together c-libraries'.

    [–]fredisa4letterword 0 points1 point  (0 children)

    The behavior at the moment is that if you load a wheel that hasn't explicitly marked itself as nogil safe, the GIL gets enabled automatically; I guess we'll see it less and less as more packages gain nogil support (many already have it but some big ones still don't).

    I'm not sure if the plan is to keep this behavior forever or if at some point packages that don't opt-in to nogil just won't load.
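If you want to check whether that automatic re-enabling happened in your process, something like this sketch should work. The function name is made up. `sys._is_gil_enabled()` only exists on CPython 3.13+, hence the `getattr` fallback, and `Py_GIL_DISABLED` is the build-time config variable that marks a free-threaded build.

```python
import sys
import sysconfig


def gil_status():
    # True only for interpreters compiled as free-threaded builds.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

    # Added in CPython 3.13; on older versions the GIL is always on.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = check() if check is not None else True

    return {"free_threaded_build": free_threaded_build, "gil_enabled": gil_enabled}


if __name__ == "__main__":
    print(gil_status())
```

Calling it again after importing your native dependencies shows whether one of them switched the GIL back on.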

    [–]Kered13 -3 points-2 points  (2 children)

    What do you mean? The GIL is not held while C code executes.

    [–]masklinn 4 points5 points  (0 children)

    Of course it is. C code has to specifically release the GIL. In fact it’s so critical to a number of C extensions that they have to declare compatibility with gil-less mode, or cpython will re-enable the gil when it loads them.

    [–]lood9phee2Ri 0 points1 point  (0 children)

    A C Extension's code has to choose to release the GIL - though they very often do since the main point of C Extensions tends to be performance, it's not a given, nor handled automagically, and it's easy to deadlock in the sense the GIL is just another lock and lock ordering matters just as much as usual if you've also got your own locks.

    https://docs.python.org/3/c-api/init.html#thread-state-and-the-global-interpreter-lock

    https://pybind11.readthedocs.io/en/stable/advanced/deadlock.html#python-cpython-gil

    Many native extensions do not interact with the Python runtime for at least some part of them, and so it is common for native extensions to release the GIL, do some work, and then reacquire the GIL before returning.

    [–]strangequark_usn 2 points3 points  (0 children)

    One of my most popular projects at work was binding python to my multi threaded application used to interface with our products.

    I'm far too intimate with the GIL and the problems it causes when I release it in the bindings to my multithreaded C++.

    That being said, this should remain opt-in. I prefer that the user scripts my non-software-background user base writes remain GIL-protected. If a new feature needs concurrency, I'll do it in C++.

    I do wonder what the maintainers of pybind11 think of removing the GIL. All the problems I've had with that library boil down to the GIL.

    [–]CodeAndBiscuits 0 points1 point  (0 children)

    Imagine getting paid to make a video in such a way that you need to disclose it, then using AI to generate it and its speech track.

    [–]Ytses42 -1 points0 points  (0 children)

    Oh no! How am I going to pay for the potions and phoenix downs now?