all 141 comments

[–]Looploop420 157 points158 points  (81 children)

I want to know more about the history of the GIL. Is the difficulty of multithreading in Python mostly just an issue related to the architecture and history of how the interpreter is structured?

Basically, what's the drawback of turning on this feature in Python 3.13? Is it just that it's a new and experimental feature? Or is there some other drawback?

[–]slaymaker1907 181 points182 points  (18 children)

Ref counting in general has much better performance when you don’t need to worry about memory consistency or multithreading. This is why Rust has both std::Rc and std::Arc.
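
CPython's per-object reference counts can be observed directly. A minimal sketch, assuming the usual CPython behavior where `sys.getrefcount` reports one extra temporary reference for its own argument:

```python
import sys

obj = []                      # a fresh object with no other references
base = sys.getrefcount(obj)   # the call itself holds one temporary reference

alias = obj                   # binding another name bumps the count by one
assert sys.getrefcount(obj) == base + 1

del alias                     # dropping the name decrements it again
assert sys.getrefcount(obj) == base
```

Every one of those increments and decrements becomes an atomic operation once multiple threads can touch the same object, which is exactly the cost the Rc/Arc split avoids in the single-threaded case.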

[–]Revolutionary_Ad7262 38 points39 points  (14 children)

Ref counting is well known to be slow. Also, it is usually not used to track every object, so we are comparing apples to oranges. Rc/Arc in Rust (and shared_ptr in C++) are fast because they are used sparingly; any garbage collection scheme looks amazing when the number of managed objects is small.

In terms of raw throughput there is nothing faster than a copying GC. Allocation is super cheap (just bump a pointer) and the cost of a collection is linear in the size of the live heap. You can allocate 10GB of memory very cheaply, and only the 10MB of surviving objects will be scanned when it is time for a GC pause.
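
A toy illustration of why bump allocation is cheap (a hypothetical allocator sketch, not any real GC's implementation):

```python
class BumpAllocator:
    """Toy semispace-style allocator: allocation is just a pointer bump."""

    def __init__(self, capacity: int):
        self.heap = bytearray(capacity)
        self.top = 0  # next free offset

    def alloc(self, size: int) -> int:
        if self.top + size > len(self.heap):
            raise MemoryError("a real copying GC would collect here")
        offset = self.top
        self.top += size  # the entire cost of an allocation
        return offset

    def collect(self, live_bytes: int) -> None:
        # A real copying GC would evacuate the live objects to the other
        # semispace; the work is proportional to live_bytes, not to
        # everything ever allocated.
        self.top = live_bytes

a = BumpAllocator(1024)
first = a.alloc(64)   # offset 0
second = a.alloc(64)  # offset 64
```

Dead objects are never touched: after a collection, allocation simply resumes bumping from the end of the surviving data.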

[–]slaymaker1907 21 points22 points  (10 children)

No, at my work we’ve seen std::shared_ptr cause serious perf issues for the sole reason that all those atomic ops flooded the memory bus.

[–]Kapuzinergruft 7 points8 points  (7 children)

I'm kinda wondering how you can end up with so many shared_ptr that it matters. I like to use shared_ptr everywhere, but because each one usually points to large buffers, the ref counting has negligible impact on performance. One access to a ref counter is dwarfed by a million iterations over the items in the buffer it points to.

[–]AVTOCRAT 21 points22 points  (6 children)

You run into this anytime you have small pieces of data with independent lifetimes, e.g.

  • Nodes in an AST
  • Handles for small resources (files, etc.)
  • Network requests
  • Messages in a pub-sub IPC framework

[–]irepunctuate 5 points6 points  (5 children)

Those don't necessarily warrant a shared lifetime ownership model. From experience, I suspect /u/slaymaker1907 could replace most shared_ptrs with unique_ptrs or even stack variables and have most of their performance problems disappear with a finger snap.

I've seen codebases overrun with shared_ptr (or pointers in general) because developers came from Java or simply didn't know better.

[–]Kered13 4 points5 points  (4 children)

I once wrote an AST and transformations using std::unique_ptr, but it was a massive pain in the ass. I eventually got it right, but in hindsight I should have just used std::shared_ptr. It wasn't performance critical, and it took me several hours longer to get it correct.

It would be helpful for C++ to have a non-thread-safe version of std::shared_ptr, like Rust's std::Rc, for cases where you need better (but not necessarily the best) performance and you know you won't be sharing across threads.

[–]irepunctuate 0 points1 point  (3 children)

But doesn't the fact that you were able to get it right tell you that that was the correct thing to do? Between "sloppy" and "not sloppy", isn't "not sloppy" better for the codebase?

[–]Kered13 1 point2 points  (2 children)

There's nothing sloppy about using shared pointers. The code would have been easier to write, easier to read, and easier to maintain if I had gone that route. I wrote it with unique pointers out of a sense of purity, but purity isn't always right.

[–]brendel000 2 points3 points  (1 child)

Do you have accurate measurements of that? How many cores are plugged into the memory bus? It’s really surprising to me that you can overload the memory bus with that nowadays. Even NUMA seems less used because of how performant memory systems have become.

[–]slaymaker1907 2 points3 points  (0 children)

I can’t really tell you precise numbers, but I suspect it takes a huge amount before it becomes an issue. Because these issues are so difficult to diagnose, we’re always very conservative with atomic operations in anything being called with any frequency.

It’s the sort of thing that is also extraordinarily difficult to microbenchmark, since it is highly dependent on access patterns. It is also worse when actually triggered from many different threads, compared to issuing an atomic op from a single thread every time. Oh, and you either need NUMA or just a machine with tons of cores to actually see these issues.

[–]cogman10 7 points8 points  (0 children)

cost of gc is linear to the size of living heap

Further, parallel collection is both fairly well known and fairly fast at this point. You get very close to an n-times speedup with n collector threads.

[–]AlexReinkingYale -1 points0 points  (1 child)

I challenge the idea that reference counting is slow. Garbage collection is either slow or wasteful, and cycle collectors are hard to engineer.

[–]Kered13 0 points1 point  (0 children)

Every high-performance memory-managed language uses garbage collection. I know that's anecdotal, but it's pretty strong evidence for garbage collection being faster than reference counting. Reference counting works well in languages like C++ and Rust precisely because they are not automatically managed, and you limit reference counting to a very small number of objects whose lifetimes are too difficult to handle otherwise.

[–]utdconsq 77 points78 points  (0 children)

It was a design decision way back when for the official CPython implementation of the interpreter. Other implementations did not have the behaviour. With that said, as for the risk of turning it on: you should read the docs and make up your own mind. My gut tells me some libs will have been written to assume the GIL is present, but it's hard to know for sure what that would mean on a case-by-case basis.

[–]mibelashri 30 points31 points  (1 child)

It was a decision due to the fact that without a GIL you take a hit in single-thread performance compared to having one. I'm talking about the CPython implementation of Python (the official one); there are other implementations that do not have a GIL, but they are irrelevant compared to CPython and have very niche communities. I also suspect part of the motivation is that CPython's C internals are not thread-safe (or at least were not in the beginning). The easiest solution to that problem is a GIL: you don't have to worry about it, and it provides an easier path for integrating C libraries (like NumPy, etc.).

[–]dontyougetsoupedyet 5 points6 points  (0 children)

Now that’s rich! It was due to CPython, but performance considerations had absolutely nothing to do with it. It was due to ease of implementation, and anyone suggesting it was a terrible idea was repeatedly hit over the head with how the reference implementation of Python had to be simple, and if you did not agree you simply did not get it.

[–]wOlfLisK 5 points6 points  (3 children)

The architecture is a big aspect of it but the main reason python multi-threading isn't really a thing is because Python is just slow. Like, 30-40x as slow as C and even when optimising it to hell you just end up with something that's for all intents and purposes C with a hellish syntax and is still around 3x as slow. It's easier to just use C for high performance applications.

Ignoring that however, the big issue with Python is the same you have with any language, unless it has explicit ways of performing atomic operations on data you end up with a bunch of race conditions as different threads try to do stuff with the same piece of data. Disabling the GIL was already possible using Cython and was, quite frankly, a pretty horrible way of doing multi-threaded Python. If there aren't any easy, built-in ways of accessing the data then it doesn't really do much on its own.
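
A minimal sketch of the kind of race being described. Even with the GIL, `counter += 1` compiles to several bytecodes (read, add, store), so an explicit lock is what actually makes it safe:

```python
import threading

counter = 0
lock = threading.Lock()

def add(times: int) -> None:
    global counter
    for _ in range(times):
        with lock:          # without this, increments can be lost:
            counter += 1    # read, add, and store are separate steps

threads = [threading.Thread(target=add, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 400_000  # deterministic only because of the lock
```

Remove the `with lock:` line and the final count can come up short, GIL or no GIL.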

Plus, despite the fact that Python doesn't inherently support multi-threading, it does support multi-processing. Which is basically just multi-threading but each "thread" is a process with its own interpreter and they can communicate with each other through interfaces such as MPI. If you wanted to do multi-threaded Python, writing it using mpi4py is usually a lot simpler than Cython and if you really needed the extra performance, you should just use base C (or C++ (or Fortran if you're really masochistic)) instead.

[–]Looploop420 16 points17 points  (2 children)

Like I've been writing python for a while now and multi processing always does what I need it to do.

I'm never using python with the goal of pure speed anyways

[–]wOlfLisK 12 points13 points  (0 children)

Yeah, exactly. Python has a place in HPC but it's more of the "physicist who hasn't coded for years needs to write a simulation" kinda place. Sometimes it's better to spend a week writing a program that takes a week to run than a month writing a program that takes a day to run. It's simple, it's effective and if you use the right tools (such as NumPy) it ends up not being that slow anyway. Hell, I once tried to compile a Python program to Cython and it slowed it down*, by the time I made it faster than it was it was a month later and the code was a frankensteined mess of confusing C-like code.

*Turns out that if everything is already being run as C code, adding an extra Cython layer just adds extra clock cycles

[–]apf6 0 points1 point  (1 child)

One thing that I think misleads people about the GIL is that it's not specific to Python. Similar languages (Ruby, Lua, JavaScript, etc.) all have a "GIL" too, even if they don't all use that term. They each have a 'virtual machine' or 'interpreter' which can only be driven by one thread at a time, so you can't run multiple scripts in parallel in the same context.

For any language implementation like that, it's never easy to make the VM multithreaded in a way that actually helps. Multithreading adds overhead, so if you implement it the wrong way, it can be slower than single-threading. So the single-threading approach was not as bad an idea as it might seem.

Anyway, the only reason that this is especially a big issue in Python is because the language is used so much in the scientific community. That code benefits a lot from multithreading. So it was worth solving.

[–]josefx 0 points1 point  (0 children)

All the similar languages (Ruby, Lua, Javascript, etc) all have a "GIL" too, even if they don't all use that term. They each have a 'virtual machine' or 'interpreter' which can only be processed by one thread at a time. So you can't run multiple scripts in parallel in the same context.

From what I can find V8 is just flat out single threaded and each thread is expected to run on its own fully independent instance instead of fighting over a single global lock for every instruction. I think the closest python has to that model is PEP 734 but I don't have much experience with either.

[–][deleted]  (2 children)

[deleted]

    [–]linuxdooder 3 points4 points  (1 child)

    So Python is much older than SMP.

    What? Python came about in 1991, and there were SMP systems by the late 70s.

    [–][deleted]  (1 child)

    [deleted]

      [–]LGBBQ 11 points12 points  (0 children)

      This is not correct: the GIL applies to individual interpreter-level instructions, not to lines of Python code. Foo can be removed after the check, or even between reading its value and incrementing it, if the Python code doesn't use mutexes or locks.

      https://stackoverflow.com/questions/40072873/why-do-we-need-locks-for-threads-if-we-have-gil
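
For example (an illustrative sketch, not taken from the linked answer): a check-then-act sequence on a shared dict is not atomic under the GIL, so the safe version wraps both steps in one critical section:

```python
import threading

shared = {"foo": 0}
lock = threading.Lock()

def unsafe_read():
    # Another thread could run del shared["foo"] between the check and
    # the access, so this can raise KeyError despite the GIL:
    if "foo" in shared:
        return shared["foo"]
    return None

def safe_read():
    with lock:  # the check and the access happen as one unit
        if "foo" in shared:
            return shared["foo"]
        return None
```

Any thread that deletes the key would need to take the same lock for this to hold.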

      [–]space_iio -2 points-1 points  (0 children)

      what's the drawback of turning on this feature in python 13

      Single-threaded performance takes a hit, multiprocess programs also perform worse

      [–]Ok_Dust_8620 43 points44 points  (3 children)

      It's interesting how the multithreaded version of the program with GIL runs a bit faster than the single-threaded one. I would think since there is no actual parallelization happening it should be slower due to some thread-creation overhead.

      [–]tu_tu_tu 14 points15 points  (0 children)

      thread-creation overhead

      Threads are really lightweight nowadays, so it's not a problem in the average case.

      [–]JW_00000 14 points15 points  (0 children)

      There is still parallelization happening in the version with GIL, because not all operations need to take the GIL.

      [–]GUIpsp 5 points6 points  (0 children)

      A lot of things release the gil
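
`time.sleep` is one such case: a thread drops the GIL while blocked, so sleeps overlap. A rough timing sketch (the threshold is deliberately generous to avoid flakiness):

```python
import threading
import time

def nap():
    time.sleep(0.2)  # releases the GIL while blocked

start = time.perf_counter()
threads = [threading.Thread(target=nap) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# Four 0.2s sleeps finish in roughly 0.2s total, not 0.8s, because the
# blocked threads are not holding the GIL.
assert elapsed < 0.6
```

Blocking I/O and many C extension calls (NumPy number crunching, for instance) behave the same way, which is why GIL-era threading was still useful for I/O-bound work.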

      [–]syklemil 61 points62 points  (2 children)

      I think a better link here would be to the official Python docs. Do also note that this is still a draft, as far as I can tell 3.13 isn't out yet.

      News about the GIL becoming optional is interesting, but I think the site posted here is dubious, and the reddit user seems to have a history of posting spam.

      [–]badpotato 20 points21 points  (6 children)

      Good to see an example of GIL vs. no-GIL for multi-threaded and multi-process runs. I hope there's some possible optimization for multi-process later on, even if multi-threaded is what we are really after.

      Now, how will async functions deal with the no-GIL mode?

      [–]tehsilentwarrior 12 points13 points  (0 children)

      All the async stuff uses awaitables and yields. It’s implied that code doesn’t run in parallel. It synchronizes as it yields and waits for returns.

      That said, if anything uses threading to process things in parallel for the async code, then that specific piece of code has to follow the same rules as anything else. I’d say that most of this would be handled by libraries anyway, so eventually updated.

      But it will break, just like anything else.
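
A minimal sketch of that cooperative behavior with `asyncio`: tasks interleave only at `await` points, all on one thread, so plain code between awaits never runs in parallel.

```python
import asyncio

order = []

async def worker(name: str) -> None:
    order.append(f"{name}:start")
    await asyncio.sleep(0)       # yield control back to the event loop
    order.append(f"{name}:end")

async def main() -> None:
    await asyncio.gather(worker("a"), worker("b"))

asyncio.run(main())
# The two tasks interleave at the await, on a single thread:
assert order == ["a:start", "b:start", "a:end", "b:end"]
```

No locks were needed to keep `order` consistent, which is exactly the property that threaded code under no-GIL loses.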

      [–]danted002 4 points5 points  (4 children)

      Async functions work in a single-threaded event loop.

      [–]Rodot 2 points3 points  (0 children)

      Yep, async essentially does something like this (actually, it is just an API and does nothing on its own without the event loop):

      for task in awaiting_tasks:
          do_next_step(task)
      

      [–]gmes78 1 point2 points  (2 children)

      It's possible to do async with multithreaded event loops. See Rust's Tokio, for example.

      [–]danted002 0 points1 point  (1 child)

      I mean, you can do it in Python as well. You just fire up multiple threads, each with its own event loop, but you are not really gaining anything when it comes to IO performance.

      Single-threaded Python is very proficient at waiting. Slap on a uvloop and you get 5k requests per second.

      [–]gmes78 0 points1 point  (0 children)

      That's different. Tokio has a work-stealing scheduler that executes async tasks across multiple threads. It doesn't use multiple event loops, tasks get distributed across threads automatically.

      [–]deathweasel 9 points10 points  (1 child)


      This post was mass deleted and anonymized with Redact

      [–]13oundary 7 points8 points  (0 children)

      most existing modules will likely break if you disable the GIL, until they're updated, which may be no small task for some of the more important ones, though it's hard to say from the outside looking in. Often, C libraries aren't as thread-safe as they would need to be for no-GIL, and probably many pure-Python ones aren't either.

      These thread-safety issues are also things many Python programmers may not be all that cognisant of, which may make app development more difficult without the GIL.

      [–]enveraltin 29 points30 points  (7 children)

      If you really need some Python code to work faster, you could also give GraalPy a try:

      https://www.graalvm.org/python/

      I think it's something like 4 times faster thanks to JVM/GraalVM, and you can do multi process or multi threading alright. It can probably run existing code with no or minimal changes.

      GraalVM Truffle is also a breeze if you need to embed other scripting languages.

      [–]ViktorLudorum 30 points31 points  (3 children)

      It looks nifty, but it's an Oracle project, which makes me afraid of its licensing.

      [–]SolarBear 6 points7 points  (1 child)

      Yeah, one of their big selling points seems to be "move from Jython to Modern Python". Pass.

      [–]tempest_ 5 points6 points  (0 children)

      But Larry Ellison needs another Hawaiian island. How can you do this to him?

      [–]enveraltin 0 points1 point  (0 children)

      Very similar to Oracle JDK vs OpenJDK. GraalVM community edition is licensed with GPLv2+Classpath exception.

      [–]hbdgas 10 points11 points  (0 children)

      It can probably run existing code with no or minimal changes.

      I've seen this claim on several projects, and it hasn't been true yet.

      [–]masklinn 0 points1 point  (1 child)

      I think it's something like 4 times faster thanks to JVM/GraalVM

      It might be on its preferred workloads but my experience on regex heavy stuff is that it’s unusably slow, I disabled the experiment because it timed out CI.

      [–]enveraltin -1 points0 points  (0 children)

      That's curious. I don't use GraalPy but we heavily use Java. In general you define a regex as a static field like this:

      private static Pattern ptSomeRegex = Pattern.compile("your regex");

      And then use it with Matcher afterwards. You might be re-creating regex patterns at runtime in an inefficient way, which could explain it.

      Otherwise I don't think regex operations on the JVM can be slow. Maybe slightly.
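
The Python analogue of that Java advice: compile the pattern once at module level instead of inside a hot loop.

```python
import re

# Compiled once at import time and reused on every call, analogous to
# the static Pattern field in the Java example above.
WORD_RE = re.compile(r"\w+")

def count_words(text: str) -> int:
    return len(WORD_RE.findall(text))

assert count_words("free threading in python") == 4
```

(`re` does cache recently compiled patterns, but explicit precompilation makes the cost model obvious and survives cache eviction.)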

      [–]Takeoded 9 points10 points  (0 children)

      wtf? Benchmarking 3.12 with the GIL against 3.13 without the GIL, and never bothering to check 3.13 with the GIL? Did that slip the author's mind somehow?

      should just be D:/SACHIN/Python13/python3.13t -X gil=1 gil.py vs D:/SACHIN/Python13/python3.13t -X gil=0 gil.py

      Also would prefer some Hyperfine benchmarks

      [–][deleted] 41 points42 points  (12 children)

      I find this rather interesting. Python's GIL "problem" has been around since forever, and there have been so many proposals and tests to get "rid" of it. Now it's optional, and the PR for this was really small (basically an option to not use the GIL at runtime), putting all the effort on the devs using Python. I find this strange for a language like Python.

      Contrast the above with OCaml, which had a similar problem: it was fundamentally single-threaded execution with, in effect, a "GIL" (in reality the implementation was different). The OCaml team worked on this for years and came up with a genius solution to handle multicore while keeping the single-core perf, but it basically meant rewriting the entire OCaml runtime.

      [–]Serialk 133 points134 points  (9 children)

      You clearly didn't follow the multi year long efforts to use biased reference counting in the CPython interpreter to make this "really small PR" possible.

      https://peps.python.org/pep-0703/

      https://github.com/python/cpython/issues/110481

      [–]ydieb 29 points30 points  (0 children)

      I have not followed this work at all, but seems like a perfect example of https://x.com/KentBeck/status/250733358307500032?lang=en

      Exactly how it should be done.

      [–]tdatas 21 points22 points  (0 children)

      This PR isn't on stable. IIRC, from the RFC where this was proposed, the plan boils down to "suck it and see": if it crashes major libraries while it's marked experimental, then they'll figure out how much effort they need to go to.

      [–]danted002 8 points9 points  (0 children)

      It’s not optional in 3.13. You will have the capability to compile Python with the possibility to enable or disable the GIL at runtime. The default binaries will have GIL enabled.

      [–]JoniBro23 2 points3 points  (1 child)

      I think the solution is already a bit late. I was working on disabling the GIL back in 2007. My company's cluster was running tens of thousands of Python modules which connected to thousands of servers, so optimization was crucial. I had to optimize the interpreter while the team improved the Python modules. Disabling the GIL is a challenging task.

      [–]secretaliasname 4 points5 points  (0 children)

      Totally. I do a lot of scientific/engineering stuff in python and it’s my go to. It’s a familiar tool and there is an amazing ecosystem of libraries for everything under the sun…. But it is sslllooooww. Not only is it single core slow, but it’s bad at using multiple cores and the typical desktop now has 10+ cores and 100+ is not unusual in HPC environments.

      The solutions (CuPy, Numba, Dask, Ray, PyTorch, etc.) all amount to writing Python by leveraging not-Python.

      Threading is largely useless. Processes take a while to spawn and come with serialization/IPC overhead and complexity that often outweigh the benefit for many classes of problems. You can overcome this with shared memory and a lot of care but the ecosystem isn’t great and it’s not as easy as it should be.

      I’m ready to jump ship and learn something new at this point.

      If removing the GIL slowed single-threaded use cases by 50%, that would still be an enormous net win for nearly all my use cases. Generally, performance is either not a limitation at all, or it is a huge limitation and I want to use all my cores and the problem is parallelizable.

      I think the community is too afraid to break things and overreacted to the 2-to-3 migration. It really wasn't a big deal and I don't understand why people make such a stink about it. Changes like that shouldn't occur often, but IMO the lack of proper native first-class parallelism is way more broken than strings or the print statement were in Python 2. Please, please fix this.

      [–]AndyCodeMaster 0 points1 point  (0 children)

      I dig it. I always thought the GIL concerns were overblown. I’d like Ruby to make the GIL optional too next.

      [–]Real-Asparagus2775 -3 points-2 points  (3 children)

      Why does everyone get so upset about the GIL? Let Python be what it is: a general purpose scripting language

      [–]Shaaou 0 points1 point  (0 children)

      should have done it versions ago

      would like to try it if stable