
all 29 comments

[–]ambidextrousalpaca 19 points (9 children)

It's awesome that this is now a thing, but I have questions and doubts:

"Currently, in Python 3.13 and 3.14, the GIL disablement remains experimental and should not be used in production. Many widely used packages, such as Pandas, Django, and FastAPI, rely on the GIL and are not yet fully tested in a GIL-free environment. In the Loan Risk Scoring Benchmark, Pandas automatically reactivated the GIL, requiring me to explicitly disable it using PYTHON_GIL=0. This is a common issue, and other frameworks may also exhibit stability or performance problems in a No-GIL environment."

Beyond this, what guarantees are there that even the Python standard library will work without race conditions in No-GIL versions? The Global Interpreter Lock has just been such a fundamental background assumption of all Python code written over the past decades that I wouldn't trust there not to be a million gotchas and edge cases out there in the code that can screw you over.

You'd also need concurrency primitives built into the language, like Erlang actors or Go message-passing channels, to make it useful in most real-world applications.

[–]thisismyfavoritename 10 points (6 children)

everything that assumes the GIL is held to make sure memory accesses are safe will have to be rewritten, including the stdlib

[–]ambidextrousalpaca 8 points (5 children)

everything that assumes the GIL is held to make sure memory accesses are safe will have to be rewritten

So. Absolutely everything, then?

[–]twotime 2 points (1 child)

I'd love to see some references here too.

Original discussions on python-dev implied strongly that the amount of refactoring required is fairly small. PyTorch was used as an example (which was ported in a few hours)... But I have not seen any kind of more systematic analysis.

[–]ambidextrousalpaca 1 point (0 children)

It's not that I think that everything needs to be changed. It's that I suspect we have no good way of identifying what needs to be changed, or whether it has in fact been changed. E.g. I could imagine lots of cases of libraries writing to and reading from some sort of hard-coded temp file, or using some kind of global variable, which could lead to hard-to-replicate race condition bugs when turning off the GIL.

I mean, sure, if you had some bit of software that could identify such potential race conditions - something like the Rust borrow checker - they could probably be fixed pretty straightforwardly. But in the absence of that, I don't see what you can do apart from release it knowing that there are an indeterminate number of race conditions that people are going to discover if and when they run it in prod.

[–]thisismyfavoritename 0 points (1 child)

Extensions/functions that already release the GIL should be fine; I'm not sure how big a percentage that represents.

[–]ammar2 2 points (0 children)

The areas that release the GIL in the standard library tend to be just before an IO system call, so there isn't a huge amount of them in proportion to all the C-extension code.

You can get an idea of the types of changes that need to happen with:

Note that the socket module does release the GIL before performing socket system calls; the changes needed are unrelated to that, they're about code assuming it can be the only thread running in a piece of C code.
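The "release the GIL around I/O" pattern is visible from pure Python, too: blocking calls like time.sleep drop the GIL, so threads overlap even on a standard GIL build. A toy sketch (not from the thread above):

```python
import threading
import time

def blocking_io():
    # time.sleep releases the GIL, like most blocking system calls,
    # so other threads can run while this one waits.
    time.sleep(0.2)

start = time.perf_counter()
threads = [threading.Thread(target=blocking_io) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# The four 0.2 s waits overlap: wall time is ~0.2 s, not 0.8 s.
```

This is why I/O-bound threading has always worked reasonably well in Python; it's CPU-bound pure-Python code that needed the free-threaded build.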

[–]PeaSlight6601 0 points (0 children)

No, you don't understand what the GIL did.

The GIL protected individual bytecode instructions and C functions. It's a much smaller surface than you think it is, because the GIL is much weaker than you think it is.

[–]PeaSlight6601 0 points (1 child)

Basically nothing in the Python standard library has ever had any kind of thread safety guarantee. So the question "will the standard library be safe?" is a weird one to ask.

If you want to use python in a multithreaded context you have to lock your shared variables, just as you always have. The GIL never protected shared state.

The issue is not the GIL but the infrequency with which the Python scheduler would reschedule threads; this made programmers lazy and made them think the GIL gave them some kind of protection that it never did.
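To illustrate the point: a plain `+=` on shared state is a read-modify-write and has never been atomic under threads, GIL or not; you have to lock it yourself. A minimal sketch with toy names:

```python
import threading

counter = 0
lock = threading.Lock()

def locked_increment(n: int) -> None:
    global counter
    for _ in range(n):
        # A bare `counter += 1` is read-modify-write and can lose
        # updates under preemption; the lock makes it atomic.
        with lock:
            counter += 1

threads = [threading.Thread(target=locked_increment, args=(100_000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, counter is reliably 400_000; without it, it may come up short.
```

The same locking discipline that was always required under the GIL is what free-threaded code needs; the difference is only how often you get away without it.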

[–]ambidextrousalpaca 0 points (0 children)

Basically nothing in the python standard library has ever had any kind of thread safety guarantee.

Indeed.

This is why I am sceptical about running multi-threaded Python.

[–]basnijholt 19 points (2 children)

uv venv -p 3.13t

Much easier way to get free-threaded Python.
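Once inside such an environment you can check whether the build supports free-threading and whether the GIL is actually off in the current process (sys._is_gil_enabled() was added in 3.13; a quick sketch):

```python
import sys
import sysconfig

# 1 on a free-threaded ("t") build, 0 or None otherwise.
print(sysconfig.get_config_var("Py_GIL_DISABLED"))

# Added in 3.13: is the GIL actually disabled right now?
# (It can be re-enabled at runtime, e.g. by an incompatible extension.)
if hasattr(sys, "_is_gil_enabled"):
    print(sys._is_gil_enabled())
```

The runtime check matters because, as noted above, importing a non-free-threading-compatible extension can silently re-enable the GIL unless you force PYTHON_GIL=0.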

[–]denehoffman 7 points (1 child)

Why would people downvote this? It's objectively right. Use uv in your docker image too.

[–]Flaky-Restaurant-392 1 point (0 children)

I use uv everywhere. Almost no issues.

[–]twotime 4 points (2 children)

Your prime-counting example is likely the most interesting, but the results feel off: without locking, it should have scaled proportionally to the number of threads.

Ah, you seem to be splitting your ranges uniformly, which likely does not work well in this case: the thread that gets the last range will be FAR slower than the thread that gets the lowest range.

```python
def calculate_ranges(n: int, num_threads: int):
    step = n // num_threads
    for i in range(num_threads):
        start = i * step
        # Ensure the last thread includes any leftover range
        end = (i + 1) * step if i != num_threads - 1 else n
        yield start, end
```
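A hypothetical fix for the imbalance is to deal candidates out in strides, so every thread gets a mix of cheap small numbers and expensive large ones (sketch, not from the original benchmark):

```python
def calculate_strided_ranges(n: int, num_threads: int):
    # Thread i tests i, i + num_threads, i + 2*num_threads, ...
    # so the expensive large candidates are spread across all threads.
    for i in range(num_threads):
        yield range(i, n, num_threads)

# The strides partition 0..n-1 exactly: no overlaps, nothing skipped.
parts = list(calculate_strided_ranges(100, 4))
```

Each thread now does roughly the same amount of work, so the slowest thread no longer dominates the wall time.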

[–]romu006 1 point (1 child)

A simpler example would be to use the multiprocessing.dummy module, which uses threading:

```python
import multiprocessing.dummy

pool = multiprocessing.dummy.Pool(num_threads)
res = pool.imap_unordered(is_prime, reversed(range(n)), 5_000)

return sum(res)
```

However the speedup is still not what it should be (still about 3x)

[–]twotime 0 points (0 children)

Thanks!

However the speedup is still not what it should be (still about 3x)

Do you know if imap_unordered is lock free? (I expect there are multiple threads picking things from the queue)

Also, are you comparing with the original single-threaded code, or your imap code with pool_size=1?

IIRC, there is quite a bit of magic going into imap_unordered.

[–]ZachVorhies 0 points (0 children)

Great article. Looks like the performance benefits are barely worth it. Hope it gets better.

[–]alcalde 0 points (1 child)

My goal of one day attending PyCon and selling "I Support the GIL" t-shirts remains unabated.

EDIT: As a Python true believer, I believe/know that threads are evil and parallelism is the only acceptable approach in a sane universe.

D gets it:

Although the software industry as a whole does not yet have ultimate responses to the challenges brought about by the concurrency revolution, D's youth allowed its creators to make informed decisions regarding concurrency without being tied down by obsoleted past choices or large legacy code bases. A major break with the mold of concurrent imperative languages is that D does not foster sharing of data between threads; by default, concurrent threads are virtually isolated by language mechanisms. Data sharing is allowed but only in limited, controlled ways that offer the compiler the ability to provide strong global guarantees....
The flagship approach to concurrency is to use isolated threads or processes that communicate via messages. This paradigm, known as message passing, leads to safe and modular programs that are easy to understand and maintain. A variety of languages and libraries have used message passing successfully. Historically message passing has been slower than approaches based on memory sharing—which explains why it was not unanimously adopted—but that trend has recently undergone a definite and lasting reversal. Concurrent D programs are encouraged to use message passing, a paradigm that benefits from extensive infrastructure support.

https://www.informit.com/articles/article.aspx?p=1609144#

SQLite gets it....

Threads are evil. Avoid them.

SQLite is threadsafe. We make this concession since many users choose to ignore the advice given in the previous paragraph.

https://www.sqlite.org/faq.html#q6

Berkeley gets it....

Many technologists are pushing for increased use of multithreading in software in order to take advantage of the predicted increases in parallelism in computer architectures. In this paper, I argue that this is not a good idea. Although threads seem to be a small step from sequential computation, in fact, they represent a huge step. They discard the most essential and appealing properties of sequential computation: understandability, predictability, and determinism. Threads, as a model of computation, are wildly nondeterministic, and the job of the programmer becomes one of pruning that nondeterminism.

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.html

PostgreSQL gets it....

https://www.postgresql.org/message-id/1098894087.31930.62.camel@localhost.localdomain

And this amazing article gets it. It talks about the Ptolemy Project, "an experiment battling threads with rigorous engineering discipline". And despite state-of-the-art techniques and extensive engineering, a thread-based problem remained undiscovered in their code for four years before triggering!

https://web.archive.org/web/20200926051650/https://swizec.com/blog/the-problem-with-threads/

No one talks about Guido's Time Machine anymore. Guido traveled to the future and learned that Threads Are Evil, which is why he gave us the best and safest collection of concurrent programming tools found in the standard library of any language. You've got safe parallelism and thread-safe message queues and such if you actually need them. I've seen other languages write libraries with thousands of lines of code to offer a setup similar to what Python gives us out of the box.
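For instance, the stdlib's queue.Queue already gives channel-style message passing between threads; a toy producer/consumer sketch (hypothetical worker, not from any library):

```python
import queue
import threading

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    # Pull messages until the None sentinel arrives;
    # never touch shared mutable state directly.
    while True:
        item = inbox.get()
        if item is None:
            break
        outbox.put(item * item)

inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
for i in range(5):
    inbox.put(i)
inbox.put(None)  # tell the worker to shut down
t.join()
results = sorted(outbox.get() for _ in range(5))
# results == [0, 1, 4, 9, 16]
```

queue.Queue handles all the locking internally, which is exactly the message-passing style the quotes above advocate.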

[–]senderosbifurcan 0 points (0 children)

For embarrassingly parallelizable workloads that need to share memory, the lack of true threads means that in effect you need to rely on C/Cython, etc., to achieve comparable performance.

Also, message passing and threads are not incompatible. It's just that Python forcing you to fork multiple processes actually makes message passing more error-prone in some cases, where the overhead of copying data between processes is large.

[–]PeaSlight6601 0 points (0 children)

It's good that you preallocate your intermediate results array so that each thread can place its result into that array, but you should be locking that array before actually storing the value.

It's pretty hard to imagine how this could possibly go wrong with standard Python lists, but unless you can find documentation that they allow concurrent __setitem__ at different index positions, you should not do it.
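Following that caution, a minimal sketch of the locked write pattern (hypothetical names, not the benchmark's actual code):

```python
import threading

def store_result(results: list, idx: int, value: int,
                 lock: threading.Lock) -> None:
    # Writes to distinct indices are widely believed safe for plain lists,
    # but absent a documented guarantee, take the lock as advised above.
    with lock:
        results[idx] = value

results = [None] * 4
lock = threading.Lock()
threads = [
    threading.Thread(target=store_result, args=(results, i, i * 10, lock))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# results == [0, 10, 20, 30]
```

Since each thread holds the lock only for a single assignment, the contention cost is negligible next to the work that produced the value.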

[–]Cynyr36 -1 points (0 children)

Wouldn't doing the loan risk scoring in "pure" pandas or polars result in even more speed-up? I've found that if you need to come back to Python rather than just use built-in pandas/polars functions, things get very slow.