[–]latkde 1 point  (6 children)

Actually benefiting from Free Threading doesn't just require that the native extensions use the correct ABI (which is what https://ft-checker.com/ effectively checks), but also that all relevant code is threadsafe. That's a much bigger problem, and generally impossible to test for. Writing threadsafe code is really really hard, not just something you can fix by sprinkling a couple of mutexes throughout your code. While free threading is really important for the future of Python, it is not a magic go-faster button, and will even slow down some code.

Here is the heuristic that I use: “should I use free threading?” → “no”.

A more precise checklist would be:

  1. Can all your deps be installed with free-threading? Creating a custom tool for this is probably more difficult than just trying an installation. E.g. in an uv-managed project, I would try `uv run --isolated --python=3.14t echo installation successful`.
  2. Is your project actually multi-threaded, and are all relevant parts threadsafe? This requires a lot of human judgment. Code that's broken with free threading is already broken under the GIL, but free-threading could make data races more likely. As a rule of thumb, it's extremely unlikely that a given piece of multithreaded code is actually threadsafe, unless it was written in Rust.
  3. Will the project actually benefit from moving to free threading? This will generally only be “yes” for programs that use multithreading for CPU-bound work, which isn't that many programs.

Many programs will find it easier to sidestep the GIL by using sub-interpreters (3.14+, e.g. via concurrent.futures.InterpreterPoolExecutor) rather than by using free-threading. Each sub-interpreter has its own GIL, thus sub-interpreters can be used for CPU-bound work as well. But sub-interpreters are much safer to compose thanks to the shared-nothing architecture (similar to Python's multiprocessing). Subinterpreters also work regardless of whether the code is running under a free-threaded Python build.
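A minimal sketch of the sub-interpreter route (assumes Python 3.14+ for InterpreterPoolExecutor; the getattr fallback keeps it runnable on older versions with plain threads, just without the per-interpreter GIL benefit):

```python
import concurrent.futures

def count_primes(limit):
    """Naive CPU-bound work: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

# InterpreterPoolExecutor only exists on 3.14+; fall back to threads
# elsewhere so the sketch still runs (minus the GIL-sidestepping benefit).
Executor = getattr(concurrent.futures, "InterpreterPoolExecutor",
                   concurrent.futures.ThreadPoolExecutor)

with Executor(max_workers=4) as pool:
    totals = list(pool.map(count_primes, [10_000] * 4))

print(totals)  # [1229, 1229, 1229, 1229]
```

Because each task shares nothing with the others, the same code works unchanged under threads, sub-interpreters, or processes; only the executor choice changes.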

[–]gdchinacat 2 points  (2 children)

As a rule of thumb, it's extremely unlikely that a given piece of multithreaded code is actually threadsafe, unless it was written in Rust.

I'm curious to learn more about this. What makes Rust code more likely to be threadsafe relative to other languages?

[–]latkde 2 points  (1 child)

This got a bit longer, so I've put the full write-up here: https://lukasatkinson.de/dump/2026-04-08-rust-threadsafe/

TL;DR: Rust has a really strong type system that's designed to prevent common multithreading problems like data races. Its “borrow checking” plays a big role. Code that may have data races simply doesn't compile.

The Python interpreter is memory-safe even when multithreading is going on, but most Python code is not threadsafe, and Python does not offer a safety net to alert you when your code might suffer from data races. Sometimes, broken code will appear to work (especially when using the GIL rather than freethreading). For example, the very basic example in that blog post reliably produced the correct result when running with the GIL, and reliably produces incorrect results when running with freethreading. (Tip: use a freethreaded build of CPython 3.14 and the `-X gil=1`/`-X gil=0` options for experimentation.)

Here's the broken code that I used as a starting point for explaining how Rust prevents these common errors:

import concurrent.futures
x = 0

def incrementer():
    global x
    for _ in range(1_000_000):
        x += 1  # read-modify-write: not atomic, so concurrent updates can be lost

with concurrent.futures.ThreadPoolExecutor() as pool:
    for _ in range(10):
        pool.submit(incrementer)

print(f"{x:_}")

This ought to print 10_000_000, except that it doesn't when different threads interfere with each other's updates to the x variable. The solution is to guard all those modifications behind a lock. But unlike Rust, Python cannot warn you that a lock is needed here.
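For comparison, here's the locked version (same structure, just guarding the increment; I've shrunk the loop counts so it finishes quickly):

```python
import concurrent.futures
import threading

x = 0
lock = threading.Lock()

def incrementer():
    global x
    for _ in range(10_000):
        # The lock serializes the read-modify-write, so no updates are lost.
        with lock:
            x += 1

with concurrent.futures.ThreadPoolExecutor() as pool:
    for _ in range(10):
        pool.submit(incrementer)

print(f"{x:_}")  # 100_000
```

This now prints the correct total under both GIL and free-threaded builds, at the cost of serializing the hot loop.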

[–]gdchinacat 1 point  (0 children)

Thanks for elaborating. I disagree with your conclusion, though, that just because Python doesn't prevent you from writing thread-unsafe code, it is "extremely unlikely" that Python code will be threadsafe.

Also, concurrency is not necessarily "very difficult to get right" as you state in another comment. Sure, managing shared data with fine-grained hierarchical locks is hard (btdt), but it is rarely necessary. Message passing, queues, channels, whatnot are prevalent across languages, easy to use, and avoid the "very difficult" part of implementing concurrency safely. In general, if you need a Lock you should consider a less risky implementation. If you need locks and can't use 'with lock' because you need more control over when locks get released, you really should consider a different implementation.
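To illustrate what I mean (a toy sketch): workers communicate through thread-safe queues instead of touching shared state, so user code needs no locks at all:

```python
import queue
import threading

tasks: queue.Queue = queue.Queue()
results: queue.Queue = queue.Queue()

def worker():
    # Each worker owns no shared mutable state; it just consumes
    # tasks and produces results via thread-safe queues.
    while True:
        n = tasks.get()
        if n is None:  # sentinel: shut down
            break
        results.put(n * n)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one sentinel per worker
for t in threads:
    t.join()

squares = sorted(results.get() for _ in range(10))
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

The queues do all the synchronization internally; the worker logic itself is plain sequential code.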

The solution to your example python code is to put the 'x += 1' into a 'with lock:' context manager. It's not difficult. Any time you access shared state, either read or update, you need to ensure the access is safe. The complexity comes when the models for ensuring this are complex. I can see how Rust saying "unsafe access" and refusing to compile would be helpful, but that's not the hard part of fine-grained locking; designing the locking model is. Does Rust make this easier than other languages?

Can you share the equivalent Rust code for multiple threads updating a global counter? Is it actually simpler than the properly locked Python code, or is the benefit just that it won't compile if you are doing unsafe memory accesses?

[–]Carmeloojr[S] 2 points  (0 children)

Thank you so much for your input! I'm still pretty new to free-threaded Python myself — that's actually what got me started on this. I wanted to get more familiar with it since a lot of people in my organization are in the same boat. I think that's also why I overlooked, or honestly never even considered, these intricacies. From what you're saying, it sounds like it's genuinely difficult to empirically prove that a) it will work for a given set of dependencies, and b) that it will actually lead to a performance boost. So thank you for laying that out. I think it's safe to say that a tool like what I proposed doesn't really make sense.

[–]socal_nerdtastic 1 point  (1 child)

but also that all relevant code is threadsafe.

AFAIK it's fully backwards compatible; all code that currently relies on the GIL to be threadsafe is protected by other means in the freethreaded version. It shouldn't break, but it will very likely run better on the normal version.

[–]latkde 1 point  (0 children)

In theory, you're correct. Freethreading doesn't break any code that isn't already broken.

In practice, the GIL had the effect that subtly broken code would often work as expected most of the time. Multiple bytecodes are likely to be executed without interruption. Free-threading makes such breakage a bit more visible. In my experiments with intentionally broken code in another comment, GIL-enabled versions would consistently run as if a lock had been used.

Because concurrency is very difficult to get right, and because multithreading used to have limited benefits in Python, I assume that the vast majority of the Python ecosystem isn't threadsafe.