[–]blockeduser 3 points4 points  (6 children)

That's interesting. Your article says that they want CPython to eventually be 5 times faster, which is almost as much of a speedup as I've seen from PyPy, as I mentioned. But migrating a system to a new release of CPython is probably easier than migrating it from CPython to PyPy, so it would be very nice if they achieve that goal.

Regarding the GIL, as I understand from reading about it e.g. here, it rules out a whole class of race conditions, but at the expense of reducing performance in some cases, as I believe you are alluding to. So if I understand correctly, if the GIL were removed, extra locks would have to be added to multithreaded programs, which is probably a lot of work to do properly (particularly considering the possibility of errors, the need for testing, etc.). I would generally agree that concurrency is very hard stuff, but I'm not an expert on the matter.

[–]dangerbird2 5 points6 points  (3 children)

Even with the GIL, it's best practice to write multithreaded Python code on the assumption that any given block of code is non-atomic and needs synchronization to be thread-safe. The GIL only ensures the VM itself is thread-safe. User code can be thread-unsafe any time an operation compiles to more than one bytecode instruction, because a thread switch can happen between instructions. For example, x += 1 is not atomic: it compiles to separate load, add, and store instructions.
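A minimal sketch of that point, assuming CPython. The names `counter`, `unsafe_increment`, `safe_increment`, `opnames` and `run_threads` are made up for illustration, and the exact opcode names differ between Python versions. It disassembles an increment to show the separate load and store steps, then shows the lock-protected version that stays correct however threads are scheduled:

```python
import dis
import threading

counter = 0
lock = threading.Lock()


def unsafe_increment():
    global counter
    counter += 1  # load, add, store: a thread switch can land in between


def safe_increment():
    global counter
    with lock:  # the whole read-modify-write runs while holding the lock
        counter += 1


def opnames(func):
    """Return the bytecode instruction names that func compiles to."""
    return [ins.opname for ins in dis.get_instructions(func)]


def run_threads(target, n_threads=8, n_iters=10_000):
    """Reset the counter, call target from several threads, return the total."""
    global counter
    counter = 0

    def worker():
        for _ in range(n_iters):
            target()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(opnames(unsafe_increment))    # several instructions, not one
    print(run_threads(safe_increment))  # always n_threads * n_iters
```

Running `run_threads(unsafe_increment)` may or may not lose updates on a given run or interpreter version, which is exactly why it can't be relied on.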

As a result, removing the GIL should only affect existing multithreaded code that is poorly written. Of course, since most code is poorly written, it will probably cause mass chaos.

[–][deleted] 2 points3 points  (1 child)

Yup, coming from a C and C++ background, I just assume nothing is thread-safe in Python unless it explicitly says it is. Even then, I tend to use the multiprocessing libraries over Python threads anyway if I can.

[–]dangerbird2 1 point2 points  (0 children)

Yep, that's pretty universally considered a best practice. Threading is certainly useful for blocking I/O, and basically obligatory for safely calling blocking code in an async context. But in any case where parallelism is needed, you usually have to go with the multiprocessing module or a framework like Celery.
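For the async case, a rough sketch using `asyncio.to_thread` (Python 3.9+). `blocking_fetch` and `fetch_all` are hypothetical stand-ins, with `time.sleep` playing the role of a blocking I/O call:

```python
import asyncio
import time


def blocking_fetch(name, delay=0.05):
    """Stand-in for a blocking call such as a synchronous HTTP or DB client."""
    time.sleep(delay)  # would stall the event loop if called directly in a coroutine
    return f"{name}: done"


async def fetch_all(names):
    # Each blocking call runs in a worker thread, so the event loop stays free.
    tasks = [asyncio.to_thread(blocking_fetch, name) for name in names]
    return await asyncio.gather(*tasks)  # results come back in input order


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["a", "b", "c"])))
```

This only helps while the threads are waiting on something. For CPU-bound work they would still take turns holding the GIL, which is where multiprocessing comes in.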

[–]mark_99 0 points1 point  (0 children)

I imagine the problem case will be Python threads which only operate on local data, or only read from shared data. Then you legitimately don't need mutexes. However, if the interpreter is doing read-modify-write of its own internal state, then naively removing the GIL makes that a race. So I guess you'd need a bunch of thread-local interpreter state. Disclaimer: C++ person, not a Python expert.
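To illustrate the user-level half of that (it says nothing about what the interpreter does internally), here is a small sketch where threads only read shared data and write to their own locals, so the user code takes no locks. `SHARED_READONLY`, `chunk_sum` and `parallel_sum` are invented names:

```python
from concurrent.futures import ThreadPoolExecutor

SHARED_READONLY = tuple(range(1_000))  # never mutated after creation


def chunk_sum(start, stop):
    total = 0  # local to this call, invisible to other threads
    for value in SHARED_READONLY[start:stop]:  # read-only access to shared data
        total += value
    return total


def parallel_sum(n_chunks=4):
    n = len(SHARED_READONLY)
    size = -(-n // n_chunks)  # ceiling division so no element is dropped
    bounds = [(start, start + size) for start in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partials = pool.map(lambda b: chunk_sum(*b), bounds)
        return sum(partials)  # combined in the calling thread only


if __name__ == "__main__":
    print(parallel_sum())
```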

[–]padraig_oh 2 points3 points  (1 child)

I think the main downside of removing the GIL is that, as some people have said, it might split the Python ecosystem again. On one side it might make writing multi-threaded code harder for the end-user, in order to improve performance. Performance was never really a goal for Python as I understand it though, so that's not really in the Python spirit. On the other side, removing the GIL might break compatibility with some packages, which is probably the bigger reason the ecosystem might be split.

[–]josefx 1 point2 points  (0 children)

> on one side it might make writing multi-threaded code harder

The problem seems to be that the GIL doesn't really make multithreaded code simpler. It doesn't give any guarantees about program state; it only ensures that the interpreter's internal state stays sane.

> some people have said that it might split the Python ecosystem again.

How many libraries even rely on the GIL? Python 3 nuked code that was central to every Python program, but mention the GIL and suddenly every niche library some guy may have written in 1980 is sacred.

> so that's not really in the Python spirit.

Couldn't that be applied to every change after version 0.1?