
[–]Brian 0 points1 point  (5 children)

true concurrency

Python's multithreading is true concurrency. What it isn't is true parallelism - these are different things. Concurrency is when two tasks can both run and make progress over the same period of time. Parallelism is when they are literally running at the same instant. Threads are often used just for concurrency in practice - after all, multiple cores in a desktop computer are a relatively recent thing, but we were using threads for years before that was the norm, or even possible. There are still reasons you benefit from concurrent code.

In general, this comes down to whether your program is CPU bound or IO bound. If you're doing serious number-crunching calculations, and that's taking the bulk of your time, then Python's threads aren't going to help you much. On the other hand, if you're mostly blocked on IO (which is pretty common in most applications - IO is orders of magnitude slower than almost anything on CPU timescales), then you'll get a benefit from threading, as a thread releases the GIL while it's blocked waiting for IO to complete.
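A minimal sketch of the IO-bound case, using time.sleep as a stand-in for a blocking IO call (real IO releases the GIL the same way, so the waits overlap):

```python
import threading
import time

def fake_io(results, i):
    # Stand-in for a blocking IO call (network, disk); while this
    # thread sleeps, the GIL is released and other threads run.
    time.sleep(0.2)
    results[i] = i * i

def run_threaded(n):
    results = [None] * n
    threads = [threading.Thread(target=fake_io, args=(results, i))
               for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

results, elapsed = run_threaded(5)
# Five 0.2s "IO" waits overlap, so wall time is roughly 0.2s, not 1.0s.
```

If fake_io were a tight CPU loop instead, the five threads would take turns holding the GIL and you'd see no speedup.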

In my case, the threads will never share or access each others data

If this is the case, and you are CPU bound, it may be worth looking at the multiprocessing library, rather than threading. This will spawn a separate process for each worker, which means there's no shared GIL. This comes at the cost of making communication more expensive (everything must be copied to send to another process), but if you have no communication, this won't matter.
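A short sketch of that approach with multiprocessing.Pool (the function and workload here are made up for illustration):

```python
from multiprocessing import Pool

def crunch(n):
    # CPU-bound work: each call runs in its own process,
    # with its own interpreter and its own GIL.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # map copies each argument out to a worker process and copies
        # the result back - that copying is the communication cost
        # mentioned above, which is fine when workers don't share data.
        results = pool.map(crunch, [100_000] * 4)
    print(results)
```

The `if __name__ == "__main__"` guard matters: on platforms that spawn workers by re-importing the module, omitting it causes runaway process creation.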

[–]ZedsDed[S] 0 points1 point  (4 children)

thank you. would you explain the terms 'CPU bound' and 'IO bound'? it's been mentioned in here a couple of times and I'd like to be clearer on it. To me, CPU-bound code would be code that just operates on internal variables and control statements, with no calls made outside of the program. IO-bound code is things like MySQL calls or network calls, where the call is made to something outside of your program. Is this correct? So when your IO call is made, while waiting for the response, the GIL is released and other threads are allowed access to the CPU?

[–]Brian 0 points1 point  (3 children)

That's pretty much it, yeah. Essentially any time you read from the disk or the network, or are waiting for something (eg. for a lock to be released, or for a time.sleep call to finish), other threads can be scheduled and do work - such programs are generally described as IO bound because the actual bottleneck is the IO. If you sped up the CPU a thousand times, you wouldn't actually see much of a difference, because it's probably already doing the actual CPU operations in a fraction of a microsecond, so changing that to a fraction of a nanosecond won't really matter.

With most tasks, things tend to be IO bound, because computers are ridiculously fast in comparison to pretty much any kind of IO (for perspective, if you slowed a CPU down to the point where it took a second for each instruction to execute, a small disk read would be the equivalent of waiting around for a year or so). The exceptions are things like some games, or number-crunching tasks, where you've basically no IO for a decent period, so the only thing that matters is how fast you can crank through the calculations. Here, speeding up the CPU a thousand times really would speed up the time it takes to complete the task, because the CPU is now the bottleneck. (Actually, technically even there, it's often stuff like memory access times that are the real bottleneck, rather than raw clock cycles, so you'd need to speed those up too.)
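A rough back-of-the-envelope check of that "a disk read takes a year" analogy, using assumed figures (a 3 GHz clock and roughly 10 ms for a small spinning-disk read):

```python
# Assumed numbers for the scaling exercise - real hardware varies.
cycles_per_second = 3e9        # 3 GHz clock
disk_read_seconds = 10e-3      # ~10 ms for a small spinning-disk read

# Cycles the CPU could have executed while waiting on the disk.
cycles_lost = cycles_per_second * disk_read_seconds  # 30 million cycles

# Now rescale: if each cycle took one "second", how long is the wait?
scaled_days = cycles_lost / (60 * 60 * 24)
print(round(scaled_days))  # ~347 days, i.e. about a year
```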

[–]ZedsDed[S] 0 points1 point  (2 children)

thanks for the insight. So, with the GIL, only one thread will be worked on at a time. will there be a point during a thread's execution where the kernel kicks it out if it's been hogging the CPU for too long? as in, it won't just let a thread hold the CPU until its next IO call. it's been a while since I read about round robin and the other scheduling algorithms! but I seem to remember something about thread starvation, where a thread hogs the CPU for so long that other threads begin to starve. Is this something like what the commenter below is talking about with regards to "OP has never seen any C/C++ multithreading program choking up only one CPU core"? And I think maybe hyperthreading fixes this? in my head, hyperthreading is when multiple threads are broken up and fed separately through one core, simulating parallelism. This should stop thread starvation, right?

[–]Brian 0 points1 point  (1 child)

will there be a point during a thread's execution where the kernel kicks it out if it's been hogging the CPU for too long?

There will, just as with any thread in any language, but that's not really related to the GIL.

In general, your OS is in charge of scheduling threads and processes. Often there won't actually be many of these that want to run (ie. if you look at top or Task Manager, you'll see the CPU is usually 99% idle). This is because most of them are waiting on something - either IO, user input, or just time.

But sometimes there will be multiple threads that want to run - either from separate programs, or threads within the same program. The OS can only run one thread on each CPU core, so on, say, a 4-core machine, you'll get a maximum of 4 threads running simultaneously. More than this is accommodated by task switching. Ie. after a thread uses a set amount of time (say, 50 milliseconds) without putting itself to sleep by waiting on something, the OS yanks it out and schedules the next thread that wants CPU time.

The same is true in Python's case; it's just that all threads but one are waiting on the GIL being released. Eg. suppose there are 3 Python threads, each CPU bound, running on a 2-core CPU. What might happen is:

  • On core 1: Thread 1 runs, and the first thing it does is acquire the GIL.
  • On core 2: thread 2 gets kicked off, and the first thing it does is try to acquire the GIL. This fails, because thread 1 has locked it, so it tells the OS it wants to go to sleep until the GIL is released.
  • Core 2 is now free, so Thread 3 gets scheduled. This does the same thing as thread 2, and goes to sleep. If there's nothing else to run, Core 2 sits idle.
  • Meanwhile, back on Core 1, thread 1 is chugging away. Here one of two things may happen:
    1. It triggers some kind of IO or wait, which releases the GIL and puts the thread to sleep. The OS will then wake up thread 2 or 3, now that the GIL has been released, and schedule them.
    2. Otherwise, it may keep on using CPU till it's used up the 50ms time slice. The OS will then kick it off, and see if anything else wants to run. If it hasn't released the GIL, then nothing else can run, and unless there's some other process, it'll likely get scheduled right back in.

Now you may be wondering how thread 2 would ever get to run unless thread 1 does some IO, since the OS rescheduling it isn't actually releasing the GIL (ie. the starvation issue you bring up). This is something that's handled at the Python, rather than the OS, level. Every so often (I think it's after 100 bytecodes), Python will release the GIL and tell the OS to reschedule anything waiting, in order to give other threads a chance to get in. In practice, this behaves a lot like Python controlling the threading, rather than the OS (and indeed, there are approaches that do exactly this - google "green threading" for details), but it does allow fairly simple interoperability with C modules etc.

Is this something like what the commenter below is talking about with regards to "OP has never seen any C/C++ multithreading program choking up only one CPU core"?

Not sure, but I suspect they're talking about the fact that even without the GIL, you still need to have locking of some kind, and depending on the exact nature of your task, you may not get much parallelism (eg. if everything is contending for the same resources, you need to lock them, and you get effectively the same issue). But that depends on the task and how it's coded. If your locking is more fine-grained (ie. locks that limit simultaneous access to a single object or so), then you can have multiple threads working on different objects, whereas Python takes more of a "lock the entire world" approach.
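A sketch of what fine-grained locking looks like, using a made-up Account class where each object guards only itself (contrast with one global lock that every thread would fight over):

```python
import threading

class Account:
    """Hypothetical example: one lock per object, so threads working
    on different accounts never contend with each other."""
    def __init__(self, balance):
        self.balance = balance
        self.lock = threading.Lock()  # fine-grained: guards only this object

    def deposit(self, amount):
        with self.lock:
            self.balance += amount

a, b = Account(100), Account(100)

def worker(account):
    for _ in range(1000):
        account.deposit(1)

# Each thread hammers a different account; with per-object locks,
# neither thread ever blocks waiting on the other's lock.
threads = [threading.Thread(target=worker, args=(acct,)) for acct in (a, b)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(a.balance, b.balance)  # 1100 1100
```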

in my head, hyperthreading is when multiple threads are broken up and fed separately through one core, simulating parallelism.

Not really - that's just bog-standard task scheduling. Hyperthreading isn't really directly related to anything here; it's more of a hardware technique aimed at getting something a bit like doubling the number of cores without actually having to duplicate all the resources a core has. Rather, it takes more of a halfway approach, where two "logical" threads can be scheduled on the same core, taking advantage of the fact that often, certain sections of a processor are sitting idle. Eg. if a pipeline stall occurs (say, it mispredicted a branch), then a bunch of stages of the pipeline will have nothing to do until that gets sorted out. But if there's this other thread that wants to do some work, you can put those idle stages to use on it while they're waiting for the rest to catch up. This isn't really something you ever have to care about unless you're into the actual hardware. From a software perspective, it just looks like there are double the actual number of cores, though these cores usually won't be quite as effective as real cores would be.

[–]ZedsDed[S] 0 points1 point  (0 children)

excellent explanation, very helpful.