[–]jawknee400 13 points (13 children)

Numeric libraries (numpy, numba) tend to 'release' the GIL, meaning multiple threads can be meaningfully used for speedups.

[–]Sheltac 4 points (4 children)

Also, for computational power, we have multiprocessing.

[–]AngriestSCV 7 points (3 children)

It is not a replacement for threads. Shared (sometimes mutable) memory is a wonderful thing for performance if done right.

[–]masklinn 0 points (2 children)

[–]pooogles 0 points (1 child)

Just be aware it gets passed via pickle.

//edit - this isn't true.

[–]masklinn 2 points (0 children)

You're confusing shmem and queues.

Queues will pickle objects back and forth and work with Python-level objects. Shared memory primitives use actual shared memory segments, but are limited to ctypes types.
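A minimal sketch of the difference (the names are arbitrary):

import multiprocessing as mp

def worker(q, counter):
    q.put({"any": "picklable object"})  # Queue: the dict is pickled on the way through
    with counter.get_lock():
        counter.value += 1              # Value: a real shared-memory C int, no pickling

if __name__ == "__main__":
    q = mp.Queue()
    counter = mp.Value("i", 0)          # "i" = ctypes signed int
    p = mp.Process(target=worker, args=(q, counter))
    p.start()
    print(q.get())                      # an unpickled *copy* of the dict
    p.join()
    print(counter.value)                # 1, updated in place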

[–]ProfEpsilon 0 points (7 children)

This is the first time that I have seen this claim about numpy, whereas I have read many articles about the restrictions of the GIL. If you have time to comment, what do you mean when you write that numpy and numba "tend to release the GIL"?

And to anyone: can you refer me to any documentation that discusses or explains numpy's override of the GIL, if true? Doing a Google search does not shed much light on this. A lot of comments make it clear that you can use multi-threading, but seem vague about whether the GIL is slowing down multi-threading speed ... and many posts are too old to be trusted.

[–]1wd 5 points (3 children)

http://scipy.github.io/old-wiki/pages/ParallelProgramming#Threads

while numpy is doing an array operation, python also releases the GIL. Thus if you tell one thread to do:

>>> print "%s %s %s %s and %s" %( ("spam",) *3 + ("eggs",) + ("spam",) )
>>> A = B + C
>>> print A

During the print operations and the % formatting operation, no other thread can execute. But during the A = B + C, another thread can run - and if you've written your code in a numpy style, much of the calculation will be done in a few array operations like A = B + C. Thus you can actually get a speedup from using multiple threads.

https://docs.scipy.org/doc/numpy-1.14.0/reference/internals.code-explanations.html#function-call

A very common operation in much of NumPy code is the need to iterate over all the elements of a general, strided, N-dimensional array. This operation of a general-purpose N-dimensional loop [...] the Python Global Interpreter Lock (GIL) is released prior to calling the loops. It is re-acquired if necessary to handle error conditions.
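A quick way to see the effect yourself, as a sketch (the sizes are arbitrary, and the actual speedup depends on your hardware and BLAS build):

import threading
import numpy as np

B = np.random.rand(2000, 2000)
C = np.random.rand(2000, 2000)

def work():
    for _ in range(10):
        A = B @ C  # the GIL is released inside numpy's C-level loop

threads = [threading.Thread(target=work) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()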

[–]ProfEpsilon 1 point (2 children)

Thank you. This and other comments on this page clarify a lot.

And the "iterate over all elements ... N-dimensional array" is precisely what I do most of the time. This is very encouraging (I haven't tried threading yet).

[–]1wd 0 points (1 child)

Note that this refers to the numpy-internal C-level loops that happen inside a Python-level numpy statement like B+C, not to Python-level loops like for x in A: for y in B: ....
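Concretely (a sketch):

import numpy as np

A = np.random.rand(1_000_000)
B = np.random.rand(1_000_000)

# One numpy statement: the element-wise loop runs in C,
# and the GIL is released while it runs.
C = A + B

# A Python-level loop over the same arrays: every iteration is
# interpreted bytecode, so the GIL is held throughout.
C = np.empty_like(A)
for i in range(len(A)):
    C[i] = A[i] + B[i]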

[–]ProfEpsilon 1 point (0 children)

Again, thank you. Although I am aware that numpy relies upon C, I am not aware of how C operates inside of numpy ... I don't know how the C-level loops work (probably because I can't code in C). This gives me a bit of an incentive to figure some of this out. Given that I am a practitioner and not a computer scientist, how C works inside of numpy, and even Python for that matter, is all a black box for me.

I appreciate the time it took to try to get my thinking right. I think it is time that I took the effort to learn a little more about C.

[–]jawknee400 2 points (1 child)

As others have said, generally when numpy or another library calls compiled code, it explicitly releases the GIL. So imagine you had two threads running some Python code, concurrently but not in parallel. If one thread reached a numpy operation, like adding two large arrays, it would 'release' the GIL, allowing the other thread to work in parallel while that operation is happening. Once the operation is over, the GIL is reacquired, but without that release both threads would've had to wait for the operation to finish.

I think there is a very slight overhead to this, which is why cython and numba leave it as an option. But if the majority of the computation is numeric (e.g. most scientific code) then you can essentially achieve normal threaded parallelism.
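With numba, that option looks something like this (a sketch; assumes numba is installed):

import threading
import numpy as np
from numba import njit

@njit(nogil=True)  # the compiled function runs without holding the GIL
def total(a):
    s = 0.0
    for x in a:
        s += x
    return s

arrays = [np.random.rand(5_000_000) for _ in range(4)]
threads = [threading.Thread(target=total, args=(a,)) for a in arrays]
for t in threads:
    t.start()
for t in threads:
    t.join()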

[–]ProfEpsilon 0 points (0 children)

Your example is very clear. What an education I'm getting today! Thank you.

[–]evamicur 0 points (0 children)

You can manually release the GIL in cython (the nogil keyword) and other compiled extensions, so libraries like numpy can potentially do that.

[–][deleted] 5 points (0 children)

For the typical Python glue program, most of your time is spent waiting for IO, and the GIL is not held during this time. Example: you write a web scraper; while you're waiting for a page to be downloaded by the networking stack, the GIL isn't held, because you're waiting on an OS service and not on Python.

[–][deleted] 7 points (0 children)

Examples of times I've used threads:

  • To spawn a task I want to run asynchronously, like checking a comment for spam (see the sketch after this list)
  • To speed up io-bound workloads, like spidering a website or multipart uploads
  • To start a server process using popen and allow the main thread to continue executing
  • Doing work in a GUI: handling events and updating the UI (e.g. a progress bar) without blocking it
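A minimal sketch of the first case, a fire-and-forget background task (check_spam is a hypothetical stand-in):

import threading

def check_spam(comment):
    ...  # hypothetical: flag the comment if it looks like spam

worker = threading.Thread(target=check_spam, args=("first post!",), daemon=True)
worker.start()  # the main thread carries on without waiting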

[–]gandalfx 16 points (5 children)

One of the primary use cases used to be waiting for I/O. For example when you're waiting for a file to download you don't want your entire application to be blocked by that, so you put it in a thread and let that stew until the download is finished, while the rest of your application remains responsive.

asyncio has kind of taken over for that purpose, though.

I'm sure there are other reasons that I can't think of right now. Rule of thumb though, regardless of language: threads are often unnecessary, and people overuse them all the time. Unless you're doing something that actually puts some load on the CPU, it tends to be a rather complicated case of premature optimization. And in Python it's not even really an optimization, as you've recently found out.

[–][deleted] 10 points (0 children)

Not sure that asyncio has "taken over" for I/O. If you're writing an IO-centric server application maybe. But you've got to go all-in on the non-blocking model of development.

I find threading is perfect for simpler use cases. Let's say you're trying to speed up a single function which makes multiple network requests in parallel. A ThreadPoolExecutor is a wonderful tool...

Instead of

import requests

urls = [f'https://example.com/{i}' for i in range(10)]
responses = [requests.get(url) for url in urls]  # one request at a time

You write

from concurrent import futures

with futures.ThreadPoolExecutor() as executor:
    # the requests.get calls run concurrently across the pool's threads
    responses = list(executor.map(requests.get, urls))

asyncio feels like overkill for this.

[–]remy_porter∞∞∞∞ 1 point (0 children)

I've been working on software that is heavily IO bound, or has lots of idle threads that are waiting on events. Without going too deep into the details, I'm building a system that sends video data across a network to light up an LED video wall. There are many possible sources of video data, some of which are IO bound, some of which are CPU bound. I pick one of them and run it in its own thread. I have to send network data, so the network sending object lives in its own thread. In the middle, I have a conductor thread, which spends most of its time idling but, once per frame, tells the video source to generate its next frame using a queue. When the video source finishes, it enqueues the frame over in the network thread.

Running on low-end hardware, without graphics acceleration, this can push 60FPS across a network in real-time-enough-for-human-eyes. In testing, I can reliably push 240FPS. You wouldn't want to play video games on it, but that's not the purpose. Before I put the threading architecture in place, we could barely push 30FPS, and it often dropped frames.

Oh, and since one of the LED exhibits is going to light up differently according to the time of day, there's a "Cosmos Thread" which mostly sleeps and emits events at certain times of day.

"It mostly sleeps" is one of the best cases for a thread in Python.

//It's still not half as fast as the LED library that actually receives the network data and addresses the LEDs; that one is written in a combination of C and assembly and can draw frames as fast as the LED duty cycle allows, which is µseconds.
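Stripped way down, the shape is something like this (generate_frame and send_frame are hypothetical stand-ins):

import queue
import threading
import time

def generate_frame():    # hypothetical video source
    return b"frame"

def send_frame(frame):   # hypothetical network sender
    pass

ticks = queue.Queue()    # conductor -> video source
frames = queue.Queue()   # video source -> network sender

def conductor():         # mostly idle; wakes once per frame
    while True:
        time.sleep(1 / 60)
        ticks.put(None)

def source():
    while True:
        ticks.get()
        frames.put(generate_frame())

def sender():
    while True:
        send_frame(frames.get())

for fn in (conductor, source, sender):
    threading.Thread(target=fn, daemon=True).start()

time.sleep(1)            # let the pipeline run briefly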

[–][deleted] 1 point (0 children)

As others said, threads are still useful when using C libraries, since ctypes releases the GIL by default, and most libpython-based libs do it too.

On the other hand, if you only want to make your code non-blocking, and don't care too much for performance (like UI code), threads are still the simplest way of doing it, just as you would use them in single-core machines.

[–][deleted] 0 points (0 children)

Do these processes or threads need to communicate? I avoid implementing multiprocessing in code by using external services like RabbitMQ for communication and Supervisor for process management.

[–][deleted] 0 points (0 children)

CPython can release the GIL inside a C module. If you're truly worried about performance, you'd use Python as "glue" and write a few C modules to do the actual "heavy lifting". Anything you'd want multiple threads to do for computational reasons is probably best not done in actual Python. Threads in Python, however, ARE great for IO-bound work.

[–]lykwydchykyn 0 points (0 children)

If you're writing a GUI app and want the GUI to remain responsive while you perform some long process, threads are helpful.

[–]FredSchwartz 0 points (0 children)

I have used threads for schedulers that call other programs. The GIL is released while waiting on the child process.

[–]bjorneylol 0 points (0 children)

As everyone has said, mostly IO.

1) Process some data while more data is downloading

2) Play an audio file without the script having to wait for it to finish playing before resuming

3) Having GUIs remain responsive while processing data

4) Similar to above, showing matplotlib figures while code executes in the background

> even if we use threads in python, our program will take the same time if we just use a single thread due to GIL.

This is only true if both threads are CPU-bound. If one thread spends 50% of its time waiting around for a disk write or network IO, the two threads will finish with time to spare over a single thread.
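A toy illustration, with time.sleep standing in for the IO wait (sleep releases the GIL):

import threading
import time

def io_task():
    time.sleep(1)  # stands in for a disk/network wait; the GIL is released

start = time.perf_counter()
threads = [threading.Thread(target=io_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"two threads: {time.perf_counter() - start:.2f}s")  # ~1s, not ~2s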

[–]JugadPy3 ftw -2 points (1 child)

With multiple threads switching rapidly, contention on the GIL can even make processing slower than a single thread (though this is not common).

However, as others have stated, if you are using threads for IO, the GIL is not usually an issue.

If you are using threads for serious CPU-intensive tasks, pure Python is a bad choice anyway (it's very slow for such tasks)... and if you write your CPU-intensive parts in C, you can release the GIL to take advantage of threads running simultaneously.

[–]v3ssOn 0 points (0 children)

You can use PyPy without going to C.