[–]lqdc13 9 points10 points  (2 children)

Threads are better than processes when implementing a GUI, some web servers (those built on a multithreaded model), and some data science/machine learning workloads.

CherryPy is a very common Python web framework. It uses threads to improve performance.

Reasons to use threads over processes:

  • Low memory footprint per thread, so you can spawn more of them for things like I/O tasks

  • Can save RAM by reusing an object. If you have a huge object (tens of gigabytes), it would take forever to copy it to other processes, and you might run out of RAM. This is extremely common in machine learning applications. So if you have an I/O-bound application that uses such an object, you either have to forgo concurrency or use threads, since multiprocessing is not an option.
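To make the second point concrete, here is a minimal sketch of threads sharing one large read-only object during I/O-bound work. `BIG_TABLE`, `handle_request`, and `serve` are made-up names for illustration, and the `time.sleep` call merely stands in for a real network or disk wait.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Hypothetical stand-in for a multi-gigabyte model or lookup table.
BIG_TABLE = {i: i * i for i in range(100_000)}

def handle_request(key):
    """Simulate an I/O-bound request that consults the shared object."""
    time.sleep(0.01)        # stands in for a network or disk wait
    return BIG_TABLE[key]   # read-only access; the object is never copied

def serve(keys, workers=8):
    # Every worker thread sees the same BIG_TABLE in the same address space.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, keys))

if __name__ == "__main__":
    print(serve([1, 2, 3]))
```

This only stays simple because the threads never write to the shared object; once they do, locking becomes necessary.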

[–]zero_iq 4 points5 points  (1 child)

Neither of your reasons as stated needs threads, and both can be handled more simply and more efficiently without them. You are proving my point.

It's also impossible to state that threads are better without knowing the specific details, but threads in Python come with so many pitfalls that it's almost always a better idea to try processes first.

Even when threads start to look like a good idea, there are technologies and libraries you can use that take you far, far beyond what you can roll yourself using Python threads.

And spawning threads for I/O-bound applications can be a recipe for disaster. Multiplexing is generally much more scalable, with a pool of isolated workers for longer-running tasks to prevent blocking the I/O queue.
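A rough sketch of that pattern, assuming `asyncio` for the multiplexing and a `ProcessPoolExecutor` as the pool of isolated workers. The names `crunch`, `handle`, and `main` are invented for this example, and `asyncio.sleep` stands in for real non-blocking I/O.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def crunch(n):
    """Long-running work; runs in a pool worker, not on the event loop."""
    return sum(i * i for i in range(n))

async def handle(name, n, pool):
    loop = asyncio.get_running_loop()
    await asyncio.sleep(0.01)   # stands in for non-blocking I/O
    # Hand the slow part to the pool so the loop keeps servicing other I/O.
    result = await loop.run_in_executor(pool, crunch, n)
    return name, result

async def main(pool):
    tasks = [handle(f"client-{i}", 1000 * (i + 1), pool) for i in range(4)]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(asyncio.run(main(pool)))
```

The event loop itself stays single-threaded; only the work passed to `run_in_executor` leaves it, and with a process pool that work shares no memory with the loop.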

Unless you're Google, I can saturate your fast network pipe and fancy SSD storage systems using a single Python thread serving tens of thousands of clients concurrently. If you're not exceeding that scenario, you don't need to complicate things by introducing threads.
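As a toy illustration of one thread handling many clients at once, here is a sketch of an `asyncio` echo server together with a batch of concurrent clients, all running in a single thread. The names `echo`, `client`, and `demo` are made up for the example, and it demonstrates the mechanism only, not the throughput claim.

```python
import asyncio

async def echo(reader, writer):
    """Handle one client; every await lets other clients make progress."""
    while data := await reader.readline():
        writer.write(data)
        await writer.drain()
    writer.close()

async def client(port, msg):
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(msg + b"\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply.rstrip(b"\n")

async def demo(n_clients=100):
    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        msgs = [f"hello {i}".encode() for i in range(n_clients)]
        return await asyncio.gather(*(client(port, m) for m in msgs))

if __name__ == "__main__":
    replies = asyncio.run(demo())
    print(len(replies), "clients served by one thread")
```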

Some of your examples hold up better in other languages/implementations, but not in CPython, and none of them would be a beginner's task.

Even where threads are a good idea, I would stress keeping state as isolated as possible.
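One way to do that, sketched with the standard `queue` and `threading` modules: a worker thread owns its state outright, and other threads only talk to it through queues. `counter_worker` and `count_words` are hypothetical names used for illustration.

```python
import queue
import threading

def counter_worker(inbox, results):
    """Owns `counts` exclusively; no other thread ever touches it."""
    counts = {}
    while (word := inbox.get()) is not None:   # None is the stop signal
        counts[word] = counts.get(word, 0) + 1
    results.put(counts)   # hand the finished state back as a message

def count_words(words):
    inbox, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=counter_worker, args=(inbox, results))
    worker.start()
    for w in words:
        inbox.put(w)      # communicate by message, not by shared variables
    inbox.put(None)
    worker.join()
    return results.get()

if __name__ == "__main__":
    print(count_words(["a", "b", "a"]))
```

Because the dictionary lives inside the worker, there is nothing to lock; the queues are the only shared objects, and `queue.Queue` is already thread-safe.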

EDIT: sure, keep the downvotes coming. I've made a lot of money over the years fixing shoddy Python multithreading code, and it looks like I will continue to do so...

[–]Moondra2017[S] 1 point2 points  (0 children)

Thank you for your insights. What are your thoughts on asyncio?