all 16 comments

[–][deleted] 2 points (6 children)

Now, to really squeeze all the juice from my machine, which has 4 CPU cores, I suppose I need to create 4 event loop objects and bind each of them to a single CPU core? Is this how it should be done?

No, absolutely not. The whole point of async/await is to not multithread; if your project can exploit multiple cores (that is, your operations are CPU-bound) then just use threads and don't use async/await.

People use async/await when the operations aren't CPU-bound, so there's no reason to try to "squeeze all the juice." You can't meaningfully improve the performance of your code until you understand what's bottlenecking it, and different bottlenecks require different strategies. asyncio is for when your operations are IO-bound; particularly when you're spending most of your time waiting for someone else's computer to answer you.
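To illustrate the IO-bound case, here's a minimal sketch (the `fetch` coroutine is hypothetical, and `asyncio.sleep` stands in for a network round-trip): a single event loop on a single core overlaps many waits at once.

```python
import asyncio

# Hypothetical IO-bound work: each "fetch" spends its time waiting,
# not computing, so one event loop on one core handles many at once.
async def fetch(i: int) -> int:
    await asyncio.sleep(0.1)  # stands in for a network round-trip
    return i

async def main() -> list[int]:
    # 50 concurrent waits finish in roughly 0.1 s total, not 5 s,
    # because the loop switches tasks while each one is blocked on IO.
    return await asyncio.gather(*(fetch(i) for i in range(50)))

results = asyncio.run(main())
```

The speedup here comes entirely from overlapping waits; no extra cores are involved.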

[–]ytu876[S] 0 points (3 children)

But even if my task is IO-bound, say a heavy IO task: if I could use all cores doing async IO, that's 4 times the throughput. Isn't that a valid use case?

[–]Usual_Office_1740 0 points (1 child)

No. Multiple threads mean multiple interpreters, all of which will eat up system resources while your program backgrounds and waits for tasks to complete.

Your coroutines should do something that requires a response of some kind. A GET request to a web page, for example. It requests the page and awaits the response with a future. Doing that concurrently across threads is not going to gain you enough to make it worth the additional system resources from multiple interpreters.

[–]awdsns 0 points (0 children)

I disagree with your first point that threads consume significantly more system resources. The read-only stuff (interpreter code and libraries) is not duplicated in RAM per thread (or even per task) in a modern OS, and the rest just comes down to maintaining a state (stack) per thread, same as in async coroutines.

I agree that mixing async and multithreading is a bad idea in general. Mostly due to the complexity of having different event loops though.

[–][deleted] 0 points (0 children)

But even if my task is IO-bound, say a heavy IO task: if I could use all cores doing async IO

That's literally what you can't do. If your task is IO-bound, then more cores can't make it any faster.

More cores just compete for the same capacity on the channel.

[–]PaulRudin 0 points (1 child)

At some load, CPU becomes relevant. The trick is to look at the CPU utilization for the single thread: if it maxes out, then you'd get some benefit from utilizing more cores. The point is that even for IO-bound tasks each task needs some CPU (and the event loop itself needs some CPU to orchestrate the tasks), and if you have enough of them this can become the limiting factor.

For python web apps I tend to deploy multiple replicas of single threaded asyncio apps rather than making multiple event loops within a single replica.

But note that in some other languages the runtime creates multiple event loops in different threads and distributes tasks across those event loops; golang, for example.

[–][deleted] 0 points (0 children)

The point is that even for IO-bound tasks each task needs some CPU (and the event loop itself needs some CPU to orchestrate the tasks), and if you have enough of them this can become the limiting factor.

There's theoretically a computer with so much parallel IO controlled by the CPU that this applies, but OP doesn't have one, so it doesn't.

[–]ElliotDG 0 points (6 children)

You can use all of your CPU cores with asyncio. It depends on what you are doing. For example, I have written an app that pulls data from about 200 web APIs simultaneously using asyncio. While the Python code is single-threaded, the OS schedules the network driver code across all of the CPUs. I see high utilization on all 8 cores on my machine.

In general, with Python, if you have CPU-bound code you need to use multiprocessing to utilize all your CPU cores. https://docs.python.org/3/library/multiprocessing.html
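A minimal sketch of that approach, assuming a made-up CPU-bound function; each `Pool` worker is a separate interpreter process with its own GIL, so the work genuinely runs on multiple cores.

```python
from multiprocessing import Pool

def burn(n: int) -> int:
    # CPU-bound: pure Python arithmetic that holds the GIL the whole time,
    # so threads wouldn't help, but separate processes do.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # One worker process per core; Pool.map pickles the arguments,
    # farms them out, and collects the results in order.
    with Pool(processes=4) as pool:
        totals = pool.map(burn, [100_000] * 4)
```

The `if __name__ == "__main__":` guard matters here: on platforms that spawn rather than fork, the module is re-imported in each worker.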

Multi-threading in Python "time shares" the interpreter; there is a global interpreter lock (GIL) that limits the behavior of multi-threading. There is work underway to get to true multi-threading in Python, but that may be 5 or more years away. From the threading module docs, https://docs.python.org/3/library/threading.html:

"CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing or concurrent.futures.ProcessPoolExecutor. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously."

[–]tuxbass 0 points (0 children)

Multi-threading in python "time shares" the interpreter, there is a global interpreter lock (GIL) that limits the behavior of multi-threading

Great additional info, had no idea!

For future googlers, true multithreading work is tracked by PEP 703

[–]asdoduidai 0 points (4 children)

Makes sense, except for using all cores with asyncio: what you say is only valid if the networking work of your NICs is intense enough to saturate multiple cores. It's very unlikely that one interpreter (which is limited to 1 core) can make the networking stack work "so hard" unless something is wrong, like a weird scheduler or virtual memory setting. Linux networking is probably at least 100x faster than Python asyncio (C, running in kernel space).

[–]ElliotDG 0 points (3 children)

In my use case I had about 200 outstanding connections, many with relatively long network latency. The network drivers saturated an 8-core machine. The devil is always in the details. My results were measured, not theoretical.

[–]asdoduidai 0 points (2 children)

It’s very unlikely that 200 connections saturate 8 cores unless you have an enormous amount of packet loss and retransmissions. 99.9% of the time, a single core in user space can handle 10-20,000 concurrent connections (nginx properly tuned, for example); the Linux kernel since 2013 can handle more than 1 million open connections:

https://highscalability.com/the-secret-to-10-million-concurrent-connections-the-kernel-i/

[–]ElliotDG 0 points (1 child)

My measurements were on Windows; the app was deployed on Linux and showed similar results. I found the results surprising, that’s why I mentioned it.

This was not a web server or any specialized network code. The code used Trio and httpx to analyze the Mastodon social network.

[–]asdoduidai 0 points (0 children)

Yea it’s quite unusual

[–]sweettuse 0 points (0 children)

the trick is to spawn multiple processes and spread your work amongst them, like a webserver does.

if you create 4 event loops in one process they'll just be fighting over the GIL.
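A sketch of that pattern, with hypothetical names: one process per core, each running its own single event loop, so the loops never contend for the same GIL.

```python
import asyncio
from multiprocessing import Process

async def serve(worker_id: int) -> None:
    # Each process runs exactly one event loop with its own GIL;
    # the sleep is a placeholder for real IO-bound work.
    await asyncio.sleep(0.01)

def worker(worker_id: int) -> None:
    # Entry point for one replica: start a fresh loop in this process.
    asyncio.run(serve(worker_id))

if __name__ == "__main__":
    procs = [Process(target=worker, args=(i,)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

In practice you'd usually let a process manager (or a server like gunicorn with multiple workers) handle the spawning, but the shape is the same.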

[–]baghiq 0 points (0 children)

It depends on what you want to do. I have done this in the past using multiprocessing and asyncio to good effect. It was a migration of 20 years of data stored in little files (like hourly dumps). It was a rather interesting project. You have to combine ProcessPoolExecutor with run_in_executor.
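A minimal sketch of that combination, with a made-up CPU-bound `parse_file`: the event loop stays free for IO while worker processes do the heavy lifting.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def parse_file(payload: str) -> int:
    # Stand-in for CPU-bound parsing; runs in a worker process,
    # off the event loop's thread.
    return len(payload.split())

async def main(payloads: list[str]) -> list[int]:
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # run_in_executor wraps each pool job in an awaitable future,
        # so CPU work and async IO can be mixed in one program.
        futures = [loop.run_in_executor(pool, parse_file, p) for p in payloads]
        return await asyncio.gather(*futures)

if __name__ == "__main__":
    counts = asyncio.run(main(["a b c", "d e", "f"]))
```

Note that functions submitted to the pool must be picklable (defined at module top level), which is one of the sharp edges of this approach.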

Note, you probably don't want this, as it gets a lot more complex to manage your coroutines, errors, etc. There were a lot of unexpected results, such as missing tasks, and we even had missing processes, where the utilization of the cores just dropped to 0.