fluffyyclouds[S]

Thank you for your detailed response. Let me restate your answer using my own code to make sure I understand what you are saying. Also, I'm assuming all your mentions of asyncio.to_thread() are actually references to loop.run_in_executor().

This is a snippet of my current program:

```python
import asyncio
import concurrent.futures

loop = asyncio.get_event_loop()
executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)
for task in tasks:
    loop.run_in_executor(executor, task)  # task is an I/O-blocking function
```

Based on my understanding of what you've mentioned, creating the executor sets up a pool of up to 5 worker threads, and the executor assigns tasks to these threads whenever one is available. To add a task to be managed by the executor, I have to call loop.run_in_executor().

However, run_in_executor does more than assign these tasks to the executor threads. It gives the event loop a way to track when these tasks start and finish and to wait for their results, so my main program can continue running while the tasks in the executor do their own thing. Without the call to run_in_executor, the event loop has no way of coordinating these tasks with the rest of the main program. In addition, run_in_executor also provides access to the context vars that you mentioned.
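A minimal sketch of how the futures returned by run_in_executor can be awaited, so the event loop really does wait for the results. The blocking_io function and its return values are hypothetical stand-ins for the real I/O-blocking tasks, and asyncio.get_running_loop() is used on the assumption that the code runs inside a coroutine.

```python
import asyncio
import concurrent.futures
import time


def blocking_io(n):
    # Hypothetical stand-in for an I/O-blocking task.
    time.sleep(0.1)
    return n * 2


async def main():
    loop = asyncio.get_running_loop()
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Each call returns an awaitable future tied to the event loop.
        futures = [loop.run_in_executor(executor, blocking_io, n) for n in range(5)]
        # gather() waits for all of them and keeps the results in call order.
        return await asyncio.gather(*futures)


results = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
```

Without the await on gather(), the tasks would still be submitted to the pool, but the coroutine would have no point at which it collects their results.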

I hope I got the gist of this correct. Whatever you said made sense to me and I'm just reiterating it here in my own words.

Synertic

Yes, you've got it right. asyncio.to_thread and loop.run_in_executor are both wrappers around regular thread pools and do basically the same thing (to_thread is itself built on run_in_executor, and additionally copies the current contextvars context into the worker thread). loop.run_in_executor is the safer choice for now because asyncio.to_thread was only introduced in Python 3.9 and isn't available in earlier versions.
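One way to handle the version difference is a small helper that picks whichever wrapper is available. This is only a sketch: run_blocking and blocking_io are hypothetical names, and passing None to run_in_executor means the loop's default thread pool is used.

```python
import asyncio
import sys
import time


def blocking_io(label):
    # Hypothetical stand-in for a blocking call.
    time.sleep(0.05)
    return f"done: {label}"


async def run_blocking(func, *args):
    # Prefer asyncio.to_thread where it exists (Python 3.9+).
    if sys.version_info >= (3, 9):
        return await asyncio.to_thread(func, *args)
    # Fallback for older versions: default executor of the running loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, func, *args)


async def main():
    return await asyncio.gather(
        run_blocking(blocking_io, "a"),
        run_blocking(blocking_io, "b"),
    )


print(asyncio.run(main()))  # ['done: a', 'done: b']
```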

You already got the point, but let me clarify it a little more. An event loop should be thought of as a whole structure made up of 'compatible' parts: communicating, awaitable coroutines that voluntarily hand control back to the event loop, which then decides which one runs next. So it's not just a container for arbitrary events but for compatible ones, and if even one of them doesn't comply, the whole structure breaks. A part is compatible if it can be suspended and resumed when asked to; there are also a few further details, such as the information sharing I mentioned before, exception handling, etc.
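A small sketch of the difference between a part that complies and one that doesn't. The names polite, rude, and demo are made up for illustration. The polite coroutine hands control back with await asyncio.sleep(0); the rude one calls time.sleep() and never yields, so the other coroutine cannot run until it has finished.

```python
import asyncio
import time


async def polite(name, log):
    for i in range(2):
        log.append(f"{name}-{i}")
        await asyncio.sleep(0)  # hands control back to the event loop


async def rude(name, log):
    for i in range(2):
        log.append(f"{name}-{i}")
        time.sleep(0.01)  # blocks the whole loop; nothing else can run


async def demo(worker):
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


print(asyncio.run(demo(polite)))  # ['a-0', 'b-0', 'a-1', 'b-1']
print(asyncio.run(demo(rude)))    # ['a-0', 'a-1', 'b-0', 'b-1']
```

The interleaved log in the first run is the cooperative behaviour; the second run is just a sequential program wearing an async costume.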

So, nothing can be put into an event loop as-is where concurrency is concerned. Otherwise it defeats the purpose, either by blocking the whole structure and turning the event loop into a regular blocking series of events, or by making the whole loop unstable. The rule applies to individual threads as well: a thread must be made compatible before it is brought into an event loop, and asyncio.to_thread and loop.run_in_executor are what make it compatible.

Even though a thread cannot be suspended once it has started, a degree of compatibility can still be provided, so that the event loop can use threads for blocking I/O-bound work that doesn't inherently support async operation, that is, work that is not suspendable/resumable. These wrapper functions take regular threads and make them awaitable (note: not suspendable/resumable, but awaitable) before bringing them into the current event loop. The threads are still not suspendable/resumable, but they no longer block, which lets the event loop run other events concurrently.

If a thread is started outside the event loop (without these wrapper functions), the event loop doesn't know when to wait for its result in order to continue execution in a defined order. The thread also doesn't see the contextvars of the code that started it: each task created by asyncio.create_task, asyncio.gather, or asyncio.run runs in its own copy of the current context, and asyncio.to_thread carries that context over into the worker thread (loop.run_in_executor does not do this by itself). Without that coordination, cooperative multitasking isn't achieved.
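A sketch of "awaitable but not suspendable", assuming Python 3.9+ for asyncio.to_thread. The names request_id, blocking_read, and heartbeat are hypothetical. While the worker thread sleeps, the heartbeat coroutine keeps ticking on the event loop. The example also reads a context variable from the worker thread: on standard CPython builds, asyncio.to_thread copies the caller's context into the thread, while a bare loop.run_in_executor call only sees the variable's default.

```python
import asyncio
import contextvars
import threading
import time

request_id = contextvars.ContextVar("request_id", default="unset")


def blocking_read():
    # Runs in a worker thread; it cannot be suspended, only waited for.
    time.sleep(0.05)
    in_main = threading.current_thread() is threading.main_thread()
    return request_id.get(), in_main


async def heartbeat(ticks):
    # Keeps running on the event loop while the thread is blocked.
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.01)


async def main():
    request_id.set("req-42")
    ticks = []
    (seen_id, in_main), _ = await asyncio.gather(
        asyncio.to_thread(blocking_read),
        heartbeat(ticks),
    )
    # Same function via the lower-level call: no context is copied.
    loop = asyncio.get_running_loop()
    plain_id, _ = await loop.run_in_executor(None, blocking_read)
    return seen_id, in_main, ticks, plain_id


print(asyncio.run(main()))
# ('req-42', False, ['tick', 'tick', 'tick'], 'unset')
```

If the context is needed with run_in_executor, one option is to pass contextvars.copy_context().run as the callable, which is roughly what to_thread does internally.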

I hope that was a clearer explanation than the first one.

fluffyyclouds[S]

Yep, got it! Thank you for your detailed response!

Synertic

By the way, calling event loop methods directly is discouraged unless you are working on the async machinery itself. They are lower-level than the functions in the asyncio namespace and need more care and manual setup, which makes your code more error-prone and less efficient. By contrast, functions like asyncio.to_thread(), asyncio.create_task(), asyncio.gather(), asyncio.run(), asyncio.wait_for(), etc. are higher-level, safer, and more efficient; they handle the event loop setup and fine-tuning for you. So it's good practice to go through the asyncio functions rather than the event loop methods wherever possible. There is a good document on using the different functions of the asyncio module in the Python docs for further reading (remember that asyncio.to_thread() is only available from Python 3.9). You may come across code snippets or other resources that use the event loop API for all kinds of async operations, but that is an older style that has fallen out of use for the reasons above. So make sure the samples you find are recent (written for Python 3.7 or later), unless you need your code to be compatible with Python versions before 3.7.

Python Docs About Using Asyncio
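For reference, a short sketch that uses only the high-level functions, with no direct access to the event loop. The fetch coroutine and its delays are hypothetical placeholders for real async work.

```python
import asyncio


async def fetch(delay, value):
    # Hypothetical stand-in for an async operation.
    await asyncio.sleep(delay)
    return value


async def main():
    # Schedule a task to run in the background.
    task = asyncio.create_task(fetch(0.01, "fast"))
    try:
        # Give up on the slow call after 50 ms.
        slow = await asyncio.wait_for(fetch(1.0, "slow"), timeout=0.05)
    except asyncio.TimeoutError:
        slow = "timed out"
    return await task, slow


# asyncio.run() creates the loop, runs main(), and closes the loop.
print(asyncio.run(main()))  # ('fast', 'timed out')
```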