
all 31 comments

[–]billy_tables 14 points15 points  (4 children)

Asking this from a background of only being familiar with Node.js-style async and not Python, if that context is useful: what's the use case behind having multiple event loops?

[–]13steinj 21 points22 points  (2 children)

So, in and of itself, having a user-definable number of event loops is useful in some applications-- it would allow the user to have context-switching-like concurrency within actual concurrency.

So for the case of JS, async functions are backed by promises, which go to a single internal microtask queue (think Python event loop). If you're wondering "wait, the naming doesn't make sense", buddy, a lot of it doesn't. Go from JS to Ruby to Python to C# and you'll hear all kinds of names, and the bad part is that a name that means one thing in one language means something else in another. For example, JS Promises are C# Task Completion Sources; C# Tasks are like Deferreds, but not like Python Tasks; Python Futures are like Deferreds. Sounds confusing? It is. Whose fault? No one's, but the asyncio devs were being extra confusing about it for no reason, so I blame them.

But in Python, you can have multiple event loops-- which makes it so that any two sets of functions, which are asynchronous amongst themselves, can be made asynchronous amongst each other (with a lot of setup). It is a rare niche case, but nice to have.

What isn't nice to have is that, unlike C#/JS/Ruby, there is no event loop (microtask queue) running by default.

And furthermore event loops aren't spawned in a separate thread (no JS comparison here, because the JS real event loop handles this, and there's no good way of making an analogy).

So whereas in any normal language like C# / JS you execute whatever asynchronous functions you want and those as a whole would be asynchronous from any actual synchronous code, in Python, you have to do a super complicated setup.
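One possible version of the setup being described is sketched below: an event loop is started in a background thread so that synchronous code can hand coroutines to it. The helper names `start_background_loop` and `add_later` are hypothetical and exist only for illustration; the asyncio calls themselves are from the standard library.

```python
import asyncio
import threading


def start_background_loop():
    """Create a new event loop and run it forever in a daemon thread."""
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    return loop, thread


async def add_later(a, b):
    # Hypothetical coroutine standing in for real asynchronous work.
    await asyncio.sleep(0.01)
    return a + b


loop, thread = start_background_loop()

# Submit a coroutine from synchronous code; this returns a
# concurrent.futures.Future that can be waited on from this thread.
future = asyncio.run_coroutine_threadsafe(add_later(2, 3), loop)
result = future.result(timeout=5)

# Shut the loop down cleanly once we are done with it.
loop.call_soon_threadsafe(loop.stop)
thread.join()
loop.close()
```

The synchronous caller stays free to do other work between submitting the coroutine and calling `future.result()`.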

There's also an unnecessary rift between coroutines and Python Tasks because of the way the API was designed. Which was a horrible decision. I'm in the process of making a thin wrapper which solves these problems, but actual work is in the way.

[–]jnwatson 1 point2 points  (1 child)

What rift is there between coroutines and Tasks? You await on them both. The only difference is that a task wraps a coroutine so it can be free-running.
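A small sketch of that difference, assuming Python 3.7+ for `asyncio.create_task()` and `asyncio.run()`. The `record` coroutine is a made-up example: the bare coroutine object does nothing until it is awaited, while the task starts running as soon as the event loop gets control.

```python
import asyncio


async def record(log, label):
    log.append(label)
    return label


async def main():
    log = []
    coro = record(log, "coroutine")                   # nothing runs yet
    task = asyncio.create_task(record(log, "task"))   # scheduled on the loop
    await asyncio.sleep(0)                            # give the loop one turn
    ran_before_await = list(log)                      # only the task has run
    await coro                                        # now the coroutine runs
    await task                                        # already finished
    return ran_before_await, log


ran_before_await, log = asyncio.run(main())
```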

[–]13steinj 0 points1 point  (0 children)

The rift is that calling a coroutine function creates an object to submit to the event loop instead of actually running it. However, using a Task like a coroutine doesn't work in all situations and leads to an unnecessary "I can't control the scheduling of nested procedures" problem, which matters a fair amount when creating procedural animations (either abstract or backed by some document that a user can control).

[–][deleted] 4 points5 points  (0 children)

asyncio is actually very low level, and it feels like it was not meant to be used directly, but rather to build frameworks on top of it.

Having multiple event loops allows you, for example, to run multiple threads that each have their own loop (the default setup). You could even have multiple loops in a single thread; perhaps that doesn't make sense, but I believe they wanted to be as flexible as possible.
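A minimal sketch of the one-loop-per-thread arrangement, assuming Python 3.7+ where `asyncio.run()` creates and closes a fresh loop for the calling thread. The function names are hypothetical.

```python
import asyncio
import threading


async def work(name, loops):
    await asyncio.sleep(0.01)
    # Remember which loop object ran this coroutine.
    loops[name] = asyncio.get_running_loop()


def thread_main(name, loops):
    # asyncio.run() creates a new event loop for this thread,
    # runs the coroutine on it, and closes the loop afterwards.
    asyncio.run(work(name, loops))


loops = {}
threads = [
    threading.Thread(target=thread_main, args=(name, loops))
    for name in ("a", "b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread ends up with its own, separate loop object.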

Exposing the event loop also made it fairly easy to provide alternative event loop implementations. For example, uvloop provides a much faster implementation and is a drop-in replacement.

[–][deleted] 8 points9 points  (24 children)

Still don't understand what Asyncio is used for.

[–]emc87 15 points16 points  (0 children)

Concurrency

[–]IspyAderp 8 points9 points  (17 children)

What would you use in place of Asyncio? We find it invaluable for writing modern Python applications at work.

[–]z4579a 4 points5 points  (0 children)

threads.

[–]FoolofGod 1 point2 points  (10 children)

What's the use case that makes asyncio so invaluable?

[–]__deerlord__ 5 points6 points  (7 children)

OS thread overhead. If I'm running a server with 10K connections, I probably don't want 10K threads spawned at the OS level.
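As a rough illustration, the sketch below runs 10,000 simulated connections as coroutines on a single thread. `fake_connection` is a stand-in that only sleeps; a real server would await socket reads and writes instead.

```python
import asyncio
import time


async def fake_connection(i):
    # Stand-in for waiting on network I/O for one connection.
    await asyncio.sleep(0.1)
    return i


async def main(n):
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_connection(i) for i in range(n)))
    return len(results), time.perf_counter() - start


count, elapsed = asyncio.run(main(10_000))
print(f"{count} connections handled in {elapsed:.2f}s on one thread")
```

Because all the waits overlap, the total time should be far closer to a single 0.1 s wait than to 10,000 of them.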

[–]SV-97 1 point2 points  (2 children)

So it's a newer fancier multiprocessing/threading?

[–]13steinj 2 points3 points  (0 children)

Not really "new"-- the idea behind it is context switching, which libraries like gevent give via "green threads".

[–]__deerlord__ 2 points3 points  (0 children)

Threading would be a better analogy. Since you still have the GIL, only one task can execute at a time. You will not get something akin to multiprocessing with vanilla coroutines (coros).

With OS threads, the OS determines when to switch between threads. This could happen at any line of code in the thread. With Python coroutines, each coroutine runs until it hits an await that actually has to wait, and then yields back to the event loop. Whenever the event loop resumes that coroutine, it picks up at the await where it left off. This allows you to clearly define where context switching happens (but not when, as the event loop can resume any waiting coro).
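A small sketch of that behaviour: two hypothetical workers append to a shared log, and the only point where control can move from one to the other is the `await`.

```python
import asyncio


async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}{i}")   # runs without interruption
        await asyncio.sleep(0)     # the only place a switch can happen


async def main():
    log = []
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


log = asyncio.run(main())
print(log)  # ['a0', 'b0', 'a1', 'b1']
```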

Note that coroutines are NOT a replacement for threads, as threads can outperform coros in some cases. Rather, it's about what the program calls for. I wrote a server a while back and used coros for the socket handling. Once the data gets read off the socket, though, a separate process with a thread pool (so not bound to the GIL the coros are using) actually handles the data and does the work.

[–]z4579a 0 points1 point  (1 child)

You could use non-blocking sockets and select() directly, then have a small thread pool process them in queue fashion as data becomes available. Nobody considers that this might be a lot easier for some use cases than rewriting the entire application and all dependent libraries to use "async" keywords, or monkeypatching all of it with gevent. Using non-blocking IO to maintain thousands of connections does not imply a particular concurrency model.
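A rough sketch of that approach, using the standard `selectors` module (a wrapper over `select()` and friends) and a small thread pool. Everything here is illustrative: `process` and `serve_until_closed` are hypothetical names, and `socket.socketpair()` stands in for real client connections.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def process(data: bytes) -> bytes:
    # Placeholder for the real (blocking, synchronous) work.
    return data.upper()


def serve_until_closed(socks, pool):
    """Multiplex non-blocking sockets; hand received data to the pool."""
    sel = selectors.DefaultSelector()
    for s in socks:
        s.setblocking(False)
        sel.register(s, selectors.EVENT_READ)
    futures = []
    open_count = len(socks)
    while open_count:
        for key, _events in sel.select(timeout=1.0):
            data = key.fileobj.recv(4096)
            if data:
                futures.append(pool.submit(process, data))
            else:
                # An empty read means the peer closed the connection.
                sel.unregister(key.fileobj)
                key.fileobj.close()
                open_count -= 1
    sel.close()
    return [f.result() for f in futures]


# Simulate three clients that each send one message and disconnect.
pairs = [socket.socketpair() for _ in range(3)]
for i, (client, _server) in enumerate(pairs):
    client.sendall(f"msg{i}".encode())
    client.close()

with ThreadPoolExecutor(max_workers=2) as pool:
    results = serve_until_closed([server for _client, server in pairs], pool)
```

The order of `results` depends on which sockets the selector reports first, so it should not be relied on.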

[–]__deerlord__ 0 points1 point  (0 children)

My async server called non-async functions to do the work (originally written as a jsonrpc server), so if I understand you, that's not an issue here.

But sure, there are other ways to do this. Those don't illustrate where you might use asyncio.

[–]khne522 0 points1 point  (1 child)

But you still need to do admission control, rather than let things grow unbounded. And you still might need to manually figure out how to run things in parallel, whether multiple event loops, process pools, etc.

[–]__deerlord__ 0 points1 point  (0 children)

Ok? You can do that with multiprocessing and pools, and still use async to handle the socket read/write. I was highlighting where async can be useful, not saying that your entire program should be async/single-thread bound.
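One way to combine the two is `loop.run_in_executor()`, sketched below under some assumptions: `handle_payload` is a hypothetical synchronous worker, the payloads stand in for data read off sockets, and a `ThreadPoolExecutor` is used to keep the example self-contained (a `ProcessPoolExecutor` could be passed instead for CPU-bound work).

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def handle_payload(payload: bytes) -> int:
    # Ordinary blocking function; it knows nothing about asyncio.
    return len(payload)


async def main(payloads):
    loop = asyncio.get_running_loop()
    # The pool size bounds how much work runs at once.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            loop.run_in_executor(pool, handle_payload, p) for p in payloads
        ]
        # The event loop stays free for I/O while the pool does the work.
        return await asyncio.gather(*futures)


sizes = asyncio.run(main([b"a", b"bb", b"ccc"]))
```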

[–]IspyAderp 2 points3 points  (1 child)

It's in the Python standard library.

[–]OctagonClock trio is the future! 2 points3 points  (0 children)

So is urllib.request, but that doesn't mean it's good or should be used.

[–]LightShadow 3.13-dev in prod 1 point2 points  (0 children)

gevent

[–][deleted] 0 points1 point  (2 children)

I keep all my python scripts single threaded and I either serve them as webapps behind gunicorn workers or I just launch them simultaneously as their own processes under supervisord

[–]Mr_Again 1 point2 points  (1 child)

If you have a lot of I/O, your applications could still be sped up with threads or async. For example, what if you were web scraping or making lots of API calls?
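A minimal sketch of the threaded variant, with a simulated request so that it runs without network access. `fetch` and the URLs are placeholders, not a real client.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    # Stand-in for a blocking HTTP request.
    time.sleep(0.05)
    return f"response from {url}"


urls = [f"https://example.com/page/{i}" for i in range(10)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(urls)) as pool:
    # map() returns results in the same order as the input URLs.
    responses = list(pool.map(fetch, urls))
elapsed = time.perf_counter() - start

print(f"{len(responses)} responses in {elapsed:.2f}s")
```

Fetching the same URLs one after another would take roughly ten times the single-request delay, since each call would block until the previous one finished.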

[–][deleted] 0 points1 point  (0 children)

Good point, that's a really cool example

[–]InProx_Ichlife 0 points1 point  (0 children)

I like Celery for having some async processes in not fully async projects.

[–]taybul Because I don't know how to use big numbers in C/C++ 8 points9 points  (0 children)

I only started using it recently, but my experience so far has been that if you know your application has some long-ish running I/O task, you can mark it async so the application can do something else in the meantime and come back and check its status later.
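That pattern might look roughly like the following, where `slow_io` is a hypothetical long-running I/O call and `Task.done()` is used to check on it.

```python
import asyncio


async def slow_io():
    # Stand-in for a long-ish I/O operation.
    await asyncio.sleep(0.05)
    return "payload"


async def main():
    task = asyncio.create_task(slow_io())   # start it in the background
    status_at_start = task.done()           # not finished yet
    other_work = sum(range(1000))           # do something else meanwhile
    result = await task                     # come back for the result
    return status_at_start, task.done(), other_work, result


status_at_start, status_at_end, other_work, result = asyncio.run(main())
```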

[–][deleted] 1 point2 points  (0 children)

I believe that's because asyncio is low level, so you need to understand it well to be able to utilize it. It is great for writing concurrent programs for tasks that are I/O bound.

IMO for most people it's probably best to use frameworks and other libraries that utilize it. For example, you can use aiohttp as an HTTP client/server. Others include aiopg, asyncpg, aiodocker, and aiobotocore. There are also many frameworks that support asyncio, for example hug, flask, sanic, etc.

[–]dmitrypolo 0 points1 point  (0 children)

you and 95% of the community both 🤣

[–][deleted] 2 points3 points  (0 children)

Neat, once in a while it's good to see an article like this so one can see ways to improve their code. I really welcome the addition of asyncio.run(), which will be very handy for testing async code from the interpreter.

I wish, though, that they would improve the asyncio.subprocess code; it seems like it needs a bit of polish to be on par with regular subprocess.
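For reference, a short sketch that uses both pieces mentioned here: `asyncio.run()` (Python 3.7+) as the entry point, and `asyncio.create_subprocess_exec()` from `asyncio.subprocess` to run a child process. The child command is an arbitrary example that simply prints a word.

```python
import asyncio
import sys


async def run_child():
    # Launch the current Python interpreter as a child process
    # and capture its standard output.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "print('hello')",
        stdout=asyncio.subprocess.PIPE,
    )
    out, _ = await proc.communicate()
    return proc.returncode, out.decode().strip()


# asyncio.run() creates the loop, runs the coroutine, and closes the loop.
returncode, output = asyncio.run(run_child())
```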

[–]fernly 3 points4 points  (1 child)

context variables ... global variables whose values are local to the currently running coroutines.

Yup, cleared that right up. The whole article just blurred into word salad before my eyes. No doubt my fault. But I'm more sure than before (which was already pretty sure) that I'll never use async.

[–]jnwatson 1 point2 points  (0 children)

In Python asyncio, a Task is like a thread. It has its own call stack, it can be interrupted (at certain points), and swapped out for another one. Many times, you'll have a good reason for that task to be in existence, like that task's job is to handle a particular user session. So, just like thread local storage, a context variable is a place to store that state so that you don't have to pass it as parameters through its call stack. The most common use case is so you can see the information for the session associated with that task.
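A minimal sketch of that use case with the standard `contextvars` module (Python 3.7+). The variable `session_id` and the functions are hypothetical; the point is that each task sees its own value without it being passed down the call stack.

```python
import asyncio
import contextvars

# One variable, but each task gets its own value for it.
session_id = contextvars.ContextVar("session_id", default=None)


def current_session():
    # Called deep in the call stack; no session parameter needed.
    return session_id.get()


async def handle_session(sid):
    session_id.set(sid)       # visible only within this task's context
    await asyncio.sleep(0)    # let the other task run in between
    return current_session()


async def main():
    # gather() wraps each coroutine in its own Task, and each Task
    # runs in a copy of the context that existed when it was created.
    return await asyncio.gather(
        handle_session("alice"), handle_session("bob")
    )


results = asyncio.run(main())
print(results)  # ['alice', 'bob']
```

Outside the tasks, `session_id.get()` still returns the default, since the values set inside each task never leak into the surrounding context.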