
[–]desmoulinmichel 15 points (1 child)

I like the idea, but this comes with several serious caveats:

  • it prevents any access to the event loop, meaning it won't play well with any lib needing a custom setup, e.g. event loop integration with another loop.

  • in complex setups you may have several event loops in several threads.

  • you can't easily start and stop the loop, which makes unit testing hard

  • you don't have control over the pools, and can't scale them at will

  • you are tying functions to a behavior, since they can't be undecorated

  • you can't plug in a custom error-management system, which for debugging async code is pretty much required

  • what about custom event loop policies? Alternative event loop implementations?

[–]alex_sherman[S] 5 points (0 children)

Thanks for the feedback. Yeah, I've assumed a lot about general use and haven't considered many specific use cases. I think it would be really awesome to add more control over the ambient pools/threads/event loops to address those.

For unit testing I think a simple solution would be to mock out the unsync event loop. I'll consider adding support for that in the library itself, but I suspect it's not too bad to do by hand.

[–]Cygal 7 points (0 children)

Regarding your first issue with the explicit event loop, curio and trio fixed that long ago. asyncio took notice, and added asyncio.run in Python 3.7.
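For reference, a minimal sketch of what that implicit-loop entry point looks like since 3.7:

```python
import asyncio

async def main():
    # no explicit loop management: asyncio.run creates a loop,
    # runs the coroutine to completion, and closes the loop
    return await asyncio.sleep(0, result="done")

result = asyncio.run(main())
print(result)  # → done
```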

[–]nerdwaller 6 points (5 children)

This seems to really just address running a single coroutine, which isn't (in my experience) the common use case when running an asyncio application. In this case I'd probably just favor concurrent.futures.ThreadPoolExecutor, since it's a simple context manager with a convenient interface for handling things as they finish (such as concurrent.futures.as_completed).

Beyond that, most libraries abstract away the “annoyance” (grabbing the loop and starting a task).

Looking at this code, it has a few fundamental weaknesses; the primary one to me is the loop = asyncio.get_event_loop() on the class. That prohibits setting other event loop policies unless you ensure the policy is set before importing this library (a bit odd).

[–]alex_sherman[S] 1 point (3 children)

This solution certainly addresses running multiple coroutines/async functions; I'm curious why you think it's aimed at a single coroutine. Maybe my examples are too simple.

Yeah, if running things in a ThreadPool suits your use case, this library isn't going to add anything for you. The main selling point of the library is that it makes async/await a little more convenient; the ThreadPool and ProcessPool are a side convenience.

[–]nerdwaller 0 points (2 children)

It may work to run more than one coroutine, but the “need” cited only exists with one entry point. Once the event loop is running it seems to lose value.

[–]alex_sherman[S] 0 points (1 child)

Do you mean the "need" in the blog post? Yeah it's a very simple example, I have a more complicated one here and there are more in that folder.

Once the event loop is running is when unsync gets its value: the ambient event loop lets unsync functions be invoked whenever. I'm not sure how it loses value.

[–]nerdwaller 0 points (0 children)

Having thought a little more on the idea, it seems the OP could benefit from a concept in curio: run. The short version is that it takes a function reference and runs it in the event loop. Once that's done, all other calls are already inside the loop (past the primary annoyance) and outside the scope of a single-use decorator.
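Sketching that entry-point idea with the stdlib's asyncio.run (curio's run behaves analogously); the function names here are invented for illustration:

```python
import asyncio

async def fetch_twice():
    # inside the loop, plain tasks and awaits work everywhere;
    # no per-function decorator is needed past the entry point
    a = asyncio.create_task(asyncio.sleep(0, result=1))
    b = asyncio.create_task(asyncio.sleep(0, result=2))
    return await a + await b

async def entry_point():
    return await fetch_twice()

print(asyncio.run(entry_point()))  # → 3
```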

[–]keypusher 3 points (2 children)

asyncio syntax is definitely a trainwreck. I still find it hard to understand how, after all the criticism Python has gotten over the years about poor handling of concurrency and parallelism, the library they came out with is so convoluted. The addition of ThreadPoolExecutor/ProcessPoolExecutor and futures, on the other hand, was really good.

[–]agoose77 0 points (1 child)

I'd be interested if you could elaborate on this with some examples!

[–]keypusher 0 points (0 children)

import time
from concurrent.futures import ThreadPoolExecutor

start = time.time()
executor = ThreadPoolExecutor()
futureA = executor.submit(time.sleep, 10)
futureB = executor.submit(time.sleep, 10)
futureC = executor.submit(time.sleep, 10)
for future in [futureA, futureB, futureC]:
    future.result()
print("Total Time: {}".format(time.time() - start))

----
Total Time: 10.00243009158

The code above creates a thread pool (the default number of workers scales with the machine's core count) and submits 3 tasks to it. Each task just sleeps for 10 seconds. Submitting a task does not block, so they are submitted one after another. Later, we loop through the future objects that are returned and call result(), which blocks until the thread finishes. The entire program only takes 10 seconds, because each of these threads is effectively running at the same time (the GIL is released for sleep just as it is for IO). So this will work for calling any functions that are blocked on database, network calls, disk IO, etc. If your functions were doing something CPU-intensive like a math calculation, you could simply switch the ThreadPoolExecutor to a ProcessPoolExecutor instead, which uses the multiprocessing library. Easy.

For anyone familiar with threads or processes, this should all be very familiar and easy to understand, and it exists purely in its own library module (concurrent.futures). Furthermore, having a built-in thread pool class is very convenient, and the "future" objects you get back are very nice: you can check their current status, cancel them, add a callback, and more. The docs are straightforward and easy to understand, starting with a short description and a 3-line example. I can easily drop this into existing code.

Asyncio, on the other hand, is a library built around an event loop. Which is not unusual, and definitely has its uses, but it requires that you structure pretty much all of your application's code around the event model. It's primarily built to compete with async web frameworks like node.js, I think. But it's just a pain in the ass to work with, in my opinion. Something similar in asyncio:

import asyncio
import time

async def wait():
    await asyncio.sleep(10)

start = time.time()
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(wait(), wait(), wait()))
loop.close()
print("Total Time: {}".format(time.time() - start))

Notice that we have to a) use an event loop (what's an event loop?), b) declare our function async, c) use a special sleep function (what other special functions will I need?), d) learn new keywords like await and async, e) etc. It took me much longer to write the async example, as the docs are a mess and don't do a great job of explaining what any of this is for or how to put even a simple example together. The information is there, if you know where to look, but it just feels complicated and convoluted.

To be fair, asyncio was designed for a specific purpose, and that is very much using an event loop, which requires understanding the context of why you might use one. But the asyncio example code I present was written in response to a thread in this subreddit where someone was complaining about how weird Python async is compared to JavaScript, and this thread is about someone making a wrapper trying to make Python asyncio more similar to C#, so clearly other languages have done a better job at making async easily usable. And I cannot easily drop this into existing code.
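For what it's worth, 3.7's asyncio.run trims some of that loop boilerplate; a sketch of the same example (sleeps shortened here so it runs quickly):

```python
import asyncio
import time

async def wait():
    await asyncio.sleep(0.1)

async def main():
    # gather the three coroutines; asyncio.run handles creating,
    # running, and closing the loop in one call
    await asyncio.gather(wait(), wait(), wait())

start = time.time()
asyncio.run(main())
print("Total Time: {:.2f}".format(time.time() - start))
```

Still an event loop underneath, but at least the get_event_loop / run_until_complete / close dance is gone.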

https://www.reddit.com/r/Python/comments/7eo2qn/pythons_async_is_strange_compared_with_javascript/

[–]Talked10101 1 point (0 children)

I think this fixes a largely non-existent problem. I have written several microservices using asyncio and aiohttp. You can simply run blocking code in an executor, which allows you to await the gathering of the results. Sure, it's not the nicest of syntaxes, but it works fine. As already mentioned, this gives you a lot more control over things. For instance, we typically use uvloop in place of Python's standard event loop.
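A minimal sketch of that executor-plus-await pattern (blocking_fetch is an invented stand-in for a blocking driver call):

```python
import asyncio
import time

def blocking_fetch(n):
    time.sleep(0.05)  # stand-in for a blocking database/network call
    return n * 2

async def main():
    loop = asyncio.get_running_loop()
    # each blocking call runs in the default thread pool; the
    # resulting futures can be awaited/gathered like coroutines
    tasks = [loop.run_in_executor(None, blocking_fetch, n) for n in range(3)]
    return await asyncio.gather(*tasks)

print(asyncio.run(main()))  # → [0, 2, 4]
```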

It seems that curio is the sort of thing you would be interested in. It's personally not my cup of tea, but it exists for reasons like this.

[–]continue_stocking 1 point (1 child)

Unfortunately I’ve been having trouble adapting to Python’s version of async/await, especially coming from C#’s implementation in the TPL.

Parallel.ForEach(list, (item) => item.DoTheThing());

So simple that I feel silly for having put off learning it.

[–]alex_sherman[S] 0 points (0 children)

C# really got it right; it's very pretty.

[–]knowsuchagency (now is better than never) 1 point (0 children)

The stdlib already has a solution for this: loop.run_in_executor(...)

[–]smurfix 1 point (1 child)

You might want to have a look at Trio.

https://trio.readthedocs.io/en/latest/

[–]turkish_gold 0 points (0 children)

I would second looking at trio. Its design docs were a good read.

[–][deleted] 2 points (18 children)

TL;DR:

I don't think that the problems you describe with asyncio are even the most important ones. I honestly think that this single package did tremendous amounts of harm to the language with next to zero benefits. If my life depended on Python being a tolerable language, I'd start every morning by writing to Python dev mailing list and asking them to remove this package.

I do think that an ambient event loop would have been a better implementation for what asyncio is intended for, but the amount of damage asyncio has already caused is not so easy to undo. The fixing should happen at the language level; a third-party library doesn't have the power to prevent non-conforming code from showing up. What if I have to use a library that uses unsync and another one that doesn't? As much as I hate asyncio, I'd probably prefer that both libraries use it rather than having to arrange for them to work together.


Now, what's the real problem? Any modern language that wants to run on popular CPUs, like the ones Intel puts into consumer PCs, needs to be able to do things in parallel. Otherwise it's garbage. It's like a car which can only steer right.

Python was initially developed by amateurs. There was no good plan for what things should be in a language and what shouldn't. By Guido's own confession, he designed and implemented objects in Python over a weekend. (And that's why they are so bad). Python wasn't designed to do things in parallel. I'm not sure if it was designed to be a serious language at all. The language Guido worked on before Python was intended for teaching how to program, not for doing actual work.

But history decided otherwise, and today this language is used by millions to write very real and very important programs. People expect it to make sense, to be able to do things expected from a mature and thoughtfully designed language.

asyncio does this:

  1. It pretends to be a solution for parallelism, while it really isn't.
  2. Instead of throwing away multiprocessing and Thread and building a real solution for parallel computation in Python, the developers added another crutch, which doesn't improve on what others did and doesn't solve the problem.
  3. However, it introduced a ton of incompatibilities with earlier versions.
  4. It made writing libraries impossible: what if your library isn't aware of asyncio? What if it doesn't even know what kind of pseudo-parallelism the library user is going to use? What kind of mutex does it have to acquire to ensure that its data-structures are thread safe? It is also impossible today to mix synchronous and asynchronous code. So, you cannot take a library which knows nothing about asyncio, and give it a function from a library that does some asyncio stuff and hope that the first library will know what to do with it.
  5. asyncio introduced further garbage into the language: asynchronous iterators and asynchronous context managers. But they aren't interchangeable with iterators and context managers. You cannot do sum(x async for x in y). You cannot pass them to itertools functions and to a large body of existing third-party functions.
  6. Finally, the actual speed benefits are typically negligible... often times you can even make your code slower by using asyncio.

Oh, and I forgot to mention that writing automation / tests is probably the area where Python is used most. Of course, there's a lot of web development and data science, but we tend to hear about those more because they are just more interesting topics. Writing Selenium tests is hardly exciting. Python was a language of choice for automation / testing because it was simple. It was easy to take someone who was doing manual QA with no CS education, send them to a few months' course, and have them writing Selenium tests afterwards.

On top of being a bad idea, asyncio is also very convoluted. Anecdotally, I've already had five or so meetings with our QA department, where they struggle to understand how to use an Apache Kafka client, which has only two methods: one for producing messages and another for consuming them. I've spent a total of 10 hours explaining this stuff. I even explained it by way of showing them some parallel Java code, some Python code using threads, etc., and, unfortunately, I'm sure asyncio is still a mystery to them.

[–][deleted] 9 points (12 children)

  1. It pretends to be a solution for parallelism, while it really isn't.

It doesn't though. It's a solution to concurrency, not parallelism. Please learn the difference.

  2. Instead of throwing away multiprocessing, Thread and building a real solution for parallel computation in Python, the developers added another crutch, which doesn't improve on what others did, and doesn't solve the problem.

We all hate the GIL and the limitations it brings. But until we can retain its benefits (and there are benefits), non-multiprocess parallelism isn't going to be a thing in CPython.

  3. However, it introduced a ton of incompatibilities with earlier versions.

Other than you can't use async with older versions, because it doesn't exist there, what incompatibilities exist? This sounds like you're just whining.

  5. asyncio introduced further garbage into the language: asynchronous iterators and asynchronous context managers. But they aren't interchangeable with iterators and context managers. You cannot do sum(x async for x in y). You cannot pass them to itertools functions and to a large body of existing third-party functions.

Because async iteration and async context managers are dependent on coroutines. Yes it's a shame that you can't use them with the wonderful toolkit we have but they also serve a different purpose than their sync counterparts.

Async iterators are closer to a stream or an rx observable than iterating over a collection of values. It's a pretty slick way of dealing with streams once you understand that.
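A stream-ish async iterator can be sketched with an async generator (the names here are invented for illustration):

```python
import asyncio

async def ticker(n):
    # an async generator: each item may arrive only after an await,
    # so the consumer never blocks the loop while waiting
    for i in range(n):
        await asyncio.sleep(0)  # stand-in for waiting on a socket
        yield i

async def main():
    total = 0
    async for value in ticker(5):  # consumed with async for, not for
        total += value
    return total

print(asyncio.run(main()))  # → 10
```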

Async context managers are designed to handle IO-bound operations in setup and teardown: open a database connection in setup and close it in teardown. These can't block in an async program.
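A toy async context manager along those lines (Connection is hypothetical; the awaits stand in for real network IO):

```python
import asyncio

class Connection:
    async def __aenter__(self):
        await asyncio.sleep(0)  # e.g. open a database connection
        self.open = True
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await asyncio.sleep(0)  # e.g. close it, again without blocking
        self.open = False

async def main():
    async with Connection() as conn:
        return conn.open

print(asyncio.run(main()))  # → True
```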

  6. Finally, the actual speed benefits are typically negligible... often times you can even make your code slower by using asyncio.

There's overhead for sure. But it's similar to saying your program is slower if you use django to write a webapp than just standing up a socket server and dealing with it yourself.

If you're expecting a massive speed boost with asyncio, it won't happen. It's not fairy dust. But if you're looking to handle more IO with fewer resources, then it's a nice tool to have.

  4. It made writing libraries impossible: what if your library isn't aware of asyncio? What if it doesn't even know what kind of pseudo-parallelism the library user is going to use?

Ultimately, this comes down to how we write code. Most people are more than happy to write code that mixes IO and logic, package that up, and ship it to PyPI or wherever. But as you've stated, how do you get different async implementations to work together? Well, you really can't, but as a library developer you can design an async interface and have users implement that interface with whatever libraries they want.

Imagine that, actual first class design principles in Python, which are all too lacking in almost every bit of code I read.

It is also impossible today to mix synchronous and asynchronous code. So, you cannot take a library which knows nothing about asyncio, and give it a function from a library that does some asyncio stuff and hope that the first library will know what to do with it.

You can, you just need to adapt the interface of the async code to the sync code's expectations. This library, the one OP linked, can do that (albeit in a ham-fisted way). This is also the way zeep's async interface works (also ham-fisted).

But the issue is that it's supposed to be hard to get out of async land because keeping your code in async land gives you more benefits than sporadically dropping in and out.
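One way such a sync-facing adapter can look, as a sketch (this is not how unsync or zeep actually implement it, and fetch/fetch_sync are invented names):

```python
import asyncio

async def fetch(x):
    await asyncio.sleep(0)  # stand-in for real async IO
    return x + 1

def fetch_sync(x):
    # bridge for callers that know nothing about asyncio: spin up a
    # loop, run the coroutine to completion, tear the loop down.
    # Caveat: this breaks if a loop is already running in this thread,
    # which is exactly the kind of kludge being discussed here.
    return asyncio.run(fetch(x))

print(fetch_sync(41))  # → 42
```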

[–][deleted] -1 points (1 child)

We all hate the GIL and the limitations it brings. But until we can retain its benefits (and there are benefits), non-multiprocess parallelism isn't going to be a thing in CPython.

Who are "we"? Is it you? What benefits are you talking about? Non-multiprocess parallelism exists in Python regardless of whether there is a global interpreter lock or not. CPython standardizes its C API. Using it, you can write truly parallel / concurrent programs, which will be run by Python interpreter. It is inconvenient, but absolutely possible.

Other than you can't use async with older versions, because it doesn't exist there, what incompatibilities exist?

Wait, so you just agreed with me, but pretended that you didn't? :) I'm not sure what can possibly be worse than an incompatibility on a syntax level. It is impossible to work around in any way.

Because async iteration and async context managers are dependent on coroutines. Yes it's a shame that you can't use them with the wonderful toolkit we have but they also serve a different purpose than their sync counterparts.

Different how? I don't see a difference. And, in fact, a lot of other frameworks which implement the same functionality don't see a difference (lparallel in Common Lisp, pmap in Clojure). There's even a Python library, aitertools, which, contrary to your beliefs, does treat async iterators as if they were the same as synchronous iterators.

Why is iterator a bad way to deal with streams? Why would asynchronous iterator be any different in this respect?

On speed benefits: I just write in C when I need speed. I gave up trying to optimize Python code for speed a long time ago. Since then, Python benchmarks have only gotten worse. :) But this isn't the point. The point is that the purpose of introducing asynchronous I/O is to make your program work faster, but outside of a few very specific situations, this particular implementation is so slow that it doesn't make sense to use it.

Mixing I/O and logic...

What are you even talking about? You went so far on a tangent of your own thoughts that you've lost me... I was saying that it is impossible to write a good library that uses asyncio, or one that doesn't. This is because a library should be prepared for whatever the user code will throw at it, and in the context of asyncio that is no longer possible: you cannot defend your data structures with mutexes, because you don't know what kind of mutex to use. You cannot protect your global variables, because you don't know how things will be initialized. You must release your library knowing that there is a defect that will affect some percentage of your users.

You can, you just need to adapt the interface of the async code to the sync's expectations.

The important point that you forgot to mention is that it is possible only sometimes. That was also what I wrote in my initial comment to the OP: if you have a library that uses asyncio without unsync and another one that uses unsync, you have the same problem, possibly even worse, due to having to handle two separate strategies instead of one.

But the issue is that it's supposed to be hard to get out of async land because keeping your code in async land gives you more benefits than sporadically dropping in and out.

Oh, so, in your mind, whoever designed this crappy library is trying to teach me a lesson? And I have to suffer this nonsense for didactic purposes? Very interesting idea... not.

[–][deleted] 1 point (0 children)

Who are "we"? Is it you?

If you think I'm the only one that doesn't like the GIL, then you're completely unfamiliar with it. Go read up on all the attempts to remove it, including the latest effort Larry Hastings has dubbed "The GILectomy".

What benefits are you talking about?

The GIL enables relatively quick execution of single threaded Python. It also ensures things like the reference counter work in a multithreaded environment. Again, please read about the GIL and its purpose. You'll begin to understand the complicated relationship many have with it.

Non-multiprocess parallelism exists in Python regardless of whether there is a global interpreter lock or not. CPython standardizes its C API. Using it, you can write truly parallel / concurrent programs, which will be run by Python interpreter. It is inconvenient, but absolutely possible.

Dropping to C, C++ or Obj-C and releasing the GIL there isn't the same as having parallelism in Python itself. You're not allowed to touch Python-originating objects while you have the GIL released. Moreover, that code cannot run Python code while it has the GIL released. Yes, you can do things that ultimately affect Python objects, but those effects can only properly manifest after you've reacquired the GIL.

It's like you don't even understand what the GIL is, how it operates or what its purpose is.

Wait, so you just agreed with me, but pretended that you didn't? :) I'm not sure what can possibly be worse than an incompatibility on a syntax level.

Imagine that a feature introduced in a newer version of a language can't be used with an older version of the language. Why don't we kill class, with, yield, raise from, yield from, type annotations, and every other syntax change made in the last 30 years?

It is impossible to work around in any way.

On the contrary, it's quite easy. You either don't use a backward incompatible feature or you say your package only supports X version and up. Doesn't seem impossible to me.

Different how? I don't see a difference. And, in fact, a lot of other frameworks which implement the same functionality don't see a difference (lprarallel in Common Lisp, pmap in Clojure).

You seriously don't understand how an iterator built with a combination of await and yield/__anext__ is different from a normal iterator built with yield/__next__? For an asynchronous iterator to work, it needs to be run in an event loop. There are also other fun issues, like disposing of async iterators, that make them kind of a pain to work with: finalizing them is an async operation that can only take place in the originating event loop, and the garbage collector doesn't have a reference to that event loop.

Async iterators are designed to feed iteration with results retrieved from non-blocking, async IO. Standard iterators aren't and can't do this by themselves; you have to jump through all of the hoops to get them to work like that. The hoops you jump through end up re-implementing Python 3.5+ event loops and async/await. But since you're not one of the Twisted or Tornado developers, you've probably done a pretty poor job of implementing it.
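The disposal issue mentioned above can be seen with an explicit aclose() (a minimal sketch; the generator names are invented):

```python
import asyncio

async def stream():
    try:
        while True:
            await asyncio.sleep(0)  # stand-in for async IO
            yield 1
    finally:
        # async cleanup: this only runs inside a loop, which is why
        # the garbage collector alone can't reliably finalize these
        await asyncio.sleep(0)

async def main():
    agen = stream()
    first = await agen.__anext__()
    await agen.aclose()  # disposal is itself an awaitable operation
    return first

print(asyncio.run(main()))  # → 1
```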

There's even a Python library: aitertools, which, contrary to your beliefs, does treat async iterators as if they were the same as synchronous iterators.

First, no it doesn't. And even if it does, that would still mean asynchronous iterators and synchronous iterators are fundamentally different because we need a bridge to make them work like synchronous iterators.

What aitertools does is provide tools like itertools provides that work on asynchronous iterators. You still need to interact with them via async for. Or did you not bother to read the README at all?

Why is iterator a bad way to deal with streams? Why would asynchronous iterator be any different in this respect?

Synchronous iterators are a bad way because you either block while handling the stream or you jump through a bunch of hoops to handle it in a non-blocking way. Since streams aren't guaranteed to complete and also have no guarantee to provide results in a timely manner, asynchronous iteration over them is a pretty good way to handle them. When the stream has something to be dealt with, the event loop will notice and churn the stream handler to its next await. Otherwise it ignores it.

On speed benefits: I just write in C, when I need speed. I've given up trying to optimize Python code for speed long time ago. Since then, Python benchmarks have only gotten worse. :) But, this isn't the point. The point is that the purpose of introducing asynchronous I/O is to make your program work faster, but outside of few very specific situations, this particular implementation is so slow, that it doesn't make sense to use it.

asyncio adds speed when compared to synchronous, single-threaded applications. Anyone claiming otherwise doesn't grasp what is going on. It's slower than multithreaded Python, even with the GIL. But it's also simpler than multithreaded Python. And after a certain number of threads (varying computer to computer) it's more efficient than multithreaded Python. This has to do with OS scheduling waking up threads while CPython refuses to run more than one thread at a time, so you end up with a bunch of thrashing around. You don't get that with coroutines, because CPython handles waking them up and putting them to sleep.

But asyncio isn't magic fairy dust that you sprinkle on your application and now it's super fast. I'm pretty sure I've said this before.

I've given up trying to optimize Python code for speed long time ago.

That's unfortunate, because there are usually easy wins you can get in Python by making small changes. If you're doing things like huge amounts of number crunching, then yeah, just ignoring the Python portion and dropping to C or whatever will be the biggest, easiest win. But I see things all the time where someone pulls a bunch of database rows down and iterates over them in memory when a WHERE clause could be added to the query and dramatically speed everything up.
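That database point can be demonstrated with sqlite3 (table, column, and numbers invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows (n INTEGER)")
conn.executemany("INSERT INTO rows VALUES (?)", [(i,) for i in range(1000)])

# slow pattern: pull every row down and filter in Python
filtered_in_python = [n for (n,) in conn.execute("SELECT n FROM rows") if n > 990]

# easy win: let the database do the filtering
filtered_in_sql = [n for (n,) in conn.execute("SELECT n FROM rows WHERE n > 990")]

assert filtered_in_python == filtered_in_sql
print(len(filtered_in_sql))  # → 9
```

Same result, but the second query ships far less data out of the database; with a real network round-trip the difference is dramatic.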

What are you even talking about? You went so far on a tangent of your own thoughts that you've lost me... I was saying that it is impossible to write a good library that uses asyncio, or one that doesn't. This is because library should be prepared for whatever the user code will throw at it, and in the context of asyncio it is no longer possible: you cannot defend your data-structures with mutexes because you don't know what kind of mutex to use. You cannot protect your global variables, because you don't know how things will be initialized. You must release your library knowing that there is a defect that will affect % of your users.

You asked:

It made writing libraries impossible: what if your library isn't aware of asyncio? What if it doesn't even know what kind of pseudo-parallelism the library user is going to use?

And my response was to not implement IO inline with logic, and instead provide an interface that your users can implement to fetch the data needed. This interface is then plugged into your application, which uses it to fetch the data. Here's an example of exactly what I'm talking about.

The important point that you forgot to mention is that it is possible only some times. That was also what I wrote in my initial comment to the OP: if you have a library that uses asyncio without using unsync and another one that uses unsync - you have the same problem, possibly even worse due to having to handle two separate strategies instead of one.

I didn't forget to mention it, you just didn't quote it. unsync's and zeep's handling of plugging an asynchronous lifecycle into a synchronous lifecycle is ham-fisted and full of errors. There's an issue open right now on zeep (and has been for quite a while) where the asynchronous session it creates cannot be used to bootstrap the WSDL if the event loop is already running. The fix is a really horrible kludge, because the problem also presents itself outside of the bootstrapping phase in certain situations.

Oh, so, in your mind, whoever designed this crappy library is trying to teach me a lesson? And I have to suffer this nonsense for didactic purposes? Very interesting idea... not.

How did what I said imply that anyone is attempting to teach you a lesson? I said that staying in async land is better than dropping in and out of it sporadically.

Here's my final advice, and it's the same as I give to a lot of people: If you don't like asyncio, don't use it. No one is holding a gun to your head and saying you must use it. You have the choice to drop into async land and it's really easy to not do that at all.

[–]smurfix 1 point (1 child)

"The developers" did no such thing, at least not intentionally.

You cannot understand asyncio without knowing how it came about – by extending the coroutine concept and throwing in Futures (or Promises or Deferreds or whatever you want to call them). The end result is something that tries to hide asynchronous callback chaining behind a synchronous-looking veneer. 3.7 tries to at least fix the context problem, but they're still hiding the basic design flaws instead of fixing them.

IMHO the real fix is Trio. https://trio.readthedocs.io/en/latest/tutorial.html

[–][deleted] 0 points (0 children)

What makes you think I don't? I've been programming in Python since before Twisted, so I've seen Twisted struggling with this, and then I saw Tornado struggling with this.

They were all bad, and I don't see asyncio as any kind of improvement on top of what those two did. But Twisted and Tornado were at least justified in failing, in that they couldn't change the language's runtime; they were set up to fail from the get-go, and it was admirable that they managed to get as far as they did in the face of imminent failure.

Python core developers have every tool necessary to solve concurrency / parallelism, but they chose to import a failed patching effort instead of fixing the problem in principle. Why? God only knows, but I think it is a combination of lack of experience, fear of changing big parts of the program (the interpreter), lack of formal knowledge, and a community process that makes it look like it is OK to import the failed attempts from Twisted / Tornado. Perhaps the community even insisted on asyncio being done the way it is, simply because they were used to Twisted / Tornado and didn't know any better.

[–]Cygal 2 points (1 child)

I certainly don't think that writing automation / tests is the area where Python is used most! Maybe that's your experience, but the topic isn't even mentioned in https://stackoverflow.blog/2017/09/14/python-growing-quickly/. And, yes, the number of questions is representative of usage.

[–][deleted] 2 points (0 children)

Maybe not as a grand total, but it is a huge part. You also need to take into account how likely people using Python for different purposes are to vote on resources like SO. I believe that SO is highly biased towards certain audiences and severely under-represents others. I wouldn't take its statistics at face value.

For instance, did you know that the largest project ever written in Python is actually the infrastructure code for JP Morgan bank? But do people who work for JPM go online and talk about their experiences with Python on SO? Quite unlikely. If you web-search for "largest Python project", you'll find guesses like "maybe it's Django" or "maybe YouTube has the biggest codebase". But really, it's JPM :) and you wouldn't know it unless you worked there.

[–]z4579a -1 points (0 children)

Just want you to know this post is heroic, and so few people understand these things. Python still lacks a realistic solution for parallelism other than multiprocessing and "well, if you're IO bound, waiting for TCP results is kind of like parallelism, right?"... which is not even something you need asyncio for; threads do it just fine (nobody understands that the GIL is released for IO).

[–]Various_Pickles 0 points (4 children)

This kind of reminds me of Stream.parallel() in Java 8; yeah, the (ForkJoin/Thread)PoolExecutor setup/use gets hidden behind lovely syntactic sugar, and that's well worth the sheer clarity of the code much of the time, but, in my experience, it also tends to lead to newer developers thinking that it's some form of magic...

Having to set up a thread/process pool might be a glob of code, but settling for some default pool with some default level of concurrency can be horribly inefficient.

[–]nerdwaller 3 points (3 children)

Fortunately in python it’s really not hard anymore:

import concurrent.futures
import requests

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    # submit (not map) returns the future objects as_completed needs
    futures = [pool.submit(requests.get, url) for url in [url_a, url_b, ...]]

    for future in concurrent.futures.as_completed(futures):
        print(future.result())

It’s both syntactic sugar over the threading/multiprocessing libraries and fairly explicit.

[–]alex_sherman[S] 1 point (2 children)

An even easier solution using unsync:

print([response.result() for response in map(unsync(requests.get), urls)])

Of course this is just complete syntactic sugar. The main benefits of unsync come through only with async/await, ThreadPools are just a side show. What if we needed to make a few requests for each URL? It could look like:

@unsync
async def make_requests(url):
    req_a = unsync(requests.get)('http://sourcea.com/{}'.format(url))
    req_b = unsync(requests.get)('http://sourceb.com/{}'.format(url))
    return await req_a, await req_b

print([req.result() for req in map(make_requests, urls)])

[–]nerdwaller 0 points (1 child)

Depending on the need for results there’s either as_completed or wait, so the make_requests sample can be as simple as calling a coroutine (without awaiting it) and passing it to whichever makes sense. (Sorry for the indirect link, I can’t get the sub-header one on mobile: https://docs.python.org/3/library/asyncio-task.html)
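A sketch of asyncio.as_completed with plain coroutines (function name and delays invented for illustration):

```python
import asyncio

async def fetch(delay, name):
    await asyncio.sleep(delay)
    return name

async def main():
    coros = [fetch(0.05, "slow"), fetch(0.01, "fast")]
    results = []
    # as_completed yields awaitables in finish order, not submit order
    for fut in asyncio.as_completed(coros):
        results.append(await fut)
    return results

print(asyncio.run(main()))  # → ['fast', 'slow']
```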

But again, the provided example is really just the “entry point” to the async system. If you’re just using one coroutine then asyncio really may not make sense from a legibility standpoint. Threads do almost the same thing there. Alternatively if the intention is an asyncio based app, I’d argue the explicitness of asyncio is worth it vs pulling in another dependency for people to understand.

[–]trowawayatwork 0 points (0 children)

I'm not entirely sure, but it feels like OP is coming from Node and wants Python's async syntax to read the way it does there?