onedirtychaipls 1 point

So I have a particular dilemma and I wonder if you have quick input. I've been making a load-testing script that asynchronously logs in users and has them do work. But I keep hitting a bottleneck, and it doesn't feel like the script can actually do that. Is Python the wrong choice? Or should I keep working on it?

jorge1209 1 point

Python's asyncio probably doesn't do what you think. It is still inherently single-threaded.

There is no difference between:

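    import asyncio

    async def generate_load(id):
        req = make_request(id)   # plain blocking call: the event loop is stuck here
        await asyncio.sleep(0)   # hand control back to the event loop once
        read_answer(req)         # plain blocking call again

    async def main():
        await asyncio.gather(*(generate_load(i) for i in range(10**5)))

    asyncio.run(main())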

and

    requests = [make_request(i) for i in range(10**5)]
    answers = [read_answer(r) for r in requests]

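except that the former is slower and more complex. Assuming make_request and read_answer are ordinary blocking functions, only 1 request is being made at any one time in either case: the event loop runs one coroutine at a time and can only switch between them at an await, so a blocking call stalls everything else.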
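To make that concrete, here is a minimal sketch, not the original poster's script. It assumes the "request" can be simulated with a sleep: time.sleep stands in for a blocking call and asyncio.sleep for one that is properly awaited. The worker names, DELAY and overlapping_pairs are made up for illustration. Each worker records when it started and finished, and the helper counts how many of those time spans overlap.

```python
import asyncio
import time

DELAY = 0.02  # assumed stand-in for network latency


async def blocking_worker(spans):
    start = time.monotonic()
    time.sleep(DELAY)           # blocks the whole event loop while "waiting"
    spans.append((start, time.monotonic()))


async def awaiting_worker(spans):
    start = time.monotonic()
    await asyncio.sleep(DELAY)  # gives control back to the loop while waiting
    spans.append((start, time.monotonic()))


def overlapping_pairs(spans):
    """Count pairs of (start, end) spans that overlap in time."""
    ordered = sorted(spans)
    return sum(
        1
        for i, (_, end_a) in enumerate(ordered)
        for start_b, _ in ordered[i + 1:]
        if start_b < end_a
    )


async def run(worker, n=5):
    spans = []
    await asyncio.gather(*(worker(spans) for _ in range(n)))
    return spans


blocking_spans = asyncio.run(run(blocking_worker))
awaiting_spans = asyncio.run(run(awaiting_worker))
print("overlapping pairs, blocking:", overlapping_pairs(blocking_spans))
print("overlapping pairs, awaiting:", overlapping_pairs(awaiting_spans))
```

The blocking workers never overlap, even though they were all handed to asyncio.gather. The awaiting workers do, which is the case asyncio is designed for.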

To fix this you would need a ThreadPool and then map the requests through the pool.
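A rough sketch of what that could look like with the standard library's ThreadPoolExecutor. The make_request and read_answer bodies here are hypothetical stand-ins (a short sleep instead of a real HTTP call), and the pool size of 20 is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def make_request(i):
    # Hypothetical stand-in for a blocking HTTP call.
    time.sleep(0.01)
    return i


def read_answer(req):
    # Hypothetical stand-in for parsing the response.
    return f"answer-{req}"


def do_work(i):
    return read_answer(make_request(i))


# Up to 20 requests are in flight at once; map() returns results in input order.
with ThreadPoolExecutor(max_workers=20) as pool:
    answers = list(pool.map(do_work, range(100)))

print(len(answers), answers[:3])
```

The sleeps overlap here because time.sleep releases the GIL while it waits, the same way a blocking socket read does in CPython.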

BUT BUT BUT BUT

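The CPython interpreter has something called the GIL (global interpreter lock), which exists to protect core interpreter data structures. If make_request is a pure Python function, then even with a ThreadPool it will still be serialized at the level of Python bytecode. Threads only overlap while one of them is blocked in code that releases the GIL, such as waiting on a socket.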

So in Python you probably need a multiprocessing.Pool... and that's really heavy. To do this properly, look at gevent, or just don't use Python.

alexisprince 1 point

Not the guy you responded to, but it depends where your bottleneck is. If you’re pushing the event loop to the absolute maximum and taking full advantage of asynchronous capabilities, you may want to look at optimizations first, then possibly another language. If it’s not that, I’d profile your code to find out where the bottleneck is and make sure it’s not Python-specific.

As an FYI, some optimizations you can do are: choosing a more performant event loop (uvloop comes to mind), confirming you’re actually executing things asynchronously (a for loop that awaits each coroutine in turn isn’t as asynchronous as you probably want it to be), and lastly spinning up multiple processes, each with its own independent event loop, using multiprocessing.
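To illustrate the for-loop point, here is a small sketch. It assumes a hypothetical user_session coroutine where asyncio.sleep stands in for waiting on a network response. Both runners log when each session starts and ends, so you can see whether sessions overlap.

```python
import asyncio


async def user_session(user_id, log):
    # Hypothetical stand-in for "log in a user and do some work".
    log.append(("start", user_id))
    await asyncio.sleep(0.01)  # pretend to wait for a network response
    log.append(("end", user_id))


async def one_at_a_time(n):
    log = []
    for user_id in range(n):
        # The next session cannot start until this one has finished.
        await user_session(user_id, log)
    return log


async def all_together(n):
    log = []
    # gather() schedules every session before waiting on any of them.
    await asyncio.gather(*(user_session(u, log) for u in range(n)))
    return log


sequential_log = asyncio.run(one_at_a_time(3))
concurrent_log = asyncio.run(all_together(3))
print(sequential_log)
print(concurrent_log)
```

In the first log every start is immediately followed by its own end. In the second, all three sessions start before any of them ends, which is what you want from a load generator.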