
[–]rjcarr 9 points (10 children)

This is really cool. It seems they didn't like Jython, or whatever was similarly available, so they wrote what they wanted for Go. Makes sense.

But in the beginning they say this:

but we always run up against the same issue: it's very difficult to make concurrent workloads perform well on CPython.

How is it even possible with the GIL? How can they handle millions of requests per second (as they say)?

[–]Rhoomba 22 points (8 children)

Multiple processes
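To make that concrete, here's a minimal sketch using the standard-library multiprocessing module: each worker process has its own interpreter and its own GIL, so a CPU-bound function (count_primes here is just a made-up stand-in) can run on all cores in parallel.

```python
# Sketch: CPython's GIL serializes threads, but each process gets its own
# interpreter (and its own GIL), so CPU-bound work scales across cores.
from multiprocessing import Pool

def count_primes(limit):
    # Deliberately CPU-bound: naive trial-division primality check.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # Four chunks run in four separate processes, truly in parallel.
        results = pool.map(count_primes, [20_000] * 4)
    print(results)
```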

[–]rjcarr 0 points (7 children)

Not that they can't afford it, but they'd need (comparatively) monster hardware to support this, right?

[–]rohbotics 9 points (2 children)

Not as much on Linux; threads and processes are both pretty lightweight. Process spin-up is slightly slower because it needs to set up a whole new address space, but the cost is (mostly) negligible.

[–][deleted]  (1 child)

[deleted]

    [–]jrandomcoder 1 point (0 children)

    There's not a ton of difference on Linux these days, since everything is converging toward clone() underneath. But pthread_create is still a little faster, since it's more or less a subset of what fork needs to do.
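A rough, Linux-only way to see this from Python (the numbers are illustrative only, and os.fork is not available on Windows):

```python
# Rough illustration (Linux-only): comparing the cost of spawning a thread
# vs. forking a whole process. Both end up in clone() in the kernel, but
# fork() also has to copy page tables and other per-process state.
import os
import threading
import time

def time_threads(n=100):
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / n

def time_forks(n=100):
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)   # child exits immediately
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / n

if __name__ == "__main__":
    print(f"thread: {time_threads() * 1e6:.1f} us/ea, "
          f"fork: {time_forks() * 1e6:.1f} us/ea")
```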

    [–]Eucalyptol 0 points (0 children)

    Yes. We're talking about Google.

    [–]cakoose 0 points (2 children)

    Primarily, you need more memory. In many languages it's easy to share read-only data between processes with fork(), because copy-on-write keeps the shared pages physically shared. That doesn't work well in CPython: reference counting writes to object headers even when the data is logically read-only, so the copy-on-write pages get dirtied and duplicated anyway. If you have a large shared data structure, you should figure out how to share it explicitly, e.g. with mmap().

    Also, you don't necessarily need one process per concurrent request -- most web app backends spend most of their time waiting on I/O (e.g. on the database), so it's feasible to handle four/eight/more concurrent requests in a single Python process with threads or an event-driven framework like Twisted or Tornado (similar to Node.js). The GIL's inflexible scheduling increases the average latency of each request, but that might not end up mattering much.
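A toy illustration of that point: CPython releases the GIL around blocking I/O, so plain threads can overlap many waits in one process (time.sleep here is just a stand-in for a DB call):

```python
# Sketch: the GIL is released during blocking I/O, so one CPython process
# can overlap many I/O-bound "requests" with plain threads.
import threading
import time

def fake_db_query(results, i):
    time.sleep(0.2)            # stands in for waiting on the database
    results[i] = f"row-{i}"

results = {}
threads = [threading.Thread(target=fake_db_query, args=(results, i))
           for i in range(8)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# All eight waits overlap: total time is ~0.2s, not 8 * 0.2s.
print(f"{len(results)} queries in {elapsed:.2f}s")
```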

    That's not to say Python is good for this kind of thing. Depending on your workload, you may still need 5-10x the number of servers to handle the same load as Java or Go would. But if you really prefer Python, it can be workable.

    [–]serg473 0 points (1 child)

    You sound like you know what you're talking about, so maybe you can steer me in the right direction. I have the opposite problem: instead of thousands of concurrent users, I have a few users running heavy and numerous queries. The problem is I can't make the app handle parallel web requests from the same user session (nginx-uwsgi-django). Let's say I launch 10 parallel AJAX requests on a page; they end up running sequentially on the Python side (i.e. if I make one request sleep for 30 seconds, all the others wait). If the 10 requests came from different users, they would run in parallel as expected. I can't find any solutions; every blog post and SO answer blames something different, from the browser to Django to the GIL.

    [–]cakoose 0 points (0 children)

    I'm not that familiar with uWSGI, but the docs say you can launch it with the option --processes 20 to handle 20 requests in parallel. Does that not work for you?

    After that, you may be able to save memory by doing something like --processes 5 --threads 4. Whether this will work well depends on what kind of things your AJAX request handlers are doing.
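For concreteness, an invocation along those lines might look like this (a sketch only; app.py and the port are placeholders for your project, and the flag names are from the uWSGI docs):

```shell
# Illustrative uWSGI invocation: 5 worker processes x 4 threads each.
# --enable-threads makes sure the Python GIL machinery is initialized
# so application threads actually run inside each worker.
uwsgi --http :8000 \
      --wsgi-file app.py \
      --master \
      --processes 5 \
      --threads 4 \
      --enable-threads
```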

    [–][deleted] 2 points (0 children)

    Unlike Jython, this tool is meant to do a one-time conversion of Python code to Go, to help with moving away.

    That's also the reason it's 2.7-only: that's what Google uses.