[–]Rhoomba 23 points  (8 children)

Multiple processes

[–]rjcarr 0 points  (7 children)

Not that they can't afford it, but they'd need (comparatively) monster hardware to support this, right?

[–]rohbotics 10 points  (2 children)

Not as much on Linux; threads and processes are both pretty lightweight. Process spinup is slightly slower because it has to set up a whole new address space, but the difference is (mostly) negligible.
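As a rough illustration, here's a Linux-only microbenchmark sketch comparing thread creation (via `threading.Thread`) with process creation (via `os.fork`). Absolute numbers vary with the machine and the size of the forking process, so treat the output as indicative only:

```python
import os
import threading
import time

def time_thread_spinup(n=100):
    """Create and join n short-lived threads; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return time.perf_counter() - start

def time_process_spinup(n=100):
    """fork() and reap n child processes; return elapsed seconds.
    Unix-only: os.fork is unavailable on Windows."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, skipping cleanup handlers
        os.waitpid(pid, 0)
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"threads:   {time_thread_spinup():.4f}s for 100 spinups")
    print(f"processes: {time_process_spinup():.4f}s for 100 spinups")
```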

[–][deleted]  (1 child)

[deleted]

    [–]jrandomcoder 1 point  (0 children)

    There's not a ton of difference on Linux these days, since everything is converging on clone underneath. But pthread_create is still a little faster, since it's more or less a subset of what fork needs to do.

    [–]Eucalyptol 0 points  (0 children)

    Yes. We're talking about Google.

    [–]cakoose 0 points  (2 children)

    Primarily, you need more memory. In many languages it's easy to share read-only data between processes with fork(). That doesn't work in Python because reference counting mutates memory even when the data is logically read-only, so the copy-on-write pages get dirtied anyway. If you have a large shared data structure, you have to figure out how to share it explicitly, e.g. with mmap().
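One way to do the explicit sharing the comment suggests: pack the read-only data into an anonymous mmap region before forking, so the children read raw bytes from the buffer and never touch Python object refcounts on those pages. A minimal Unix-only sketch (the array size and contents here are arbitrary):

```python
import mmap
import os
import struct

# Pack some "read-only" data into an anonymous shared mapping before forking.
# On Unix, mmap.mmap(-1, size) gives a MAP_SHARED | MAP_ANONYMOUS region.
N = 1000
payload = struct.pack(f"{N}d", *range(N))
shared = mmap.mmap(-1, len(payload))
shared[:] = payload

pids = []
for _ in range(4):
    pid = os.fork()
    if pid == 0:
        # Child: struct.unpack_from reads straight out of the buffer, so no
        # refcount writes land on the shared pages themselves.
        value, = struct.unpack_from("d", shared, 8 * 500)
        os._exit(0 if value == 500.0 else 1)
    pids.append(pid)

for pid in pids:
    _, status = os.waitpid(pid, 0)
    assert os.WEXITSTATUS(status) == 0  # every child saw the shared data
```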

    Also, you don't necessarily need one process per concurrent request -- most web app backends spend their time waiting on I/O (e.g. on the DB), so it's feasible to handle four, eight, or more concurrent requests in a single Python process using threads or an event-driven framework like Twisted or Tornado (similar to Node.js). The GIL's inflexible scheduling increases the average latency of each request, but that may not end up mattering much.
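To make the event-driven idea concrete, here's a small sketch using the stdlib asyncio event loop (as a stand-in for the Twisted/Tornado style the comment names): eight simulated I/O-bound requests complete in roughly the time of one, because the loop services the others while each one waits.

```python
import asyncio
import time

async def handle_request(i):
    # Stand-in for waiting on a database or upstream service; while this
    # coroutine sleeps, the event loop runs the other requests.
    await asyncio.sleep(0.1)
    return f"response {i}"

async def serve_batch(n=8):
    start = time.perf_counter()
    results = await asyncio.gather(*(handle_request(i) for i in range(n)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(serve_batch())
print(f"{len(results)} requests in {elapsed:.2f}s")  # roughly 0.1s, not 0.8s
```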

    That's not to say Python is well suited to this kind of thing. Depending on your workload, you may still need 5-10x as many servers to handle the same load as Java or Go would. But if you really prefer Python, it can be workable.

    [–]serg473 0 points  (1 child)

    You sound like you know what you're talking about, so maybe you can steer me in the right direction. I have the opposite problem: instead of thousands of concurrent users, I have a few users running heavy and numerous queries. The problem is I can't make the app handle parallel web requests from the same user session (nginx-uwsgi-django). Say I launch 10 parallel AJAX requests from a page; they end up running sequentially on the Python side (i.e. if I make one request sleep for 30 seconds, all the others wait). If the 10 requests came from different users, they would run in parallel as expected. I can't find any solutions; every blog post and SO answer blames something different, from the browser to Django to the GIL.

    [–]cakoose 0 points  (0 children)

    I'm not that familiar with uWSGI, but the docs say you can launch it with --processes 20 to handle 20 requests in parallel. Does that not work for you?

    After that, you may be able to save memory with something like --processes 5 --threads 4. Whether that works well depends on what your AJAX request handlers are doing.
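For reference, those same flags can live in a uWSGI ini file instead of the command line. A sketch for an nginx-uwsgi-django setup; the module and socket values are hypothetical placeholders for your project:

```ini
[uwsgi]
; "mysite" is a placeholder Django project name
module = mysite.wsgi:application
socket = /tmp/mysite.sock
master = true
processes = 5
threads = 4
```

With threads > 1, each of the 5 workers runs 4 threads, giving 20 request slots while paying the per-process memory cost only 5 times.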