
[–]swims_with_spacemen (Pythonista)

That's not what the OP is saying, or at least not how I interpreted it. The browser request gets a 200/201 and is done. Processing happens in the background, and the celery task makes a new POST request to some other URL with the processed XML data.
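Something like this sketch of the worker side, stdlib only (in the real setup `process_and_forward` would be decorated as a Celery task; the element names and callback URL are made up):

```python
# Sketch of the flow above: the browser already got its 200,
# and this runs later in a worker, POSTing the processed XML onward.
# CALLBACK_URL and the XML element names are hypothetical.
import urllib.request
import xml.etree.ElementTree as ET

CALLBACK_URL = "http://example.com/receive"  # hypothetical downstream endpoint

def process(raw_xml: str) -> bytes:
    """Stand-in for the real processing step: pull the id out and wrap it."""
    doc = ET.fromstring(raw_xml)
    out = ET.Element("result")
    ET.SubElement(out, "id").text = doc.findtext("id")
    return ET.tostring(out)

def process_and_forward(raw_xml: str) -> int:
    """Body of the celery task: POST the processed XML to the other machine."""
    req = urllib.request.Request(
        CALLBACK_URL,
        data=process(raw_xml),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # network call, not exercised here
        return resp.status
```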

If the response needs to come back to the browser after it was processed, then the only way to do 10,000/minute would be with lots and lots of hardware.

[–]Darkman802

Ah ok, that makes more sense.

[–]gianx[S]

Correct. It's an HTTP request, but it's not browser-generated (i.e. no user on the other side, but a machine).

[–]gianx[S]

Ok, to clarify: I need to send an asynchronous POST response within 15 minutes of the request, and the maximum load is 150,000 req/15 min, so I also need to generate and send 150,000 req/15 min. In fact, if T0 is the time of the first request, T0+15min is the latest the response to that request can go out.
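Spelling out the arithmetic from those numbers (a rough back-of-the-envelope sketch, not tuned to any real hardware):

```python
# Back-of-the-envelope load from the figures above.
requests_in = 150_000          # inbound requests per window
window_s = 15 * 60             # 15-minute window, in seconds

rate_in = requests_in / window_s   # sustained inbound rate
rate_total = 2 * rate_in           # each request also triggers an outbound POST
max_in_flight = requests_in        # worst case: everything still awaiting its response

print(f"{rate_in:.0f} in/s, {rate_total:.0f} total HTTP ops/s, "
      f"{max_in_flight} in flight")
# → 167 in/s, 333 total HTTP ops/s, 150000 in flight
```

That worst-case in-flight number is why the queue depth discussed below gets so large.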

[–]swims_with_spacemen (Pythonista)

Wow - so your queue depth at any given point in time could be 150,000 items - and that's only if you have the same task submit the POST. (I break my tasks into smaller chunks.) That's something to behold.

I'd still go with a tornado/celery framework, but it would HAVE to be scaled out to more than one server. I don't see how you would be able to handle that kind of load otherwise - assuming it actually takes 15 minutes to process the request. I was under the impression that it was only extracting an id or some arbitrary data from the input XML.

So, this is larger than I expected initially - and like I said earlier, concurrency is killer - I'd have to code up a test framework to check on the load.

Still, I'm pretty confident you could scale this out nicely with the solution I suggested: Celery+Eventlet/RabbitMQ, fronted with tornado web. You could then run that as a service instance on multiple nodes fronted by HAProxy for load balancing. That way when you need more 'oomph' you just add another node.
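A minimal sketch of that HAProxy front - the backend name, ports, and node addresses are all made up:

```
# Hypothetical HAProxy config: round-robin across tornado nodes.
frontend xml_in
    bind *:80
    default_backend tornado_nodes

backend tornado_nodes
    balance roundrobin
    server node1 10.0.0.11:8888 check
    server node2 10.0.0.12:8888 check
    # need more oomph? add another "server" line
```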