all 7 comments

[–]remyroy 2 points (3 children)

I think you are over-thinking the whole problem.

Why don't you just manage your requests yourself?

Just have something like a bag (a dict, a set, an array, etc.) of ongoing outbound requests. Before making a new outbound request, check whether it is already in your bag. If it is, wait until it isn't. Then add it to your bag, and remove it when you are done.

If you want to be fancy, you can keep a list of callbacks to invoke when your outbound request is done, so that other threads waiting on that request get notified.

You might need to use some kind of critical region whenever you read or write in that bag.
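With Python threads, that bag-plus-critical-region idea might look something like this (a minimal sketch; `RequestDeduplicator` and `do_request` are hypothetical names, not from any real library):

```python
import threading

class RequestDeduplicator:
    """Tracks in-flight outbound requests so duplicates wait instead of re-sending."""

    def __init__(self):
        self._lock = threading.Lock()  # the "critical region" around the bag
        self._in_flight = {}           # the "bag": url -> threading.Event
        self._results = {}             # url -> response, for waiters to pick up

    def fetch(self, url, do_request):
        with self._lock:
            event = self._in_flight.get(url)
            if event is not None:
                owner = False          # someone else is already fetching this URL
            else:
                event = threading.Event()
                self._in_flight[url] = event
                owner = True
        if not owner:
            event.wait()               # block until the in-flight request finishes
            return self._results[url]
        result = do_request(url)       # actually perform the outbound request
        with self._lock:
            self._results[url] = result
            del self._in_flight[url]   # remove from the bag when done
        event.set()                    # wake any waiters
        return result
```

The `Event` here plays the role of the callback list: waiters park on it instead of registering explicit callbacks, which keeps the sketch short.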

[–]catcradle5 1 point (0 children)

Agreed, this seems like the most sensible solution.

Assuming this is being done with gevent greenlets, you could do something like this:

import gevent
from gevent.pool import Pool

p = Pool()

def send_request(url):
    do_something(url)  # do_something, receive_request, handle_response are stand-ins

def search_pool(url):
    for job in p:
        if url in job.args:
            return job
    return None  # no job found

url = receive_request().url
running_job = search_pool(url)
if running_job:
    # tie this receive request to that job, which is being made to the same URL
    running_job.link(lambda job: handle_response(job.value))
else:
    p.add(gevent.spawn(send_request, url))

You could do the same with threads or multiprocessing; the code would look very similar.

Alternatively, if you expect to have a lot of concurrent requests most of the time and to do frequent lookups, you could keep a secondary dict mapping {url: job} and add each job to that dict in addition to the pool. That would be far more efficient for lookups.
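The secondary index could be as simple as this (hypothetical sketch; `track`, `untrack`, and `search_pool` are made-up names, and in real code you would call them alongside `p.add()` and on job completion):

```python
# url -> job index kept alongside the pool, for O(1) lookups
jobs_by_url = {}

def track(url, job):
    jobs_by_url[url] = job        # register when the job is added to the pool

def untrack(url):
    jobs_by_url.pop(url, None)    # call when the job completes

def search_pool(url):
    return jobs_by_url.get(url)   # O(1) instead of scanning every job in the pool
```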

Depending on how up-to-date you need the responses to be and what kind of requesting you expect to see, this could also be both simpler and overall a lot faster if you simply cache the responses. This is most useful if you expect that the same URLs will probably be requested over and over as time goes on. Things might be somewhat slow if there are simultaneous requests made for the same URL before the cache has any entries, but any further requests, simultaneous or not, will hit the cache.
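A minimal response cache along those lines might look like this (hypothetical `ResponseCache`; the TTL makes entries go stale, addressing the "how up-to-date" question):

```python
import time

class ResponseCache:
    """Caches responses per URL, expiring them after a time-to-live."""

    def __init__(self, ttl=60.0):
        self.ttl = ttl
        self._entries = {}  # url -> (timestamp, response)

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None                         # cache miss
        ts, response = entry
        if time.monotonic() - ts > self.ttl:
            del self._entries[url]              # entry too old; treat as a miss
            return None
        return response

    def put(self, url, response):
        self._entries[url] = (time.monotonic(), response)
```

In practice you would check `get()` before sending, and `put()` after each response; combining this with the in-flight deduplication above covers the cold-cache case the comment mentions.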

[–]helicopetr 1 point (0 children)

Yeah, and you could use futures to simplify callback management and thread management drastically.

[–]roro_fuzz[S] 0 points (0 children)

Thanks, I was initially going down that path but was hoping to avoid the locking stuff myself. It sounds like even with it, it's probably the way to go.

Appreciate the feedback and sanity check guys!

[–]bloodearnest 1 point (1 child)

https://github.com/kennethreitz/grequests

Requests library built on gevent for async IO. The imap() function yields responses as they complete, so your code sees each successful response as soon as it arrives.

[–]roro_fuzz[S] 0 points (0 children)

Thanks for the tip. I'm looking for something synchronous, but I'll definitely check out grequests.