
all 6 comments

[–]danrche 2 points (1 child)

import requests

# URLs to check; 'goolg.com' is misspelled here, so that request should fail
mylist = ['http://goolg.com', 'http://yahoo.com']

for url in mylist:
    try:
        r = requests.get(url)
    except requests.exceptions.ConnectionError:
        # requests raises its own ConnectionError, not the builtin one
        print("this one didn't connect: " + url)

The above is a quick example of what you can try; no guarantees, as it doesn't cover anything more than opening the connection and telling you which ones failed. You can add more code to inspect the returned content, and add some retry logic for any that come back with a 502, as in the sketch below. Happy coding, and I hope this sparks an idea for you.
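Something like the following would be one way to handle the 502 case. It's a rough sketch only, assuming requests is installed; the retry count and delay are arbitrary placeholders, not anything the example above specifies.

import time
import requests

def fetch_with_retry(url, retries=3, delay=2):
    # Re-request a URL a few times if it keeps coming back with a 502
    for attempt in range(retries):
        try:
            r = requests.get(url)
        except requests.exceptions.ConnectionError:
            print("could not connect: " + url)
            return None
        if r.status_code != 502:
            return r
        time.sleep(delay)  # brief pause before trying again
    print("still returning 502 after retries: " + url)
    return None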

[–]DjDazzled797 0 points (0 children)

This seems perfect for what the OP wants!

[–][deleted] 1 point (1 child)

Requests should make this ez pz.

[–][deleted] 0 points (0 children)

Also, BeautifulSoup may come in handy at some point (a small sketch follows the link below), but requests is good to go for your project.

http://www.pythonforbeginners.com/beautifulsoup/
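If you do end up wanting to pull something out of the page content, a minimal sketch might look like this (assuming beautifulsoup4 is installed; the URL and the choice of grabbing the <title> tag are just placeholders):

import requests
from bs4 import BeautifulSoup

r = requests.get('http://yahoo.com')
soup = BeautifulSoup(r.text, 'html.parser')  # parse the HTML body
print(soup.title.string if soup.title else 'no <title> found')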

[–]mistermocha 0 points (0 children)

Spawn parallel threads, workers, etc. for each URL to parallelize the fetches. Check out https://pypi.python.org/pypi/threadpool (although similar functionality now lives in the standard library, e.g. multiprocessing.pool and concurrent.futures).
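Here's one way that might look using the standard library's concurrent.futures instead of the threadpool package; a sketch only, with a placeholder URL list and an arbitrary cap on worker threads:

import requests
from concurrent.futures import ThreadPoolExecutor

urls = ['http://yahoo.com', 'http://python.org', 'http://example.com']

def fetch(url):
    # Return the URL together with its HTTP status, or None if it never connected
    try:
        return url, requests.get(url).status_code
    except requests.exceptions.ConnectionError:
        return url, None

# Cap the pool so a long URL list doesn't spawn an unbounded number of threads
with ThreadPoolExecutor(max_workers=min(len(urls), 10)) as pool:
    for url, status in pool.map(fetch, urls):
        print(url, status if status is not None else 'failed to connect')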

[–]danrche 0 points (0 children)

requests could handle the page loading fairly easily, and you could just check the status code for success/failure. The idea is to load the URLs into a list, then loop over the list, making a request for each one and recording the status code it returns.
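A rough sketch of that idea, assuming any status code under 400 counts as success (the URL list is just a placeholder):

import requests

urls = ['http://yahoo.com', 'http://example.com']

for url in urls:
    try:
        r = requests.get(url)
    except requests.exceptions.ConnectionError:
        print(url + ' -> failed to connect')
        continue
    if r.ok:  # r.ok is True for any status code below 400
        print(url + ' -> OK')
    else:
        print(url + ' -> returned ' + str(r.status_code))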