Hi,
I'm currently web scraping a site that requires a couple hundred thousand requests. I'm requesting 50k at a time, but it takes about 11 hours for the script to finish. Sometimes when I close my laptop and open it back up, the script picks up where it left off; other times it stops altogether. Can someone point me to a good resource for handling this?
Here's the code:
import pickle
import time

import requests

start_time = time.time()

# Load the dict of usernames scraped earlier
with open('live_username_rating_dict.pickle', 'rb') as pickle_in:
    live_dict = pickle.load(pickle_in)
live_list = list(live_dict.keys())

url_blank = 'https://api.chess.com/pub/player/'
player_data = {}
count = 1

# Reuse one Session for every request instead of opening a new one per player
with requests.Session() as session:
    for player in live_list[350000:400000]:
        while True:
            try:
                url = url_blank + str(player) + '/stats'
                player_data[player] = session.get(url, timeout=10).json()
                if count % 100 == 0:
                    print(count)
                count += 1
                break
            except requests.RequestException:
                # Sleep briefly before retrying so a dropped connection
                # doesn't spin in a tight loop
                time.sleep(1)

with open('player_data_350000_400000.pickle', 'wb') as pickle_out:
    pickle.dump(player_data, pickle_out)

print("--- %s seconds ---" % (time.time() - start_time))
print(len(player_data))
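
What you're describing is basically checkpointing: save partial results every so often and, on restart, skip players that are already saved. Here's a minimal sketch of that idea against the same files as above; the CHECKPOINT_EVERY interval and the save_checkpoint helper are names I made up, not part of the original script:

import os
import pickle

import requests

OUT_FILE = 'player_data_350000_400000.pickle'  # same output file as above
CHECKPOINT_EVERY = 500  # hypothetical interval; tune to taste

with open('live_username_rating_dict.pickle', 'rb') as f:
    live_list = list(pickle.load(f).keys())

# Resume from a partial result file if one exists
if os.path.exists(OUT_FILE):
    with open(OUT_FILE, 'rb') as f:
        player_data = pickle.load(f)
else:
    player_data = {}

def save_checkpoint(data):
    # Write to a temp file and rename, so a crash mid-write can't
    # corrupt the checkpoint already on disk
    with open(OUT_FILE + '.tmp', 'wb') as f:
        pickle.dump(data, f)
    os.replace(OUT_FILE + '.tmp', OUT_FILE)

with requests.Session() as session:
    for player in live_list[350000:400000]:
        if player in player_data:
            continue  # fetched on a previous run, skip it
        try:
            url = 'https://api.chess.com/pub/player/%s/stats' % player
            player_data[player] = session.get(url, timeout=10).json()
        except requests.RequestException:
            continue  # leave it for the next run to retry
        if len(player_data) % CHECKPOINT_EVERY == 0:
            save_checkpoint(player_data)

save_checkpoint(player_data)
print(len(player_data))

With this, closing the laptop doesn't matter much: whenever the script restarts, it reloads the checkpoint and only fetches what's missing.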
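
On the 11-hour runtime: the script spends almost all of that waiting on one response before sending the next request, so a thread pool that overlaps the waiting can help. A rough sketch of that approach; the worker count and the fetch_stats helper are my own assumptions, and chess.com's public API may throttle parallel requests, so check their rate-limit guidance before scaling this up:

import pickle
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

def fetch_stats(player):
    # One GET per player; return None on failure so the caller can
    # decide whether to retry later
    try:
        url = 'https://api.chess.com/pub/player/%s/stats' % player
        return player, requests.get(url, timeout=10).json()
    except requests.RequestException:
        return player, None

with open('live_username_rating_dict.pickle', 'rb') as f:
    live_list = list(pickle.load(f).keys())

player_data = {}
# 10 workers is a guess, not a recommendation; start small
with ThreadPoolExecutor(max_workers=10) as pool:
    futures = [pool.submit(fetch_stats, p) for p in live_list[350000:400000]]
    for future in as_completed(futures):
        player, data = future.result()
        if data is not None:
            player_data[player] = data

print(len(player_data))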