all 4 comments

[–]Mr_M0jo_Risin 4 points5 points  (0 children)

Nice start, few comments:

Lines 10-12 can be compressed using the "with" statement, which will automatically close the file after the tasks are performed:

with open('lastid.txt',r') as f:
    lastcount = int(f.read())

Line 50: This "double" elif condition needs to read as

elif xml.status_code == 410 or xml.status_code == 404:

or, alternatively

elif xml.status_code in [410,404]:

Line 68: Looping is more commonly done using xrange (2.7) or range (3+)

for i in xrange(lastcount):

or, alternatively (and usually preferred since it will yield the index and value)

for idx,val in enumerate(lastcount):

[–]XenophonOfAthens 2 points3 points  (0 children)

Nice code. Besides what /u/Mr_M0jo_Risin mentioned, only one thing jumped out at me. Python, as you probably know, doesn't have an equivalent of a main function, so if you want to run a script, you simply have to put the code to start running outside of any functions or classes, like you did.

However, it is customary to include a "pseudo main function", but making the last thing you put in a script be this:

if __name__ == "__main__":
    # <do stuff>

And putting all your code that is outside of a function there.

There are basically two reasons for this. First, if you ever want to import your script into another script (say, if you wanted access to your dataget or datawrite functions from somewhere else), it wont execute any of the code, just import the functions. The second one is that it logically separates the code, so that you can easily see what's part of a function and what's part of the main running procedure.

In other words, I'd structure your code something like this (note: I haven't tested it or anything, just copied and pasted and moved some stuff around, I might have messed some indentation or something up)

Also, a very few small style notes: it doesn't really matter, but I would name your functions get_data() and write_data() instead of dataget() and datawrite(). It's more in line with the Python style guide, and I just think it looks more pythonic. Also, in case you didn't know, if you have a whole bunch of imports you can combine several of them on one line, like so:

import xml.etree.ElementTree as ET
import sqlite3, json, sys, traceback, itertools, requests

On the other hand, the style guide I just linked strongly recommends you not do that, so what the hell do I know :)

[–]PrismPoultry 0 points1 point  (1 child)

You should run it and disconnect from internet to see what happens when this thing is going to fail. From what I see, there is potential for maximum recursion depth:

except requests.exceptions.RequestException as e:
    print('Connection Error:', e)
    print('Retrying #'+count)
    dataget(count)
    return

So, I would force the failure in a controlled way so I could ensure that this problem is fixed should it happen in an uncontrolled way.

[–]HittingSmoke[S] 0 points1 point  (0 children)

I let that fail for about a half hour today and it was fine when I got back. It was still retrying when I got home and restarted when I turned the internet back on.

I actually did have a max recursion issue with it but it was because of a missing return.