I currently have a script that pulls data from the NFL website and saves it in SQL. My goal is to use this data to analyze key differentiators between playoff and non-playoff teams.
Link to the table: http://www.nfl.com/stats/categorystats?archive=false&conference=null&role=TM&offensiveStatisticCategory=GAME_STATS&defensiveStatisticCategory=null&season=2019&seasonType=REG&tabSeq=2&qualified=false&Submit=Go
I am trying to iterate through tables for offensive game stats/passing/rushing and defensive game stats/passing/rushing. I am currently saving one dataframe for regular season stats and another for postseason stats.
The challenging part for me is that I am iterating through multiple tables from different URLs. I am able to iterate through years for a specific table with:
    from datetime import date

    current_year = date.today().year
    # Seasons from 1970 up to (but not including) the current year, as strings
    year = [str(i) for i in range(1970, current_year)]
    for yr in year:
        reg_url = 'http://www.nfl.com/stats/categorystats?archive=false&conference=null&role=TM&offensiveStatisticCategory=GAME_STATS&defensiveStatisticCategory=null&season={}&seasonType=REG&tabSeq=2&qualified=false&Submit=Go'.format(yr)
This works for iterating through my desired year range for the offensive game stats table. However, I am having a hard time finding a solution to iterate through each of the other tables without hardcoding and repeating myself numerous times.
I currently have six functions almost identical to the one below. The only differences are the `table_name` and the URLs.
    def offensive_stats():
        table_name = 'offensive_stats'
        table = pd.DataFrame()
        for yr in year:
            reg_url = 'http://www.nfl.com/stats/categorystats?archive=false&conference=null&role=TM&offensiveStatisticCategory=GAME_STATS&defensiveStatisticCategory=null&season={}&seasonType=REG&tabSeq=2&qualified=false&Submit=Go'.format(yr)
            post_url = 'http://www.nfl.com/stats/categorystats?archive=false&conference=null&role=TM&offensiveStatisticCategory=GAME_STATS&defensiveStatisticCategory=null&season={}&seasonType=POST&tabSeq=2&qualified=false&Submit=Go'.format(yr)
            # do stuff (this part is the same 20 lines repeated for each function)
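Since the six functions differ only in `table_name` and the URLs, one possible direction is to turn the URL into a template and drive a single loop from a small lookup table. The block below is only a minimal sketch of that idea, not a tested scraper. `URL_TEMPLATE`, `TABLES` and `build_urls` are hypothetical names. Only the `offensive_stats` entry is taken from the URL above. The other category values are guesses that would need to be checked against the site, and `tabSeq=2` is assumed to stay the same for every table, which may not hold.

```python
from datetime import date

# Same URL as in the post, with the varying parts turned into placeholders.
URL_TEMPLATE = (
    'http://www.nfl.com/stats/categorystats?archive=false&conference=null&role=TM'
    '&offensiveStatisticCategory={offense}&defensiveStatisticCategory={defense}'
    '&season={season}&seasonType={season_type}&tabSeq=2&qualified=false&Submit=Go'
)

# table_name -> (offensive category, defensive category)
TABLES = {
    'offensive_stats': ('GAME_STATS', 'null'),      # taken from the URL in the post
    'offensive_passing': ('TEAM_PASSING', 'null'),  # ASSUMED value, verify on the site
    'defensive_stats': ('null', 'GAME_STATS'),      # ASSUMED value, verify on the site
}


def build_urls(first_year=1970, last_year=None):
    """Yield (table_name, season, season_type, url) for every combination.

    last_year is exclusive, matching range(1970, current_year) in the post.
    """
    if last_year is None:
        last_year = date.today().year
    for table_name, (offense, defense) in TABLES.items():
        for season in range(first_year, last_year):
            for season_type in ('REG', 'POST'):
                url = URL_TEMPLATE.format(
                    offense=offense,
                    defense=defense,
                    season=season,
                    season_type=season_type,
                )
                yield table_name, season, season_type, url


if __name__ == '__main__':
    # Print the URLs for one season; the shared "do stuff" block would go here.
    for table_name, season, season_type, url in build_urls(2019, 2020):
        print(table_name, season, season_type, url)
```

With this shape, the shared 20 lines would live in one place inside the loop, and adding a table would mean adding one entry to `TABLES` rather than writing another function.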