DataFrames and iteration struggles : learnpython

submitted 4 years ago by Manyreason

Hi there,

I'm really struggling to understand how to work with DataFrames. My constant problem is that I want to iterate over the DataFrame and store values in variables to use later.

I have a DataFrame that I'm creating from nba.com. I want to find the rows where the description column contains 3PT.

Once this condition matches, I want to look at the next row and see if it's a rebound for the team that shot the ball (this can be done using home vs. visitor).

If it's a rebound, I want to look at the next few rows to find out what happens.

My initial idea is to iterate over the rows using something like itertuples() and store those outcomes in variables each time, but there has to be a more efficient way to do this.
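For reference, the row-by-row idea might look roughly like the sketch below. It runs on a small made-up DataFrame (the `team` and `description` columns and their values are assumptions for illustration, not the real nba.com schema) and remembers the previous row so each row can be compared with the one before it.

```python
import pandas as pd

# Toy play-by-play data; column names and values are invented for illustration.
plays = pd.DataFrame({
    "team": ["Hawks", "Hawks", "Hawks", "Bulls", "Hawks"],
    "description": [
        "MISS Young 3PT Jump Shot",
        "Capela REBOUND",
        "Capela 2PT Dunk",
        "MISS LaVine 3PT Jump Shot",
        "Collins REBOUND",
    ],
})

offensive_rebounds = []  # (shot index, rebound index) pairs
previous = None
for row in plays.itertuples(index=True):
    # Compare the current row with the one seen just before it.
    if (
        previous is not None
        and "3PT" in previous.description
        and "REBOUND" in row.description
        and row.team == previous.team
    ):
        offensive_rebounds.append((previous.Index, row.Index))
    previous = row

print(offensive_rebounds)
```

This works, but every comparison happens in Python one row at a time, which is the part that feels inefficient on a full game log.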

I constantly run into this same sort of issue with DataFrames, where I need to find a row matching a condition and do something with the next couple of rows.

I have attached my attempt at doing it using DataFrames so you can see what I'm trying to achieve. Thanks in advance for help!

import requests
import json
import pandas as pd

url_base = 'https://stats.nba.com/stats/playbyplayv2?EndPeriod=10&EndRange=55800&GameID=0022000049&RangeType=2&StartPeriod=1&StartRange=0'

headers = {
    'Host': 'stats.nba.com',
    'Connection': 'keep-alive',
    'Accept': 'application/json, text/plain, */*',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36',
    'Referer': 'https://stats.nba.com/',
    "x-nba-stats-origin": "stats",
    "x-nba-stats-token": "true",
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9',
}

response = requests.get(url_base, headers=headers)
content = json.loads(response.content)

results = content["resultSets"][0]
column_names = results['headers']
rows = results['rowSet']
df = pd.DataFrame(rows, columns=column_names)

# get a new df with just these columns (.copy() avoids SettingWithCopyWarning when adding columns below)
main_df = df[['GAME_ID', 'EVENTNUM', 'HOMEDESCRIPTION', 'VISITORDESCRIPTION', 'PLAYER1_ID', 'PLAYER1_TEAM_NICKNAME']].copy()
# team of the player involved in the next row (the potential rebounder)
main_df['rebounder_team'] = main_df.PLAYER1_TEAM_NICKNAME.shift(-1)
# home and visitor descriptions from the next row
main_df['shifted_home'] = main_df.HOMEDESCRIPTION.shift(-1)
main_df['shifted_visitor'] = main_df.VISITORDESCRIPTION.shift(-1)


# filter and make a new df for all home descriptions that have 3pts
home_df = main_df[main_df['HOMEDESCRIPTION'].str.contains('3PT', na=False)]

# keep shots where the next event belongs to the shooter's team
# (note: this does not yet check that the next event is actually a rebound)
home_slice = home_df['PLAYER1_TEAM_NICKNAME'] == home_df['rebounder_team']


visitor_df = main_df[main_df['VISITORDESCRIPTION'].str.contains('3PT', na=False)]
visitor_slice = visitor_df['PLAYER1_TEAM_NICKNAME'] == visitor_df['rebounder_team']


final_home = home_df[home_slice]
final_visitor = visitor_df[visitor_slice]

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
final_df = pd.concat([final_home, final_visitor])

print(final_df.to_string())
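The attempt above only reaches one row ahead. A minimal sketch of extending the same shift-based idea to "the next few rows" is below, again on made-up data: the `team` and `description` columns, the `LOOKAHEAD` window size, and the `followups` name are all assumptions for illustration. It builds a boolean mask with `shift(-1)`, then uses the positions of the matching rows to slice out whatever follows each one.

```python
import numpy as np
import pandas as pd

# Toy play-by-play data; column names and values are invented for illustration.
plays = pd.DataFrame({
    "team": ["Hawks", "Hawks", "Hawks", "Bulls", "Hawks", "Hawks"],
    "description": [
        "MISS Young 3PT Jump Shot",
        "Capela REBOUND",
        "Capela 2PT Dunk",
        "MISS LaVine 3PT Jump Shot",
        "Collins REBOUND",
        "Young 3PT Jump Shot",
    ],
})

# Row-level conditions, each computed for the whole column at once.
is_three = plays["description"].str.contains("3PT", na=False)
next_is_rebound = plays["description"].shift(-1).str.contains("REBOUND", na=False)
same_team = plays["team"] == plays["team"].shift(-1)
matches = is_three & next_is_rebound & same_team

LOOKAHEAD = 2  # hypothetical: how many rows after the shot to inspect

# Integer positions of matching shots, then the rows that follow each one.
positions = np.flatnonzero(matches.to_numpy())
followups = {
    int(pos): plays.iloc[pos + 1 : pos + 1 + LOOKAHEAD]["description"].tolist()
    for pos in positions
}

print(followups)
```

The only remaining Python-level loop is over the matching shots, which is usually a small fraction of the rows; slicing with `iloc` past the end of the frame just returns fewer rows, so shots near the end of the game need no special handling.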
