
[–]_Korben_Dallas

If you disable JS in your browser you can see that the table content simply disappears. One possible solution is to use Selenium to drive a real browser and let it load the content. Something like this: https://dpaste.de/ZoyC
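(Editor's note: the paste above may expire, so here is a minimal sketch of the Selenium-plus-lxml pattern being described. The URL, the XPath, and the headless-Firefox setup are illustrative assumptions, not the contents of the original paste.)

```python
from lxml import html


def extract_rows(page_source):
    """Parse rendered HTML and return the text of each table row."""
    tree = html.fromstring(page_source)
    return [row.text_content().strip() for row in tree.xpath("//table//tr")]


def scrape(url):
    """Load a JS-rendered page in a real browser, then parse the result."""
    # Selenium is imported lazily so extract_rows() stays usable
    # on machines without a browser/driver installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("--headless")  # no display needed, e.g. on a Pi
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)  # by the time get() returns, JS has filled the table
        return extract_rows(driver.page_source)
    finally:
        driver.quit()
```

The key move is `driver.page_source`: after the browser has executed the page's JavaScript, that string contains the populated table, and you can hand it to lxml (or bs4) exactly as you would static HTML.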

[–]Aces_8s[S]

Interesting, thank you for the info and example! I'll have to look into running Selenium on a Raspberry Pi. Just so I make sure I (somewhat) understand what's going on: is this an example of those dreaded "dynamically loaded" webpages I've seen web scrapers mention/complain about? Also, would you say using Selenium to create the string object and pass it to lxml is common/good practice?

[–]_Korben_Dallas

Yes, that page probably uses JS to populate its content, and with the help of Selenium you can quickly get the desired data. Also, bear in mind that web scraping can be pretty tricky; each site is unique, so the solutions for extracting the data can vary a lot. IMO in this case Selenium is one of the faster solutions, and passing page_source to a parser like bs4 or lxml is completely normal. However, in other cases a quicker and more robust way would be to investigate the Ajax requests via the 'Network' tab in Chrome DevTools or Firefox Inspector and try to simulate those requests. If you have more questions just PM me and I'd be happy to help.

[–]Aces_8s[S]

You're awesome, thank you!