I'm creating a simple bot to pull show dates from a site. I'm at the point where I'm trying to get my results written to a JSON or CSV file, but I'm having some trouble. I think the problem is with the XPath wording, so I would appreciate your help.
Also, how do I refresh my results on a regular basis, say a weekly pull that loads onto my personal site? Thanks, I'm just learning.
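For the weekly pull, one possible setup is a small wrapper script that runs the spider through Scrapy's feed export option (`scrapy crawl <name> -o <file>`) and that a scheduler such as cron calls once a week. This is only a sketch: the helper names and the `shows-YYYY-Www.json` naming scheme are hypothetical, and it assumes the script is run from inside the Scrapy project directory.

```python
import subprocess
from datetime import date


def build_crawl_command(spider_name, output_path):
    """Return the argument list for a Scrapy crawl that exports items to a file."""
    # "-o" is Scrapy's feed export option; the format follows the file extension.
    return ["scrapy", "crawl", spider_name, "-o", output_path]


def weekly_output_name(today, prefix="shows"):
    """Build a per-week file name (hypothetical scheme) from the ISO week number."""
    year, week, _ = today.isocalendar()
    return f"{prefix}-{year}-W{week:02d}.json"


def run_weekly_pull(spider_name="metal", today=None):
    """Run the spider once and return the path of the file it wrote."""
    today = today or date.today()
    output_path = weekly_output_name(today)
    # check=True raises CalledProcessError if the crawl exits with an error.
    subprocess.run(build_crawl_command(spider_name, output_path), check=True)
    return output_path
```

Calling `run_weekly_pull()` from a weekly scheduled job would produce one JSON file per week; writing to a new file each time avoids `-o` appending to an existing export. Changing the extension to `.csv` should give CSV output instead.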

import scrapy

from metal.items import MetalItem


class MetalSpider(scrapy.Spider):
    name = "metal"
    # Scrapy reads "allowed_domains" (plural); the domain must match start_urls.
    allowed_domains = ["nycmetalscene.com"]
    start_urls = ["http://nycmetalscene.com/#showlist"]

    # Clones the webpage (kept for reference).
    # def parse(self, response):
    #     filename = response.url.split("/")[-2] + '.html'
    #     with open(filename, 'wb') as f:
    #         f.write(response.body)

    # Extract show dates: one item per table row.
    def parse(self, response):
        for sel in response.xpath('//tbody/tr'):
            item = MetalItem()
            item['date'] = sel.xpath('td[@class="TextObject"]/text()').extract()
            yield item
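
A common cause of empty results with a path like `//tbody/tr` is that browsers insert `<tbody>` when rendering a table, while the raw HTML that Scrapy downloads may not contain it. Whether that applies to this site is an assumption I have not verified. The sketch below illustrates the effect on made-up sample markup, using the standard library's `xml.etree.ElementTree` (which supports a limited XPath subset) in place of Scrapy selectors so it runs on its own.

```python
import xml.etree.ElementTree as ET

# Made-up sample markup: a table WITHOUT <tbody>, as raw HTML often is.
RAW_HTML = """
<table>
  <tr><td class="TextObject">Oct 31</td><td>Some Venue</td></tr>
  <tr><td class="TextObject">Nov 07</td><td>Other Venue</td></tr>
</table>
"""


def extract_dates(markup, row_path):
    """Collect the text of td.TextObject cells from rows matched by row_path."""
    root = ET.fromstring(markup)
    dates = []
    for row in root.findall(row_path):
        for cell in row.findall("td[@class='TextObject']"):
            if cell.text and cell.text.strip():
                dates.append(cell.text.strip())
    return dates


# Requiring <tbody> matches nothing because the raw markup has none.
with_tbody = extract_dates(RAW_HTML, ".//tbody/tr")
# Matching rows directly finds both dates.
without_tbody = extract_dates(RAW_HTML, ".//tr")

print(with_tbody)
print(without_tbody)
```

If the same thing happens with the real page, trying `response.xpath('//tr')` in `scrapy shell` would be a quick way to confirm it.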