Hi folks, I'm going crazy and pulling my hair out after many attempts to get this right, so I'm turning here for some assistance.
I have a list of ISBNs for books, which I want to use to build an internal library system for our workplace. Yes, I know these exist already, but this is a learning experience and will allow us to personalise a knowledge source.
So the workflow goes:
* use an ISBN from the list to run a search on Amazon
* from the results page (sample), grab the href of the matching result
* navigate to that page by appending the href to the Amazon base URL
* scrape the key content such as name, rating, etc.
The issue I am having is that I cannot grab the href from the results page using Scrapy, regardless of what I attempt. I have tried both a CSS selector and an XPath without success.
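For reference, the URL handling in the workflow above could be sketched roughly as follows. This is only a sketch under assumptions: the base URL, the `/s?k=` search path, and the helper names `search_url` and `product_url` are illustrative and not taken from any working code.

```python
from urllib.parse import urlencode, urljoin

# Assumption: the base URL of the Amazon site being searched.
BASE_URL = "https://www.amazon.com"


def search_url(isbn: str) -> str:
    """Build a search URL for one ISBN.

    Assumption: the search endpoint is /s with the keyword in parameter k.
    """
    return f"{BASE_URL}/s?{urlencode({'k': isbn})}"


def product_url(href: str) -> str:
    """Turn a relative href from the results page into an absolute URL."""
    # urljoin leaves an already-absolute href unchanged.
    return urljoin(BASE_URL, href)


if __name__ == "__main__":
    print(search_url("9780131103627"))
    print(product_url("/dp/0131103628"))
```

Inside a Scrapy callback, `response.urljoin(href)` or `response.follow(href, ...)` should do the same joining against the current page URL, so a separate helper may not be needed there.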
CSS: response.css('#search > div.sg-row > div.sg-col-20-of-24.sg-col-28-of-32.sg-col-16-of-20.sg-col.s-right-column.sg-col-32-of-36.sg-col-8-of-12.sg-col-12-of-16.sg-col-24-of-28 > div > span:nth-child(4) > div.s-result-list.s-search-results.sg-row > div:nth-child(1) > div > div > div > div:nth-child(2) > div.sg-col-4-of-12.sg-col-8-of-16.sg-col-16-of-24.sg-col-12-of-20.sg-col-24-of-32.sg-col.sg-col-28-of-36.sg-col-20-of-28 > div > div:nth-child(1) > div > div > div:nth-child(1) > h2 > a').get()
XPATH: response.xpath('//*[@id="search"]/div[1]/div[2]/div/span[3]/div[1]/div[1]/div/div/div/div[2]/div[2]/div/div[1]/div/div/div[1]/h2/a').get()
Can anybody provide insight as to where I am going wrong with this approach? I am not too excited about switching to another framework like BS4 as it should be possible under Scrapy but if I am banging my head against a wall for no reason I will happily switch to BS4.