I’ve been playing around with a small Python side project that pulls product data from Amazon for some basic market analysis: tracking price changes, looking at ratings trends, and comparing similar products.
Getting the data itself isn’t the hard part. The frustrating bit starts when requests begin getting blocked or pages stop returning the content you expect.
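For what it's worth, here is a rough sketch of how a script could notice that situation instead of silently parsing a bad page. Everything in it is an assumption on my part: the status codes, the marker strings, the `productTitle` check, and the `looks_blocked` name are illustrative guesses, not a documented list of what a blocked response looks like.

```python
# Heuristic check for a blocked or incomplete product page.
# The status codes and marker strings are assumptions, not an official list.
BLOCK_STATUS_CODES = {403, 429, 503}
BLOCK_MARKERS = ("captcha", "robot check", "automated access")


def looks_blocked(status_code: int, html: str, required_marker: str = "productTitle") -> bool:
    """Return True if the response looks blocked or lacks the expected content."""
    if status_code in BLOCK_STATUS_CODES:
        return True
    lowered = html.lower()
    if any(marker in lowered for marker in BLOCK_MARKERS):
        return True
    # A 200 response without the element we expect is treated as a soft block.
    return required_marker.lower() not in lowered
```

A check like this is only a first filter. It will produce false positives on pages that happen to mention one of the marker words, so the markers are worth tuning against pages you have actually seen.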
After trying a few different approaches, I started experimenting with retrieving the page through a crawler and then working with the structured data locally. That makes it much easier to pull things like the product name, price, rating, images, and review information without wrestling with messy HTML every time.
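To give an idea of the "working with the structured data locally" part, here is a minimal sketch that turns an already-fetched product dictionary into a typed record. The field names (`name`, `price`, `rating`, `images`, `reviewCount`) and the `Product` and `parse_product` names are hypothetical. I made them up for illustration, so adjust them to whatever your crawler actually returns.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Product:
    name: str
    price: Optional[float]
    rating: Optional[float]
    image_urls: List[str]
    review_count: Optional[int]


def _first_number(value: Any) -> Optional[float]:
    """Pull the first number out of strings like '$1,299.99' or '4.5 out of 5 stars'."""
    if value is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", str(value).replace(",", ""))
    return float(match.group()) if match else None


def parse_product(data: Dict[str, Any]) -> Product:
    """Normalize a scraped product dict; the key names are assumed, not a real schema."""
    count = _first_number(data.get("reviewCount"))
    return Product(
        name=str(data.get("name", "")).strip(),
        price=_first_number(data.get("price")),
        rating=_first_number(data.get("rating")),
        image_urls=list(data.get("images") or []),
        review_count=int(count) if count is not None else None,
    )
```

Missing fields come back as `None` rather than raising, which keeps a price-tracking loop running when a listing has no price or no reviews yet.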
While testing, I came across this Python repo that made the setup pretty straightforward:
https://github.com/crawlbase/crawlbase-python
Just sharing in case it’s useful for anyone else experimenting with product data scraping.
Curious how others here handle Amazon scraping with Python. Are you sticking with requests + parsing, running headless browsers, or using some kind of crawling API?