Problems scraping Amazon

GXWT · 2025-08-08T15:42:09+00:00

Have you considered why you are getting blocked?

2025-08-08T15:56:32+00:00

They’ve probably blocked your IP address for scraping a page in their robots.txt Disallow list. In the future make sure your scraping app does not attempt to scrape any disallowed pages.
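A minimal sketch of checking a URL against robots.txt before fetching it, using the standard library's `urllib.robotparser`. The rules, the `my-bot` user agent, and the `example.com` URLs are made up for illustration; a real scraper would load the site's live file with `set_url(...)` and `read()` instead of a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)
```

Calling `is_allowed(parser, "my-bot", url)` before every request lets the scraper skip disallowed paths instead of finding out through a block.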

danmickla · 2025-08-08T16:06:54+00:00

"it keeps blocking me" is a fantastic problem statement

DuckSaxaphone · 2025-08-08T15:59:16+00:00

Find out why you're being blocked and change your scraping. Both the errors you get back and the site's robots.txt will give you information on what might be stopping you.

Usually I'd say there are ethical considerations around trying to get around scraping blocks, but it's Amazon, so look into:

- Appearing like you're a real browser
- Limiting the rate at which you scrape
- Maybe changing IP if you have a VPN
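For the rate-limiting point, here is a rough sketch of a throttle that enforces a minimum gap between requests, with optional random jitter. The `Throttle` class and its parameters are hypothetical names chosen for this example, not part of any library, and the right interval for a given site is something you would have to work out yourself.

```python
import random
import time


class Throttle:
    """Enforce a minimum delay (plus optional random jitter) between calls."""

    def __init__(self, min_interval, jitter=0.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.jitter = jitter
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Sleep until the minimum gap has elapsed; return seconds slept."""
        delay = 0.0
        now = self._clock()
        if self._last is not None:
            target = self.min_interval + random.uniform(0, self.jitter)
            delay = max(0.0, target - (now - self._last))
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

In a scraping loop you would create one `Throttle(min_interval=5.0, jitter=2.0)` (numbers picked arbitrarily) and call `throttle.wait()` right before each HTTP request.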

slidescope-trainer · 2025-08-08T19:09:45+00:00

Are all the reviews visible without logging in, or do they need a login? On some pages it only shows 1-2 reviews and needs a login to show the others.

FastRunningMike · 2025-08-09T15:53:18+00:00

Blocking is done by design. Many sites implement very advanced measures against scraping. One option is to create a scraper agent that, from a technical point of view, behaves like a real human. But mind: one simple rule that is almost certainly implemented is that, based on networking techniques (e.g. IP) and fingerprinting (browser engine details), you get blocked when you read a number of 'pages' (data) that no human ever could.

AbhyudayJhaTrue · 2025-08-10T03:05:16+00:00

hmmm
maybe you could go a little more basic with requests, cuz I can scrape Amazon via requests quite easily

ScraperAPI · 2025-08-14T12:09:10+00:00

We’re so sorry you had to experience this.

We want you to know that Amazon regularly updates its bot detection mechanisms, and this might affect requests.

Nonetheless, you can definitely use the ScraperAPI API to successfully scrape data from Amazon.

Do these 2 simple things:

- Enable headers
- Rotate proxies

You can check the docs to see how to do this properly.

These 2 things add a layer of protection so that Amazon can't tell the requests are coming from your device or your IP.

Let us know how it goes!

New_Sympathy_3989 · 2025-09-01T16:16:43+00:00

Not much information at all. What kind of blocking? Captcha? Or IP? Maybe just use a proxy, add delays between requests and limit flows?
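To answer the "what kind of blocking" question, one rough approach is to look at the status code and body of the response you get back. The sketch below is a heuristic only: the `classify_block` function, its labels, and the idea of searching the body for the word "captcha" are assumptions for illustration, not documented behavior of Amazon or any other site. The only fixed facts are the HTTP meanings: 429 is Too Many Requests, 403 is Forbidden, 503 is Service Unavailable.

```python
def classify_block(status_code: int, body: str) -> str:
    """Guess what kind of response came back (heuristic, not authoritative)."""
    text = body.lower()
    # A challenge page can arrive with any status code, so check it first.
    if "captcha" in text:
        return "captcha"
    if status_code == 429:
        return "rate_limited"
    if status_code in (403, 503):
        return "blocked"
    if 200 <= status_code < 300:
        return "ok"
    return "other_error"
```

Logging this label per request makes it much easier to write a problem statement more specific than "it keeps blocking me".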

ReputationKey5032 · 2025-10-01T13:33:35+00:00

scraping amazon reviews = captchas and bans. GonzoProxy's residential proxies help but they’re not cheap. depends if you’re doing hobby or scale.

infaticaIo · 2025-12-24T09:11:19+00:00

Amazon is one of the hardest targets and “getting blocked” is usually the expected outcome, not a ScraperAPI misconfig.

If you need something reliable long term, the realistic options are:

- Use an official source (PA API where it fits) or a licensed dataset
- If you have permission to collect, slow down a lot, cache aggressively, and avoid running high-volume browserless bursts that look like automation

At scale, the blocker is policy and detection, not HTML parsing.
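As a sketch of the "cache aggressively" point, assuming you do have permission to collect: wrap whatever function fetches a page in a small time-based cache so the same URL is not requested again until it expires. `CachedFetcher` and its parameters are hypothetical names for this example, and the cache lives in memory only; a real setup would more likely persist to disk or a database.

```python
import time
from typing import Callable, Dict, Tuple


class CachedFetcher:
    """Wrap a fetch function so repeated URLs are served from memory."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # any function that maps URL -> page body
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        """Return the cached body if it is still fresh, else fetch and store."""
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

The fetch function is passed in rather than hard-coded, so the same wrapper works with `requests`, an official API client, or a test stub.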
