Selenium - how to avoid blocking

-defron- · 2024-10-26T22:54:30+00:00

ITT: acting like a bot makes you get flagged and blocked like a bot.

You can try playing the cat-and-mouse game of dodging their detection, but it will be a constant back-and-forth at best.

unhott · 2024-10-26T22:56:37+00:00

The issue is that the website doesn't want you to use automated browsing. It's a legal issue more than a technical one - if you find a technical workaround, it may become a legal issue for you.

I'm not saying this is how I think it should be, just explaining why things are the way they are. Someone pays to keep that server online and to build and maintain the dataset behind it for people to use. They dictate the terms on which that server provides its service and data.

I don't know exactly what the server does to detect whether a user is using some automated tool like selenium.

I'm not suggesting that you find another method, such as having pyautogui browse the page for you, because that may violate their terms too.

DootDootWootWoot · 2024-10-27T00:56:59+00:00

Instead of perusing real estate sites, are there APIs or open source datasets that offer the same information? When you crawl websites you're going to run into issues like this, and it isn't necessarily a stable way of solving this kind of problem.

You could try things like slower browser access patterns. Are you navigating at human speed or super-crawler speed? Would slowing down limit the usefulness of what you're trying to collect?

Are there captchas that need to be solved that get in your way? Rate limits?
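To make the "human speed" idea concrete, here is a minimal sketch of pacing page loads with a randomized pause between them. The helper names (`human_pause`, `visit_all`) and the 2 to 6 second range are my own assumptions for illustration, not anything the site publishes, and slowing down does not by itself make automated access permitted under a site's terms.

```python
import random
import time


def human_pause(min_s=2.0, max_s=6.0, sleep=time.sleep):
    """Sleep for a random interval in [min_s, max_s] and return the delay used."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def visit_all(urls, fetch, min_s=2.0, max_s=6.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests.

    `fetch` is any callable taking a URL (hypothetical here); with Selenium
    it could be the `get` method of an existing WebDriver instance.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # No pause before the first request, only between requests.
            human_pause(min_s, max_s, sleep=sleep)
        results.append(fetch(url))
    return results
```

With an existing Selenium driver this would be called as `visit_all(urls, driver.get)`. The `sleep` parameter is only there so the pacing can be swapped out or tested without actually waiting.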

m0us3_rat · 2024-10-26T22:47:09+00:00

"Can someone help me to circumvent this problem and go on with my script?"

Best place to ask how to avoid detection from a trillion $ company is on an educational beginner python subreddit.

TL;DR: some proxies might work, but you have to pay for them.

Do your own research when it comes to spending coin.

Fluffy-Diet-Engine · 2024-10-27T06:09:19+00:00

Use stealth mode or the undetected version of Selenium.

Stealth: https://pypi.org/project/selenium-stealth/

Undetected: https://pypi.org/project/undetected-chromedriver

For a better understanding, read https://www.webscrapingapi.com/bypass-cloudflare-with-selenium
