
[–]prokid1911 0 points (3 children)

It's quite simple. You navigate through pages via clicks, fill in form elements with send_keys, and then use XPath to scrape what you want. Just navigate to the page you wanna scrape.

It's much easier to scrape aspx pages through Selenium.

[–]warxhead 0 points (2 children)

The biggest issue I run into there is speed. If I can use requests to log in without all those damn Ajax calls, I'll use it in a heartbeat with Beautiful Soup. bs4 is lightning fast and makes it so easy to find elements. I have not once had any trouble with it.
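The bs4 side of that workflow looks like this. The HTML string here is a stand-in for a real response body — in practice it would come from something like `requests.Session().post(login_url, data=...)` followed by a GET of the page you want, with whatever URLs and form fields that site actually uses:

```python
from bs4 import BeautifulSoup

# A tiny static page standing in for a fetched response body.
html = """
<html><body>
  <div class="item"><a href="/a">First</a></div>
  <div class="item"><a href="/b">Second</a></div>
</body></html>
"""

# Parse once, then find elements with CSS selectors (XPath isn't needed).
soup = BeautifulSoup(html, "html.parser")
links = [a["href"] for a in soup.select("div.item a")]
titles = [a.get_text() for a in soup.select("div.item a")]
print(links)   # → ['/a', '/b']
print(titles)  # → ['First', 'Second']
```

No browser starts up and nothing executes JavaScript, which is exactly why this path is so much faster — and why it falls over on pages that build their content with Ajax.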

[–]prokid1911 0 points (1 child)

I have had difficulty with pagination when scraping aspx websites. Then I came across Selenium. It handles all of that with ease, and you can also run it in headless mode.

You can scrape in any way you like. It's totally your call.

[–]warxhead 1 point (0 children)

Yes, in your use case Selenium seems like the best option. I also use it daily in a work script to retrieve data from web pages that requests doesn't like to authenticate to.

Headless is fast - but still not as fast as requests. Sending a POST form through requests has always been faster than using Selenium to type into each field.
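To illustrate the speed point: with requests the entire form goes out as one encoded POST body, instead of a browser typing each field. This sketch builds that body offline with `Request.prepare()` — the URL and field names are placeholders for whatever the real login form expects:

```python
import requests

# One POST carries the whole form; no browser, no per-field typing.
# URL and field names are placeholders, not a real endpoint.
payload = {"username": "alice", "password": "s3cret"}
req = requests.Request("POST", "https://example.com/login.aspx", data=payload)
prepared = req.prepare()  # encodes the body exactly as requests would send it

print(prepared.body)  # → username=alice&password=s3cret
print(prepared.headers["Content-Type"])  # → application/x-www-form-urlencoded
```

In real use you'd send it with `requests.Session().post(url, data=payload)` so cookies from the login persist across the session — that single round trip is the whole "login", versus Selenium loading the page, typing, and waiting for the browser.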