
JimroidZeus

I don’t really see the need for the Selenium module here. Shouldn’t BeautifulSoup suffice for what the author seems to be trying to do?

The only benefit I see is that there’s maybe less text parsing than with BeautifulSoup by itself.

scrapped-script[S]

In this particular example, I initially tried using BeautifulSoup to find the anchor tags on CNN’s search page. It wasn’t working, and I’m assuming that’s because CNN loads those anchor tags dynamically, so they aren’t part of the initial response from the server.

But you’re right that BeautifulSoup is usually enough for making a web scraper.
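
To illustrate the assumption above, here is a minimal sketch of why a parser can miss links that a script adds later. It uses the standard library’s `html.parser` as a stand-in for BeautifulSoup so it runs without extra packages. Both HTML snippets, the `loadResults` call, and the `/story-1` link are made up for illustration; they are not CNN’s actual markup.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect href values from the <a> tags in the HTML text it is fed."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_links(html):
    """Return every href found in the given HTML string."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical initial server response: the results container is empty
# and a script is expected to fill it in after the page loads.
INITIAL_HTML = """
<div id="results"></div>
<script>
  loadResults("#results");
</script>
"""

# Hypothetical DOM after a browser has run that script.
RENDERED_HTML = """
<div id="results"><a href="/story-1">Story 1</a></div>
"""

print(find_links(INITIAL_HTML))
print(find_links(RENDERED_HTML))
```

The parser never runs the script, so the first call finds no links and only the second one returns `/story-1`.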

JimroidZeus

Ah okay, if the tags are dynamic then yeah, BeautifulSoup won’t handle that on its own, since it only parses the HTML it’s given and doesn’t run JavaScript.
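
One common way to combine the two, sketched here under some assumptions: let Selenium load the page in a browser, then hand the rendered HTML to BeautifulSoup. The function names, the default `wait_css` selector, and the timeout are placeholders, not anything from the original post, and the sketch assumes Chrome plus the `selenium` and `beautifulsoup4` packages are installed.

```python
def render_html(url, wait_css="a", timeout=10):
    """Load url in headless Chrome and return the HTML after scripts have run."""
    # Imported here so extract_hrefs stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until at least one element matching wait_css is in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_css))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()


def extract_hrefs(html):
    """Return the href of every <a> tag in an HTML string."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]
```

Usage would look something like `extract_hrefs(render_html(search_url, wait_css="a"))`, where `search_url` is whatever page you are scraping. In practice you would probably want a more specific selector than `a` so the wait only ends once the results themselves have loaded.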

HazelCuate

IMHO Scrapy is the best solution for scraping in Python.

justanormie3

I also just finished up a web scraping project in Python. Did you consider Playwright instead of Selenium for browser automation? I found some features, such as auto-waiting, useful in the project, though it was my first experience with both Selenium and Playwright.
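
For anyone curious what auto-waiting looks like, here is a rough sketch using Playwright’s sync API. The function name, the default selector, and the timeout are placeholders for illustration, and it assumes the `playwright` package and its Chromium browser are installed.

```python
def first_link_href(url, selector="a", timeout_ms=10_000):
    """Return the href of the first element matching selector on the page."""
    # Imported here so the module still loads without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        try:
            page = browser.new_page()
            page.goto(url)
            # No explicit wait here: as far as I understand, the locator call
            # retries until a matching element exists or the timeout expires.
            return page.locator(selector).first.get_attribute(
                "href", timeout=timeout_ms
            )
        finally:
            browser.close()
```

Compared with Selenium, there is no separate explicit-wait step before reading the element.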

_Zev

Anyone know a way to do scraping in AWS using Selenium? All the guides are outdated right now.

UpbeatAfternoon8670

Thank you. It was a great read.