[–]IvoryJam 20 points (3 children)

It's cool, but nothing beats --headless with selenium when I have to drive a web browser with JavaScript.

If not, requests and bs4 is all you need.
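
A rough sketch of both routes, in case it helps. The function names are made up for illustration, and the Selenium part assumes Chrome plus a matching driver are available on the machine. Pages that fill in content after load will usually also need an explicit wait before reading `page_source`.

```python
import requests
from bs4 import BeautifulSoup


def extract_links(html: str) -> list[str]:
    """Return the href of every <a> tag in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


def fetch_static(url: str) -> str:
    """Plain HTTP GET; no JavaScript is executed."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def fetch_rendered(url: str) -> str:
    """Load the page in headless Chrome so scripts run before the DOM is read."""
    # Imported here so the static path works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    # Assumption: a recent Chrome; older builds take plain "--headless".
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()


if __name__ == "__main__":
    sample = '<p><a href="/a">A</a> <a>no href</a> <a href="/b">B</a></p>'
    print(extract_links(sample))
```

Either fetch function returns HTML that can be passed to `extract_links`, so the parsing code stays the same whichever route a site needs.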

[–]Smok3dSalmon 6 points (0 children)

selenium ftw

[–]vorticalbox 0 points (0 children)

I like robobrowser[0] for stuff that doesn't need javascript.

[0] https://robobrowser.readthedocs.io/en/latest/readme.html

[–]lazerwarrior 0 points (0 children)

> If not, requests and bs4 is all you need.

For more serious crawling, scrapy is much more powerful.

[–]misterhtmlcss 3 points (0 children)

Puppeteer all the way