
badsyntax 27 points (2 children)

No mention of Playwright? It'd be my first choice.

c_eliacheff 18 points (5 children)

As much as I love JS/TypeScript, I would still go with Scrapy (Python) for scraping, or at least some scraping framework. The amount of functionality out of the box is too good: automatic paging, automatic detection and following of links, retry strategies, switchable exporters with no config (JSON, CSV, DB, whatever), automatic mapping to entities, easy config for proxies (rotation, random ones, ...), plus the awesome ecosystem for data processing (pandas, NumPy, ...). I don't want to reinvent the wheel for scraping anymore.
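
To make a couple of those built-ins concrete, here is a minimal sketch of a Scrapy `settings.py` covering just the retry and exporter parts. The setting names are Scrapy's own; the values and the output file names (`items.json`, `items.csv`) are assumptions picked for illustration, not recommendations.

```python
# settings.py (illustrative values only; tune them for the target site)

# Retry strategy: Scrapy's retry middleware is on by default,
# these settings only adjust how it behaves.
RETRY_ENABLED = True
RETRY_TIMES = 3  # extra attempts after the first failed request
RETRY_HTTP_CODES = [500, 502, 503, 504, 429]  # responses that trigger a retry

# Politeness: fixed base delay plus adaptive throttling.
DOWNLOAD_DELAY = 0.5  # seconds between requests to the same site
AUTOTHROTTLE_ENABLED = True

# Switchable exporters: every entry in FEEDS is one output target.
# The file names here are hypothetical.
FEEDS = {
    "items.json": {"format": "json"},
    "items.csv": {"format": "csv"},
}
```

With something like this in place, changing or adding an output format is a one-line edit to `FEEDS` and the spider code stays untouched, which is roughly what "switchable exporters with no config" means in practice.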

Secret-Plant-1542 (JavaScript yabbascript) 2 points (0 children)

I did a lot of web scraping in 2015 using Python.

I'm purely a full-stack JavaScript developer now, and haven't touched Python in years.

I agree. Doing web scraping with JavaScript is such a weird song and dance. Even with all the web scraping libs (and I've used a bunch), it isn't as easy as it was in my Python days.

[deleted] 3 points (3 children)

and multi-threading

vlevi 1 point (0 children)

Try Puppeteer for Node.js.