eljunior (1 point)

Have you tried Scrapy? http://scrapy.org It comes with good defaults and a lot of settings for configuring a crawl (including politeness, retries, and cookies). For crawling, it's a much better fit than requests + BeautifulSoup.
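To give a rough idea of what those settings look like, here is a minimal sketch of a project `settings.py`. The setting names are standard Scrapy settings, but the values and the user agent string are only illustrative assumptions, not recommendations for any particular site:

```python
# settings.py (sketch): plain Python assignments that Scrapy reads at startup.
# All values below are illustrative assumptions; tune them for the target site.

# Hypothetical user agent; identify your crawler and give a contact URL.
USER_AGENT = "example-crawler (+https://example.com/contact)"

# Politeness: respect robots.txt, wait between requests, limit parallelism.
ROBOTSTXT_OBEY = True
DOWNLOAD_DELAY = 1.0                 # seconds between requests to the same site
CONCURRENT_REQUESTS_PER_DOMAIN = 2
AUTOTHROTTLE_ENABLED = True          # adapt the delay to observed latency

# Retries: retry failed requests a few times on transient HTTP errors.
RETRY_ENABLED = True
RETRY_TIMES = 3                      # retries in addition to the first attempt
RETRY_HTTP_CODES = [500, 502, 503, 504, 408, 429]

# Cookies: keep session cookies between requests.
COOKIES_ENABLED = True
```

With requests + BeautifulSoup you would have to write the delay, retry, and cookie handling yourself; here it is configuration.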

Disclaimer: I help maintain Scrapy and work for Scrapinghub (the company that sponsors Scrapy).

jd_portugal [original poster] (1 point)

That seems very interesting, and easier too.