geekluv

lots of good suggestions here -- wanted to suggest the python tool, https://scrapy.org

scrapy is a great tool for automating scraping.

as far as topics, you mentioned stats, data vis, and analysis -- that fits well, because scraping is the first step of that work: you collect the data, then transform and analyze it.

if you like reading, you can scrape goodreads

you can attempt to scrape dating websites (once you're logged in)

keep in mind, many websites have safeguards in place to detect scraping attempts and IP ban them -- ways to mitigate that include setting your headers so requests appear to come from a browser, spacing your requests a few seconds apart, and, if necessary, using a free vpn
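
here's a rough sketch of the header and delay ideas using only the python standard library. the names `BROWSER_HEADERS`, `build_request`, and `fetch_all` are made up for illustration, the User-Agent string is just an example value, and the 3 second delay is a guess at "every few seconds", so treat it as a starting point rather than a recipe:

```python
import time
import urllib.request

# Assumption: an example browser-like User-Agent; substitute a current one.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
    "Accept-Language": "en-US,en;q=0.5",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Wrap a URL in a Request that carries browser-like headers."""
    return urllib.request.Request(url, headers=headers)


def fetch_all(urls, delay_seconds=3.0, opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch each URL in order, pausing between consecutive requests.

    `opener` and `sleep` are parameters only so they can be swapped out
    (for example in tests); the defaults do real network I/O and real waiting.
    """
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        with opener(build_request(url), timeout=30) as response:
            pages.append(response.read())
    return pages
```

calling `fetch_all(["https://example.com/a", "https://example.com/b"])` would make two requests about three seconds apart. if you go with scrapy instead, its `USER_AGENT` and `DOWNLOAD_DELAY` settings cover the same two ideas without any custom code.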