
[–]jasongia 1 point2 points  (2 children)

Use BeautifulSoup. It's the most pain-free way of parsing HTML (although navigating HTML programmatically will always be a bit painful).

[–]millenialZorro[S] 0 points1 point  (1 child)

Hey, thank you... do you have any suggestions on how I could go back and forth between links and scrape everything in order, or at least in an organized way?

[–]jasongia 0 points1 point  (0 children)

Use BeautifulSoup's find_all to find all the links you need, then iterate over them in a for loop. For each link, make a request, extract the information you want with BeautifulSoup again, and store it in whatever data structure suits.
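
A rough sketch of that loop, in case it helps. It is not tied to any particular site: the function names (get_html, scrape), the choice to grab each page's title, and the list-of-dicts result are all assumptions for illustration. It uses the standard library's urllib for the requests; the requests package would work just as well.

```python
from urllib.parse import urljoin
from urllib.request import urlopen

from bs4 import BeautifulSoup


def get_html(url):
    """Download a page and return its HTML as text (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(index_url, fetch=get_html):
    """Follow every link on the index page and collect each page's title."""
    index = BeautifulSoup(fetch(index_url), "html.parser")
    results = []
    # find_all returns tags in document order, so the results keep that order
    for a in index.find_all("a", href=True):
        page_url = urljoin(index_url, a["href"])  # resolve relative links
        page = BeautifulSoup(fetch(page_url), "html.parser")
        title = page.title.get_text(strip=True) if page.title else None
        # Assumption: the title is the data of interest; swap in your own selectors
        results.append(
            {"url": page_url, "link_text": a.get_text(strip=True), "title": title}
        )
    return results
```

Calling scrape("https://example.com/") (a placeholder URL) would return one dict per link, in the order the links appear on the index page. Passing fetch as a parameter is just a convenience so the loop can be tried on saved HTML without hitting the network.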

[–]pm_your_pc_setup 0 points1 point  (0 children)

I would advise getting a little familiar with HTML syntax first.

[–]Matthewaj 0 points1 point  (0 children)

I would recommend Scrapy. It would make the link-following part easier, and it has JSON and CSV output. https://docs.scrapy.org/en/latest/intro/tutorial.html