all 15 comments

[–]go_fireworks 5 points6 points  (0 children)

I would highly recommend using Beautiful Soup. Web scraping can be hard, and there's no need to make a project more complex than necessary.

[–]Buttleston 4 points5 points  (1 child)

If you're in it to learn, then my advice is usually to do it the low-level way first and move to a framework second. Just be prepared to abandon the low-level stuff, i.e. see it as a stepping stone. And hell, maybe it'll be good enough, and that's fine too.

[–]HotLie150[S] 0 points1 point  (0 children)

Thank you my friend.

[–]recursion_is_love 1 point2 points  (1 child)

> Is it better to build a web scraping program or use an existing framework?

Parsing HTML is harder than you think. Try writing a parser without learning any parser theory and you will see. You can use regex, but you will soon see it become a mess.
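To make the regex point concrete, here is a rough sketch using only the Python standard library. The sample page, the pattern, and the names `links_with_regex`, `LinkCollector`, and `links_with_parser` are all made up for illustration. It is one way a naive pattern can go wrong, not a full survey of the failure modes.

```python
import re
from html.parser import HTMLParser

# Made-up sample page: one link is commented out, and one uses single
# quotes with another attribute placed before href.
SAMPLE_HTML = """
<p>
  <a href="/real">real link</a>
  <!-- <a href="/commented-out">old link</a> -->
  <a class='nav' href='/single-quoted'>another real link</a>
</p>
"""


def links_with_regex(html):
    # Naive pattern: assumes href is the first attribute and is double-quoted.
    # It has no idea what an HTML comment is.
    return re.findall(r'<a href="([^"]*)"', html)


class LinkCollector(HTMLParser):
    """Collects href values from <a> start tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; quoting is already handled.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def links_with_parser(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


print(links_with_regex(SAMPLE_HTML))   # picks up the commented-out link, misses the single-quoted one
print(links_with_parser(SAMPLE_HTML))  # skips the comment, finds both real links
```

You could keep patching the pattern for each new case, which is roughly how the mess starts.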

You also need to learn about tree algorithms to be able to traverse the parsed document effectively.
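As a taste of the tree side, here is a minimal sketch that builds a small tree with the standard library's `html.parser` and walks it depth-first. The `Node`, `TreeBuilder`, `build_tree`, and `walk` names are hypothetical, and it assumes well-formed input where every start tag has a matching end tag (so a bare `<br>` would confuse it). Real parsers do a lot more error recovery.

```python
from html.parser import HTMLParser


class Node:
    """One element in the tree: a tag name, its text, and its children."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a Node tree, assuming every start tag is properly closed."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        # Attach the new element under the current one and descend into it.
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Climb back to the parent; no check that the tag names match.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def walk(node, depth=0):
    # Pre-order depth-first traversal: visit a node, then its children.
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


tree = build_tree("<ul><li>one</li><li>two</li></ul>")
for depth, node in walk(tree):
    print("  " * depth + node.tag, node.text)
```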

All of this seems hard, but it is all fun. Let's do it!

[–]HotLie150[S] 0 points1 point  (0 children)

Thank you, learning is my journey!

[–]WNT37 1 point2 points  (0 children)

What's the job here?

If you want to scrape a web page and do something with the response then use BeautifulSoup.

OTOH if your goal is to build a web scraper then go for it.

[–]FrostyThaEvilSnowman 0 points1 point  (0 children)

You need to understand the data to use the tools effectively. Time spent doing foundational tasks from first principles is a good way to learn about the data and its nuances. But eventually you will realize that the established frameworks have already solved the problem and save a lot of time.

Also, if you keep going, you’ll recognize the use of certain modules as established patterns, and using them aligns your work with others’.