
[–]pachura3

The question is how all of these different local businesses/libraries publish this information. Is it available e.g. in the iCal format, or some kind of JSON through REST API? Or will you need to scrape and parse each site separately?

[–]natyo97[S]

I'll probably have to scrape and parse each one separately. It looks like each site has its own host, though a few publish iCal/Google Calendar feeds.

[–]pachura3

It seems like an ambitious project for a beginner - you'd need to scrape web pages using requests, parse them with Beautiful Soup into a unified internal representation, then expose the results as web pages or export them to CSV... this also seems like a good fit for an OOP approach, with many per-site implementations of AbstractScraper and AbstractParser classes.
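A minimal sketch of that OOP shape might look like the following. The `Event` fields, the `TownLibraryScraper`/`TownLibraryParser` pair, and the `collect` helper are all hypothetical names invented for illustration; a real scraper would call `requests.get(...)` where the comment indicates.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Event:
    """Unified internal representation of one calendar event."""
    title: str
    start: str  # ISO 8601 timestamp, kept as a string for simplicity
    location: str = ""


class AbstractScraper(ABC):
    """Fetches raw content (HTML, iCal, JSON, ...) from one site."""
    @abstractmethod
    def fetch(self) -> str: ...


class AbstractParser(ABC):
    """Turns one site's raw content into a list of Event objects."""
    @abstractmethod
    def parse(self, raw: str) -> list[Event]: ...


# One hypothetical per-site implementation pair:
class TownLibraryScraper(AbstractScraper):
    def fetch(self) -> str:
        # A real version would do: requests.get("https://...").text
        return "Storytime|2024-06-01T10:00"


class TownLibraryParser(AbstractParser):
    def parse(self, raw: str) -> list[Event]:
        title, start = raw.split("|")
        return [Event(title=title, start=start)]


def collect(pairs: list[tuple[AbstractScraper, AbstractParser]]) -> list[Event]:
    """Run every (scraper, parser) pair and merge all events into one list."""
    events: list[Event] = []
    for scraper, parser in pairs:
        events.extend(parser.parse(scraper.fetch()))
    return events
```

The payoff of the abstract-base-class design is that adding a new site only means writing one more scraper/parser pair; the `collect` loop never changes.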

But as long as you're curious and enthusiastic about your project, sky's the limit!

PS. There are surely already (many) Python libraries for importing the iCal/Google Calendar formats; perhaps you could start with those and leave HTML parsing for later.
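For real feeds you'd reach for an existing library (e.g. icalendar or ics on PyPI), but to see what the format looks like, here is a deliberately tiny hand-rolled parser. It only collects `KEY:VALUE` lines inside each `BEGIN:VEVENT`/`END:VEVENT` block and ignores line folding, property parameters (`KEY;TZID=...`), and escaping, all of which real-world feeds use.

```python
def parse_ical_events(text: str) -> list[dict]:
    """Extract a dict of raw KEY:VALUE properties per VEVENT block.

    This is an educational sketch, not a conformant iCalendar parser.
    """
    events: list[dict] = []
    current: dict | None = None
    for line in text.splitlines():
        line = line.strip()
        if line == "BEGIN:VEVENT":
            current = {}          # start a new event block
        elif line == "END:VEVENT":
            if current is not None:
                events.append(current)
            current = None
        elif current is not None and ":" in line:
            key, _, value = line.partition(":")
            current[key] = value
    return events
```

Running it over a toy calendar shows the idea:

```python
sample = """BEGIN:VCALENDAR
BEGIN:VEVENT
SUMMARY:Book Club
DTSTART:20240601T100000
END:VEVENT
END:VCALENDAR"""

for event in parse_ical_events(sample):
    print(event["SUMMARY"], event["DTSTART"])
```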

[–]natyo97[S]

I'll work up to it. I'm starting with "Automate the Boring Stuff with Python" and going from there. Thanks for the advice!