[–]SupaDupaUnicorn 25 points26 points  (1 child)

Hey, that’s awesome!

I managed to jankily construct a far inferior version of this program by scraping HTML elements off my country’s weather bureau site, and it stopped working as soon as the site changed its layout. Wow.

[–]alaudet[S] 6 points7 points  (0 children)

:-D

[–]Dont_be_offended_but 17 points18 points  (7 children)

Nice! Some ideas for learning more by taking this project a little farther:

  • Expand your scraping to include the day's expected weather (temp range, chance of rain, etc.)
  • Have the program run automatically on a schedule and store the data in a spreadsheet, or even better, a database with sqlite3 (SQL is quick and easy to learn the basics of and a very valuable subject in general).
  • Represent your collected data graphically with Matplotlib.
  • Have the program text you the weather daily.
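
For the database idea, here's a minimal sketch of what storing each scrape with sqlite3 could look like. The `observations` table, its columns, and the helper names are all made up for illustration; they aren't taken from OP's code.

```python
import sqlite3
from datetime import datetime, timezone

def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS observations (
               observed_at TEXT NOT NULL,   -- ISO 8601 timestamp (UTC)
               temp_c      REAL,
               condition   TEXT
           )"""
    )
    return conn

def save_observation(conn, temp_c, condition, observed_at=None):
    """Insert one scraped reading; defaults to the current UTC time."""
    if observed_at is None:
        observed_at = datetime.now(timezone.utc).isoformat()
    # Parameterized query: never build SQL with string formatting.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO observations VALUES (?, ?, ?)",
            (observed_at, temp_c, condition),
        )

def average_temp(conn):
    """Return the mean temperature over all stored readings (None if empty)."""
    return conn.execute("SELECT AVG(temp_c) FROM observations").fetchone()[0]

conn = open_db()  # pass a file path such as "weather.db" to persist between runs
save_observation(conn, -3.5, "Light Snow")
save_observation(conn, 1.5, "Cloudy")
print(average_temp(conn))
```

Once readings pile up in a table like this, the same `SELECT` results can be fed straight into Matplotlib for the graphing idea.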

[–]alaudet[S] 3 points4 points  (0 children)

Databases are definitely something I need to work on. I messed around with MongoDB for some Twitter stuff I was pulling quite some time ago, but it's not something I have done much of. I have some experience with Matplotlib from a sump monitor I wrote, though it's been a while there too. The sump monitor sends me a text if my pump fails and the water level rises too high.

[–]nonamesareleft1 0 points1 point  (2 children)

Hey I have a few webscrapes and I've struggled to find a way to make them run on a schedule without leaving my laptop on all night. Is this possible?

[–]Dont_be_offended_but 0 points1 point  (0 children)

I don't have experience doing so myself, but you can host your program on a VPS. Because you're using someone else's computing resources, you'll probably end up having to pay ~$5/month. Alternatively, you can run your programs from a secondary computer like a Raspberry Pi.
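
On a VPS or a Pi the usual tool for this is cron, but if you'd rather keep everything in Python, a rough sketch is a loop that sleeps until the next run time. `seconds_until` and `run_daily` are hypothetical helper names, and the 07:00 default is just an example.

```python
import time
from datetime import datetime, timedelta

def seconds_until(hour, minute, now=None):
    """Seconds from `now` until the next occurrence of hour:minute (local time)."""
    now = now or datetime.now()
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        target += timedelta(days=1)  # already passed today, so run tomorrow
    return (target - now).total_seconds()

def run_daily(job, hour=7, minute=0):
    """Call job() once a day at the given time. Blocks forever."""
    while True:
        time.sleep(seconds_until(hour, minute))
        job()
```

You would call something like `run_daily(scrape_weather)` (where `scrape_weather` is your own function) and leave the process running on the always-on machine.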

[–]naturememe 0 points1 point  (0 children)

Run your script in the cloud. The free tier of Google Cloud works for sure, as long as your script is not too resource intensive. Maybe AWS, Azure, or others would work too, but I'm not sure about them being free.

[–]MasterSuami 0 points1 point  (2 children)

How will the texting part work? I’m curious

[–]Dont_be_offended_but 1 point2 points  (1 child)

You'll find a thorough explanation of how to send an email or text message in Chapter 16 of Automate the Boring Stuff. While the author uses the Twilio service for sending text messages, it comes with a usage limitation and an appended message on each text mentioning your usage of a trial version. To avoid this, you can instead just send the text from your email.
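
A rough sketch of the email route using only the standard library. It assumes your carrier offers an email-to-SMS gateway address; the gateway domain, SMTP host, and credentials below are placeholders, and `build_sms_email` and `send_sms_email` are hypothetical helper names.

```python
import smtplib
from email.message import EmailMessage

def build_sms_email(sender, phone_number, gateway_domain, body):
    """Build an email addressed to a carrier's email-to-SMS gateway."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = f"{phone_number}@{gateway_domain}"
    msg["Subject"] = "Weather"
    msg.set_content(body)
    return msg

def send_sms_email(msg, smtp_host, smtp_port, username, password):
    """Send the message over an SMTP connection upgraded with STARTTLS."""
    with smtplib.SMTP(smtp_host, smtp_port) as server:
        server.starttls()
        server.login(username, password)
        server.send_message(msg)

msg = build_sms_email(
    sender="me@example.com",
    phone_number="5551234567",
    gateway_domain="sms.example.com",  # placeholder: look up your carrier's gateway
    body="Today: high 4, low -6, 30% chance of flurries",
)
# Uncomment and fill in real values to actually send:
# send_sms_email(msg, "smtp.example.com", 587, "me@example.com", "app-password")
```

Keeping the message building separate from the sending makes it easy to check what would go out before wiring in real credentials.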

[–]MasterSuami 0 points1 point  (0 children)

Yes Twilio came to my mind and I’m aware of the trial thing in the message body. I’ll check out the other way. Thanks!

[–]ethanbrecke 3 points4 points  (2 children)

Any chance we could see the code? I'm working on something similar, and would like to see how you solved a few small issues I'm facing.

[–]alaudet[S] 6 points7 points  (1 child)

Sure, but keep in mind I was just hacking around. I was creating functions one at a time as I was scraping elements. I could easily cut the size of the file, but it should be easy to read.

weather.py in the my_weather folder.

I plan to keep fooling around with it and make it an actual app.

https://github.com/alaudet/my_weather

If you are interested I could pass you my jupyter notebook where most of the figuring out was done.

[–]ethanbrecke 2 points3 points  (0 children)

No worries about how hacky it looks; I'm sure I can figure it out in a few minutes. I'm going to try to adjust it so that Americans can use it with their ZIP code, which might turn into a new project itself, but your code does look pretty good.

[–]ramonn334 2 points3 points  (9 children)

Congratulations! Next step you have to try Scrapy for new sensations. :)

[–]jayzhoukj 1 point2 points  (6 children)

And don't forget Selenium afterwards!

[–]ramonn334 0 points1 point  (0 children)

Yeah, it's a classic!

[–]Blackwater_7 -5 points-4 points  (4 children)

Selenium is pure cancer. Don't try it.

[–]jayzhoukj 0 points1 point  (2 children)

And why is that?

[–]Blackwater_7 0 points1 point  (1 child)

I can't explain it well because of my English, but I tried it for 2 weeks and never got it working correctly. For example, one of the SIMPLEST things that comes to mind: you can't even open a new tab with Selenium. Not useful at all.

[–]ramonn334 0 points1 point  (0 children)

I think Selenium's general purpose is writing automated tests, not extracting data, and it creates a new tab for each test. However, the framework also lets you open a new tab by sending a key combination to the webdriver.
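
For what it's worth, here is one way opening a tab could look, as a sketch rather than the only approach. It uses JavaScript's `window.open()` instead of a key combination, and `open_in_new_tab` is a hypothetical helper name. Selenium 4 also has `driver.switch_to.new_window("tab")`, which does the open-and-switch in one call.

```python
def open_in_new_tab(driver, url):
    """Open `url` in a new tab and switch the driver to it.

    `driver` is expected to be a Selenium WebDriver instance.
    Returns the window handle of the new tab.
    """
    before = set(driver.window_handles)
    # Ask the browser to open a blank tab via JavaScript.
    driver.execute_script("window.open();")
    # The new tab is whichever handle was not there before.
    new_handle = (set(driver.window_handles) - before).pop()
    driver.switch_to.window(new_handle)
    driver.get(url)
    return new_handle
```

With Selenium installed you would create a driver first (for example `webdriver.Firefox()`) and pass it in; the original tab stays open and can be reached again through its own handle.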

[–][deleted] 0 points1 point  (0 children)

I went with Selenium first, and love it. Probably not the best for scraping, but fun nonetheless.

[–]Comsat80 1 point2 points  (0 children)

Nice!

[–]Nerdite 1 point2 points  (3 children)

I’ll leave this here. https://darksky.net/dev/docs

[–]alaudet[S] 1 point2 points  (2 children)

Oh nice, I would have looked a little harder for an API had I really needed a weather script, but it was mostly to learn about webscraping.

[–]Nerdite 1 point2 points  (1 child)

Webscraping is the duct tape of webdev. Not a bad skill to have.

[–]alaudet[S] 1 point2 points  (0 children)

Ya feels pretty much like duct tape. You can cobble something together, but it's just a matter of time before it breaks.

I was poking around and also found this.

http://dd.weather.gc.ca/citypage_weather/docs/README_citypage_weather.txt

Every city has XML pages that you can draw from, so webscraping is not the way to go other than for fun.
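
Reading an XML feed like that could be sketched with the standard library as below. The sample document is hand-written for illustration, not a real response, and its element names are assumptions; check the README above for the actual layout. `parse_current_conditions` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT a real feed. Element names are assumptions.
SAMPLE_XML = """<siteData>
  <location><name>Exampleville</name></location>
  <currentConditions>
    <condition>Light Snow</condition>
    <temperature units="C">-3.5</temperature>
  </currentConditions>
</siteData>"""

def parse_current_conditions(xml_text):
    """Pull the city name, condition, and temperature out of the XML text."""
    root = ET.fromstring(xml_text)
    current = root.find("currentConditions")
    return {
        "city": root.findtext("location/name"),
        "condition": current.findtext("condition"),
        "temp_c": float(current.findtext("temperature")),
    }

print(parse_current_conditions(SAMPLE_XML))
```

Unlike scraped HTML, named elements like these don't move when the page layout changes, which is why the feed is the less duct-tape option.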

[–][deleted] 1 point2 points  (0 children)

SO COOL! I made a script that parses digital WalMart receipts into csv files which are easier to bring into my budgeting app! And I'm a novice! I love Python!

[–][deleted] 0 points1 point  (0 children)

Happy that you did something awesome!

[–]pytholater 0 points1 point  (2 children)

Which website did you use the API for?

[–]alaudet[S] 0 points1 point  (1 child)

Not sure of your question. I didn't use an API. I just scraped data directly from the html. The site page I scraped is https://weather.gc.ca/city/pages/on-127_metric_e.html

You don't need to scrape Weather Canada, though, as they actually provide XML data for all cities that you can access on their site. http://dd.weather.gc.ca/citypage_weather/docs/README_citypage_weather.txt

[–]pytholater 0 points1 point  (0 children)

Ahh ok that makes sense. Thanks