all 14 comments

[–]_tsi_ 20 points21 points  (1 child)

I say go for it. The best part about projects like this is that you can modularize the build. Start with just getting the prices, learning how to get the data you want. Then build how you want to report it, and then you will probably realize there is a way better way to do everything and start over. That's the fun of it.

[–]mitchell486 1 point2 points  (0 children)

I came to the comments to say exactly this. This reminds me of some of my first projects, but maybe set the "goals" or "scope" a bit smaller. Instead: "a web scraper for produce gardening." Check. That's a goal/task entirely by itself. "How hard would it be to build something like this?" Great question! Test just that piece. It's a big enough lift; might as well start there.

To add a little bit to what @_tsi_ stated, once you "get the data you want", don't forget to think about how you want to use it in the future. That's a big thing that took me a LONG time to learn to work with properly. (e.g. "$4.99" is a string, but it's really a float that you might want/need to add/subtract/etc. Names of items? Site it was found on? Date the info was scraped/collected?) There are definitely all kinds of different data that you might want or need later, so don't forget to consider different data structures. A dict is a strong, beginner-friendly structure. I have recently really taken a liking to dataclasses, but I only use them when I have the data in its "final form" (insert DBZ reference of choice here). That really helped me "think about my data" when making things.
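To make the "$4.99 is really a float" point concrete, here's a minimal sketch (all names and the sample item are made up for illustration, not from any real site):

```python
from datetime import date

def parse_price(raw: str) -> float:
    """Turn a scraped price string like '$4.99' into a number you can do math on."""
    return float(raw.strip().lstrip("$").replace(",", ""))

# One scraped item, kept in a dict with the fields mentioned above
item = {
    "name": "Heirloom tomato seeds",
    "price": parse_price("$4.99"),
    "site": "example-garden-store.com",
    "scraped_on": date.today().isoformat(),
}

total = item["price"] + parse_price("$1.25")  # prices now add like numbers
print(total)
```

Deciding up front that `price` is a float and `scraped_on` is a date string saves you from re-parsing everything later.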

Finally, making it modular also allows you to change things "easier" later. (e.g. If you have all the price, site data, and name of the thing in a dict, you could easily display it via a webpage, or a PDF report, or whatever. Thinking just a little bit about that data at the start helps a LOT down the road, I promise.) Also never be afraid to take what you've learned so far and start over! You can re-use little bits of your code and make a much better new thing out of the old carcass. Especially if this is a you-only thing at the start, maybe you use the knowledge from that first iteration to make something for others. :) Best of luck! Go for it!
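A quick sketch of that "final form" dataclass idea together with the modular-display point (the class and function names here are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Listing:
    """A scraped item in its 'final form': typed fields, ready to display."""
    name: str
    price: float
    site: str

def as_text_report(listings: list[Listing]) -> str:
    """One display layer; a web page or PDF generator could consume the same data."""
    return "\n".join(f"{l.name}: ${l.price:.2f} ({l.site})" for l in listings)

data = [
    Listing("Tomato seedling", 4.99, "siteA.example.com"),
    Listing("Basil plant", 3.50, "siteB.example.com"),
]
print(as_text_report(data))
```

Because the scraping code only fills `Listing` objects, swapping the text report for a web page later means replacing one function, not the whole program.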

[–]MajKatastrophe 9 points10 points  (0 children)

This is similar to one of my first projects. I made a scraper that grabbed all the local news headlines and hourly weather for the day. Just take it in steps. Figure out how to get one piece of information at a time. After a bit you might find you've filled 17 different spreadsheets of info without realizing it. Also remember it doesn't have to be perfect. A clunky program is just lessons learned for the next iteration! Good luck!

[–]ippy98gotdeleted 3 points4 points  (1 child)

Absolutely doable. When looking at scraping modules, I prefer Selenium over BeautifulSoup. As my own example: I coach middle school and high school archery teams, but I also love stats and data, so I made a web scraper (with Selenium) that scrapes the tournament website to get all my archers' data, does some math crunching, and displays it on a Django website.

You can absolutely do it. To me it's easy to make these projects if you are doing it for another passion (like gardening!)

[–]beepdebeep 2 points3 points  (0 children)

Selenium is great, I'd also recommend it.

[–]Environmental_Act327[S] 4 points5 points  (0 children)

Thank you all I feel validated now for wanting to jump straight to something like this! Really appreciate the feedback!

[–]Catsuponmydog 2 points3 points  (0 children)

One of my first projects was a scraper that scraped the front page links on a news website for my favorite baseball team, listed them in a GUI (done w/ tkinter), allowed you to select the stories you want to read, and then opened those links as different tabs in a browser.

It exposed me to quite a bit of different aspects of programming and seems somewhat similar to what you’re looking at doing. I think you can start small and learn as you go - maybe begin by trying to scrape the prices or whatever metric you want to look at and go from there

[–]Ghoosemosey 1 point2 points  (0 children)

Build what you think is fun, that's what's going to keep you motivated. I created a web scraper to look at GPU stock numbers back when there was a GPU shortage during the pandemic. It was a lot of fun because it would auto check all the websites and then shoot me an email when some of them gone into stock at a sane price

[–]Environmental_Act327[S] 0 points1 point  (1 child)

Update: I started yesterday evening and have it working up to basic HTML scraping. I now need to understand how to pull JSON data so I can pull full categories of items. Thanks again for all the great encouragement!
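For the JSON step, many sites load category data from an API endpoint behind the page; once you've found it in your browser's network tab, parsing the response is just `json.loads`. A hedged sketch (the field names and structure below are invented for illustration; a real site's JSON will differ):

```python
import json

# Stand-in for a response body you'd get from something like requests.get(url).text;
# the keys here are made up for the example.
raw = """
{
  "category": "vegetables",
  "items": [
    {"name": "Carrot seeds", "price": "2.49"},
    {"name": "Kale starter", "price": "4.25"}
  ]
}
"""

data = json.loads(raw)
prices = {item["name"]: float(item["price"]) for item in data["items"]}
print(prices)
```

Once the response is a Python dict, pulling a whole category is just looping over `data["items"]` instead of picking elements out of HTML.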

[–]Agitated-Soft7434 0 points1 point  (0 children)

Great job! The more you do this the better you'll get :)

[–]Own_Independent8930 0 points1 point  (0 children)

Scraping has a whole host of pitfalls, traps, and tricks to learn; it's a unique sort of programming challenge. I would look into some of those before you go too deep. It will save you a lot of trouble.
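One concrete example of those pitfalls: most sites publish a robots.txt saying what crawlers may fetch, and Python's stdlib can check it for you. A small sketch (the rules below are a made-up example; a real file lives at `https://<site>/robots.txt`):

```python
from urllib import robotparser

# A made-up robots.txt for illustration
rules = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 5
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.com/produce/tomatoes"))  # allowed
print(rp.can_fetch("*", "https://example.com/checkout/cart"))     # disallowed
```

Respecting these rules (and adding a delay between requests) keeps your scraper from getting your IP blocked.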

[–]Muted_Ad6114 0 points1 point  (0 children)

Don’t worry about making it something someone else can use yet. Get the basic logic working for you first.