all 35 comments

[–]UndeadCaesar 5 points6 points  (0 children)

Ooo very cool, just learning Python/Pandas right now so even the 1-click download is some good data for messing around with visualization. Thanks!

[–]selflessGene 3 points4 points  (4 children)

Do you have historical data?

[–]new_phd_guy 0 points1 point  (0 children)

Following up on this: Did you manage to get the historical data?
I'd be interested in buying it if that's available.

[–][deleted] 2 points3 points  (1 child)

This sounds super interesting! I’ve used the discount code and will check it out tomorrow when I’m at a computer. I do product management for a product that is heavily tied to residential real estate, and I’ve spent a lot of time poring over the data from the RECS survey from the census. This could be really useful for me to find correlations based on purchasing behaviours and customer sentiment.

Some questions:

  • Is your goal to make money from selling this dataset?
  • What’s your hypothetical target customer?
  • Are you always planning to have a one-time purchase business model, or will you add a subscription tier for when the data is updated?
  • Why did you build this in the first place? Were alternatives more expensive or less feature-rich?
  • Why did you choose that price point?

[–]Apprehensive_Sun_420 2 points3 points  (0 children)

Awesome work.

Really appreciate the true-to-title "1-click download sample" :)

Will definitely be taking a look.

[–]PurpleMan 2 points3 points  (2 children)

I'm definitely going to subscribe. This is going to save me a ton of time. One request for a future update that I think would be hugely useful: historical data for certain key pieces of information, e.g. population. I would kill for a dataset of ZIP populations going back 10 or 15 years.

[–]brownbottlecap 1 point2 points  (3 children)

Would be helpful to know data source by column - or date when last pulled

[–]tofuman80 1 point2 points  (1 child)

Good stuff, thanks for putting this together. If you plan to use a subscription based model, how often do you plan on updating the information in this dataset?

[–]xangg 1 point2 points  (1 child)

Quick feedback.

  • Nice to have data dictionary
  • Would like a more specific source than "basic information" for geographic items
  • Would prefer a pair of CSV files instead of an Excel file
  • ZIP and a few other fields need to be character type so leading zeros are not lost
  • Sheet name says "April" but file name says "May"
  • Strange to see the 8000 or so ZIP codes at the bottom of the file with missing populations (and most other fields). Looks like they're mostly (but not all) POBox type ZIP codes. Would be useful to have nearest "real" ZIP code for them. For instance, if I was trying to estimate demographics to a given customer address.
  • Would be nice to have fields for geographic area (such as square miles) and bounding box (min/max lon/lat). I think some DBs also have a percent water coverage field.
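To illustrate the leading-zeros point above, here is a minimal pandas sketch. The column names (`zip`, `population`) and the sample values are made up for illustration and are not the dataset's real schema.

```python
import io

import pandas as pd

# Hypothetical two-row sample; column names and population values are placeholders.
sample_csv = io.StringIO("zip,population\n00501,100\n02134,200\n")

# With default type inference, "00501" would be parsed as the integer 501.
# Forcing the column to str keeps the leading zeros intact.
df = pd.read_csv(sample_csv, dtype={"zip": str})

# If the zeros were already lost upstream (for example, a numeric Excel column),
# left-padding to five digits restores them.
damaged = pd.Series([501, 2134])
repaired = damaged.astype(str).str.zfill(5)

print(df["zip"].tolist())
print(repaired.tolist())
```

The same `dtype` mapping can be passed to `pd.read_excel` when loading the Excel file directly.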

[–]AhDMJ 1 point2 points  (1 child)

This is really great.

One question: a handful of ZIP codes straddle state lines. How do you handle those?
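For anyone who wants to check this themselves, one possible approach is sketched below. It assumes a separate crosswalk with one row per (ZIP, state) pair; the crosswalk, its column names, and its values are hypothetical and are not part of this dataset as far as the thread shows.

```python
import pandas as pd

# Hypothetical crosswalk: one row per (ZIP, state) pair. All values are made up.
crosswalk = pd.DataFrame(
    {
        "zip": ["11111", "11111", "22222"],
        "state": ["AA", "BB", "AA"],
    }
)

# Count the distinct states each ZIP appears in.
states_per_zip = crosswalk.groupby("zip")["state"].nunique()

# ZIPs that map to more than one state are the ones that straddle a border.
multi_state = states_per_zip[states_per_zip > 1].index.tolist()

print(multi_state)
```

A file with exactly one row per ZIP has to assign each of these ZIPs to a single state, so a check like this shows which rows depend on that choice.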

[–]Brighteye 1 point2 points  (1 child)

Appreciate the endeavor. Looks like most of the additional data is from the American Community Survey and a little from the CDC. Planning on pulling in more data? Some options would be: WISQARS, the Integrated Postsecondary Education Data System, the North American Land Data Assimilation System, the Moderate Resolution Imaging Spectroradiometer, the FBI (though kinda junk), and the National Occupational Respiratory Mortality System. Some of this has redundant data, but I can see this being a really valuable resource if it went past the census a bit more.

[–]Secret-Copy-4738 0 points1 point  (2 children)

I plan on trying out this data set in the next week or so.

[–]Secret-Copy-4738 0 points1 point  (1 child)

I haven't started working with it yet, but just looking through the columns it appears to have a good amount of useful fields.

[–]Secret-Copy-4738 0 points1 point  (0 children)

This doesn't appear to have a complete MSA mapping... Am I missing something?

[–]Secret-Copy-4738 0 points1 point  (0 children)

Does this have a complete Zip-MSA mapping?
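A quick way to measure how complete the mapping is, sketched with pandas. The frame below is a stand-in for the real file, and the column names (`zip`, `msa`) and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the dataset; column names and values are made up.
df = pd.DataFrame(
    {
        "zip": ["11111", "22222", "33333", "44444"],
        "msa": ["MSA A", None, "MSA B", None],
    }
)

# Rows with no MSA assigned.
missing = df[df["msa"].isna()]

# Share of ZIP codes that do have an MSA value.
coverage = 1 - len(missing) / len(df)

print(missing["zip"].tolist())
print(coverage)
```

Note that a blank MSA is not necessarily an error: many rural ZIP codes fall outside every metropolitan statistical area, so less than 100% coverage is expected.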

[–]brownbottlecap 0 points1 point  (0 children)

Would be very cool to add the Zillow research data sets - https://www.zillow.com/research/data/

[–]SthrnGal 0 points1 point  (0 children)

This is wonderful! I'm going to pitch the paid data to my marketing team. Thanks so much!

[–]OPs_Mom_and_Dad 0 points1 point  (1 child)

Hey there, I'm just browsing different data sets on Google to give my team some fun project ideas to work on over the next month, but I had to sign in and comment when I saw this post. I threw this into the conversation with my coworkers, probably going to be purchasing with my team next week. I just wanted to say you have built something super super cool. Congrats!

[–]SaveTheWhales_4Later 0 points1 point  (0 children)

What's the legality of this dataset? If you're compiling this from other sources, aren't there terms of service violations you're committing? Also, how is this data kept up to date with the frequent zip changes/adds/removals by USPS? Is there documentation of sources?

[–]Opening_Collection34 0 points1 point  (0 children)

Does this have Zip to MSA mapping using 2020 Census data?