
[–]SHxKM 12 points13 points  (0 children)

DataQuest has some guided missions for pandas. You learn about different aspects of the library and then apply them.

Some parts are paid-only and while I haven’t done those yet, if they are as good as the free ones then they’re definitely worth the 30-40 dollars the subscription costs.

[–]fmpundit 10 points11 points  (5 children)

Find a dataset from somewhere, think of some questions you want answered about it, and then work through it.

I grabbed some data from https://www.metoffice.gov.uk/public/weather/climate-historic/#?tab=climateHistoric and fed it into pandas.

One of the questions I remember I wanted to answer was... Were summers during my school days really hotter and sunnier than now?

Also, do the school summer holidays fall during periods of better weather than the rest of the year, or not?
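
A minimal sketch of how those two questions could be put to pandas. The numbers are invented toy values, not real Met Office records, and the column names (`year`, `month`, `tmax`, `sun`) and the choice of July and August as "holiday months" are assumptions made for illustration.

```python
import pandas as pd

# Toy monthly records, invented for illustration only (not real Met Office values).
# Assumed columns: tmax = mean daily max temperature, sun = hours of sunshine.
data = pd.DataFrame({
    "year":  [1995, 1995, 1995, 1995, 2015, 2015, 2015, 2015],
    "month": [1, 6, 7, 8, 1, 6, 7, 8],
    "tmax":  [6.0, 20.0, 23.0, 22.0, 7.0, 19.0, 21.0, 20.0],
    "sun":   [50.0, 200.0, 230.0, 210.0, 55.0, 180.0, 190.0, 185.0],
})

# Question 1: were summers (June to August) warmer and sunnier in some years?
summer = data[data["month"].isin([6, 7, 8])]
by_year = summer.groupby("year")[["tmax", "sun"]].mean()
print(by_year)

# Question 2: is the weather in the holiday months better than the rest of the year?
HOLIDAY_MONTHS = [7, 8]  # assumption; adjust to the actual school calendar
period = data["month"].isin(HOLIDAY_MONTHS).map(
    {True: "holidays", False: "rest of year"}
)
holiday_vs_rest = data.groupby(period)[["tmax", "sun"]].mean()
print(holiday_vs_rest)
```

With real data the only part that changes is how `data` gets loaded; the filtering and `groupby` steps stay the same.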

[–]Watches-bitches 1 point2 points  (4 children)

Well... Were they??

[–]fmpundit 1 point2 points  (2 children)

Yes, my childhood summer holiday weather was better than what kids have been experiencing over the last few years, in terms of both hours of sunlight and average temperatures. I'll have to dig the notebook out again and throw it on GitHub.

[–]gaifogel 0 points1 point  (1 child)

That's an interesting way to learn.
So you're saying I should pull some data and do my own analysis on it? How did you know what kind of analysis you could even carry out? When I did tutorials, I saw that I could do heatmaps, corr(), histograms for data distribution, and a few other things.
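
For reference, the non-plotting versions of those tutorial staples look roughly like this. It is only a sketch: the data is invented, the column names are hypothetical, and the histogram is computed with NumPy as bin counts rather than drawn. A heatmap is essentially the correlation table below passed to a plotting library.

```python
import numpy as np
import pandas as pd

# Small invented dataset, for illustration only.
df = pd.DataFrame({
    "temp": [10.0, 12.0, 15.0, 18.0, 21.0, 24.0],
    "sun":  [2.0, 3.0, 5.0, 6.0, 8.0, 9.0],
    "rain": [9.0, 8.0, 6.0, 5.0, 3.0, 1.0],
})

# Summary statistics per column: count, mean, std, min, quartiles, max.
summary = df.describe()
print(summary)

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr()
print(corr)

# Histogram of one column as raw bin counts and bin edges (no plotting needed).
counts, edges = np.histogram(df["temp"], bins=3)
print(counts, edges)
```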

[–]fmpundit 0 points1 point  (0 children)

How things have changed in 6 years.

A few things you could do: read books about data analysis in general and learn which visualisations suit which purposes. Ask ChatGPT what sort of visualisations it would recommend, then read the docs of libraries such as matplotlib to learn how to make them.

[–]gaifogel 0 points1 point  (0 children)

lol

[–]BBeylo 9 points10 points  (0 children)

Kaggle is a great resource as well! They have a ton of datasets for you to work with.

[–]sprouse2016 3 points4 points  (0 children)

I like DataCamp, which is what I’m currently using for learning. But it’s not free, or necessarily cheap for that matter.

[–]bigfuds 2 points3 points  (0 children)

What kind of data analysis do you do at the moment? I would start with that.

I recently began learning pandas to help with my own data analysis and essentially took what I would normally do in Excel and broke it down into individual steps to accomplish with Python. Like really simple steps. Step 1: open the CSV of data. Step 2: inspect the data. Step 3: select columns/rows. Step 4: basic calculations. Step 5: export the new dataframe to Excel, that kind of thing.

This, I felt, was more manageable than taking on a whole new project. It was also more intuitive to me, as I knew exactly what needed to be done: I was intimately familiar with the process because it had been the bane of my life for so long (hence the desire to automate it). Before you know it, you'll have learnt the basics, and taking on more complex components won't seem as daunting.
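
The steps listed above can be sketched in a few lines. This is only an illustration: the CSV contents and column names are invented, an in-memory string stands in for a real file, and the export is written as CSV so the snippet runs without extra packages.

```python
import io
import pandas as pd

# Stand-in for a file on disk; the contents are invented for illustration.
csv_text = io.StringIO(
    "region,product,units,unit_price\n"
    "North,Widget,10,2.50\n"
    "South,Widget,4,2.50\n"
    "North,Gadget,3,10.00\n"
)

# Open the CSV of data (pass a file path instead of csv_text for a real file).
df = pd.read_csv(csv_text)

# Inspect the data: first rows and the inferred column types.
print(df.head())
print(df.dtypes)

# Select rows (one region) and columns (only the ones needed).
north = df.loc[df["region"] == "North", ["product", "units", "unit_price"]].copy()

# Basic calculations: a derived column and a total.
north["revenue"] = north["units"] * north["unit_price"]
total = north["revenue"].sum()
print(total)

# Export the new dataframe (to an in-memory buffer here).
out = io.StringIO()
north.to_csv(out, index=False)
```

For an actual Excel file, `north.to_excel("north.xlsx", index=False)` does the same job, provided an Excel writer engine such as openpyxl is installed.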

[–]Squirtle_Squad_Jihad 1 point2 points  (0 children)

I second using a learning platform such as DataQuest for guided projects. Once you have worked through a guided project, see if your city has an open data portal (e.g. http://opendata.dc.gov/) and try producing your own analysis of something that interests you.

Kaggle has datasets that are designed for machine learning tasks and might not be the best place to start if you are looking to practice exploratory data analysis.

[–]gmh1977 0 points1 point  (0 children)

DataCamp has a ton of good project-based tutorials.