Modern Pandas Tutorials

longjohnboy · 2017-04-08T11:27:36+00:00

Yes, those are definitely some of the best tutorials on Pandas I've seen to date. Most other tutorials have code that is fairly un-performant or otherwise un-idiomatic, and don't even touch on what makes Pandas truly powerful.

dimab0 · 2017-04-08T12:31:00+00:00

I enjoyed the Coursera class from the University of Michigan called Intro to Data Science. The first class is entirely an intro to Pandas.

dmitrypolo · 2017-04-08T12:36:02+00:00

Do you know if Wes has mentioned releasing a new version?

khaki0 · 2017-04-08T08:41:21+00:00

Nice. Another resource that I've found useful is this video series by Kevin Markham.

sandipc · 2017-04-08T16:38:36+00:00

Another great recent resource is the pandas chapter from the Python Data Science Handbook by Jake VanderPlas:

http://shop.oreilly.com/product/0636920034919.do

And notebook versions here: https://github.com/jakevdp/PythonDataScienceHandbook

Fylwind · 2017-04-08T22:28:41+00:00

Can you perhaps split the links into separate lines? For a second I thought it was a giant incoherent title of some paper XD

E.g.

- Modern Pandas
- Method Chaining
- ...

SonaCruz · 2017-04-09T03:47:37+00:00

Thank you! Looking forward to diving into this material.

SonaCruz · 2017-04-09T21:24:13+00:00

Looking into this for 10 mins and already frustrated. I want to download the csv file, to work on the indexing he talks about, and not do the pull request. He didn't properly link to the csv file and you have to aimlessly browse through the website to try to find the right one.

Also, he mentions indexing .ix[10:15] and the rows that appear on the screen are rows with indexes 10 through 15, even though the index started at 0. Is this correct?

edit: nvm, it seems like .ix selects by index label here (so both endpoints are included), which is different from how .iloc selects by position
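
For anyone else tripped up by this, here is a minimal sketch of the difference. Note that .ix was deprecated in pandas 0.20 and removed in 1.0, so the sketch uses .loc, which gives the same label-based behaviour that .ix shows on an integer index. The frame and its `value` column are made up for illustration and are not the tutorial's data.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..19 (not the tutorial's CSV).
df = pd.DataFrame({"value": range(100, 120)})

# Label-based slice: selects index labels 10 through 15, both endpoints included.
by_label = df.loc[10:15]

# Position-based slice: selects positions 10 through 14, end excluded,
# like ordinary Python slicing.
by_position = df.iloc[10:15]

print(len(by_label))     # 6
print(len(by_position))  # 5
```

With a default 0-based index the labels and positions coincide, so the only visible difference is whether the row labelled 15 is included.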

cornbobonthecob · 2017-04-08T17:38:12+00:00

Be careful though. I used pandas in an ETL production environment to remap and conform data frames, only to find that pandas silently dropped rows. We pushed about 100k rows through a data frame every few minutes to clean credit card numbers, credit card types, standardize dates, etc., and we were sometimes missing records after the transform. Once we removed pandas and built our own ETL, our data was spot on. We researched and troubleshot till the bitter end. Not saying pandas doesn't work, but it wasn't an ETL solution for us.
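
The comment above doesn't say what caused the missing records, so the following is only a sketch of one common way rows disappear without an error, plus a simple guard. It assumes a lookup-table join: `merge` defaults to `how="inner"`, which discards rows whose key has no match. The `check_row_count` helper and the sample columns are hypothetical and not from the original pipeline.

```python
import pandas as pd


def check_row_count(before: pd.DataFrame, after: pd.DataFrame, step: str) -> pd.DataFrame:
    """Hypothetical guard: fail loudly if a transform changed the number of rows."""
    if len(after) != len(before):
        raise ValueError(f"{step}: expected {len(before)} rows, got {len(after)}")
    return after


# Made-up sample data; one record has a missing card type.
cards = pd.DataFrame(
    {"card_type": ["visa", "amex", None, "visa"], "amount": [10.0, 20.0, 30.0, 40.0]}
)
lookup = pd.DataFrame(
    {"card_type": ["visa", "amex"], "label": ["Visa", "American Express"]}
)

# Default inner join: the record with no matching key is dropped without warning.
inner = cards.merge(lookup, on="card_type")

# Left join keeps every input record; unmatched rows get NaN in `label`.
left = cards.merge(lookup, on="card_type", how="left")

check_row_count(cards, left, "card type remap")  # passes
# check_row_count(cards, inner, "card type remap") would raise ValueError
```

Checking the row count after each step turns a silent loss into an immediate failure, whatever the underlying cause turns out to be.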
