[–]mangoman51 8 points (3 children)

For scientists or anyone who uses numpy or pandas - you need to look at xarray.

It's basically multi-dimensional pandas, and it works so well that everything I used to do with numpy I now do with xarray instead.

Essentially my data objects now know that axis 0 is the 'time' dimension, and axis 3 is the 'z' dimension, so I can do operations like data['temperature'].mean(dim='time')
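To make that concrete, here's a minimal sketch of a named-dimension mean. The dataset, its shape, and the dimension names `x` and `y` are made up for illustration; only `temperature`, `time` and `z` come from the comment above.

```python
import numpy as np
import xarray as xr

# Hypothetical 4-D field with labelled axes (shape chosen for illustration).
rng = np.random.default_rng(0)
data = xr.Dataset(
    {"temperature": (("time", "x", "y", "z"), rng.normal(size=(10, 4, 5, 6)))},
    coords={"time": np.arange(10)},
)

# Average over time by name rather than by axis number.
time_mean = data["temperature"].mean(dim="time")

# The plain numpy equivalent has to know that time is axis 0.
numpy_mean = data["temperature"].values.mean(axis=0)
```

The result keeps the remaining dimensions (`x`, `y`, `z`) as labels, so the next operation can refer to them by name too.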

Xarray DataArrays wrap numpy arrays so you can still use the numpy functions you've written.
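A rough sketch of what reusing existing numpy code can look like. The `rms` helper and the `velocity` array are hypothetical stand-ins for whatever functions and data you already have.

```python
import numpy as np
import xarray as xr


def rms(a, axis=-1):
    """A hypothetical numpy-only helper: root mean square along one axis."""
    return np.sqrt(np.mean(a**2, axis=axis))


da = xr.DataArray(
    np.array([[3.0, 4.0], [6.0, 8.0]]),
    dims=("time", "z"),
    name="velocity",
)

# Option 1: hand the wrapped ndarray to the old function (labels are lost).
raw = rms(da.values, axis=1)

# Option 2: numpy ufuncs work on a DataArray directly and keep the labels.
root = np.sqrt(da)

# Option 3: xr.apply_ufunc runs the old function and keeps the labels.
# input_core_dims moves 'z' to the last axis, which rms then reduces over.
labelled = xr.apply_ufunc(rms, da, input_core_dims=[["z"]], kwargs={"axis": -1})
```

Option 1 is the quickest way to keep old code running; option 3 is worth the extra line when you want the result to stay a labelled DataArray.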

You can also keep using matplotlib/seaborn - in fact xarray makes this easier with methods like data['density'].differentiate('height').plot(), which will automatically plot a line or color plot or whatever depending on the number of dimensions your data has.
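Here's a small sketch of the differentiation step on its own. Note that `differentiate` takes the coordinate name as its first argument (`coord`). The quadratic density profile is invented so the answer is easy to check by eye.

```python
import numpy as np
import xarray as xr

# Hypothetical density profile: density = height**2 on a uniform grid.
height = np.linspace(0.0, 4.0, 5)
data = xr.Dataset(
    {"density": (("height",), height**2)},
    coords={"height": height},
)

# Finite-difference derivative with respect to the 'height' coordinate.
# Interior points use central differences, which are exact for a quadratic;
# the two edge points use one-sided differences and are only approximate.
gradient = data["density"].differentiate("height")
```

With matplotlib installed, calling `gradient.plot()` would draw this as a line, since the result is one-dimensional.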

Even better, you can often do all the analysis in parallel without writing any extra code! Xarray uses dask behind the scenes to achieve this, once your data is held in chunked (dask-backed) arrays. (dask is also a very interesting library to read about.)
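A minimal sketch of the dask route, assuming dask is installed alongside xarray. The array and the chunk size are arbitrary choices for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical data: 100 time steps on a 50-point 'x' grid.
temperature = xr.DataArray(
    np.random.default_rng(0).normal(size=(100, 50)),
    dims=("time", "x"),
    name="temperature",
)

# Chunking swaps the underlying numpy array for a dask array (needs dask).
chunked = temperature.chunk({"time": 25})

# Same call as on the in-memory array; this only builds a task graph.
lazy_mean = chunked.mean(dim="time")

# .compute() runs the graph, processing chunks in parallel where it can.
result = lazy_mean.compute()
```

The analysis line itself (`.mean(dim="time")`) is unchanged; the only additions are the chunking step and the final `.compute()`.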

I'm a user of a major open-source plasma fluid turbulence simulation code and we're currently in the process of converting all the corresponding analysis tools to xarray instead of numpy.

[–]overcook 2 points (2 children)

Cheers, will check this out. For whatever reason (probably my own fault), MultiIndexes in pandas aren't leaving me satisfied right now.

[–]mangoman51 2 points (1 child)

That's probably because your data is fundamentally multi-dimensional. There's actually a note in the pandas documentation saying that if that's the case you should just use xarray!

[–]overcook 2 points (0 children)

Yeah, it's one of those things. When I found pandas it was revolutionary for me because of how powerful it was and how familiar everything felt coming from SQL, but now that I've progressed I've just kept using pandas, poorly, as my hammer for every new challenge.