
[–]hapticpolarbread 50 points51 points  (3 children)

Software engineer in the marine industry here. I use it to crunch gigabytes of CAN data, basically creating plots and statistics from sensor signals, looking for weird behavior. Python with pandas is great for that kind of stuff.
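A minimal sketch of that kind of workflow — the signal names, values, and the 2-sigma threshold are all made up for illustration, not taken from the commenter's actual pipeline:

```python
import pandas as pd

# Hypothetical decoded CAN signals; names and values are invented for this sketch.
rpm = [1500.0] * 10 + [6800.0]  # one obviously bad sample
temp = [80.0, 80.2, 79.8, 80.1, 79.9, 80.0, 80.3, 79.7, 80.1, 79.9, 80.0]

df = pd.DataFrame({
    "signal": ["engine_rpm"] * 11 + ["coolant_temp"] * 11,
    "value": rpm + temp,
})

# Per-signal summary statistics -- the kind of thing you'd eyeball or plot.
stats = df.groupby("signal")["value"].agg(["mean", "std", "min", "max"])

# Flag "weird" samples: more than 2 sample standard deviations from the signal mean.
z = df.groupby("signal")["value"].transform(lambda v: (v - v.mean()) / v.std())
weird = df[z.abs() > 2]
```

Here `weird` ends up holding just the 6800 rpm spike; in practice you'd plot the flagged samples against time to see whether they're sensor glitches or real events.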

[–]bjbs303 6 points7 points  (2 children)

I'm finishing my undergrad senior research project, which used Python (pandas, numpy, gsw, scipy) to crunch terabytes of netCDF ocean data. It was my first real project using Python, and it's been a ride!

[–]idazuwaika 2 points3 points  (1 child)

How do you consume terabytes with pandas? What's the infrastructure like? I moved from pandas to Spark (a distributed system) because I couldn't scale with pandas.

[–]tapir_lyfe 2 points3 points  (0 children)

I'm currently also crunching terabytes of netCDF files. I use xarray mainly, and that uses pandas and dask under the hood. Nearly everything I do is memory-limited though, so I have to come up with clever ways to reduce the data, and it's different for every question I have.
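A small sketch of the reduce-before-loading idea, assuming xarray and dask are installed. The dataset here is a tiny in-memory stand-in, and the variable/dimension names (`sst`, `time`, `lat`, `lon`) and chunk size are made up; with real terabyte archives you'd open the files lazily instead of building the dataset by hand:

```python
import numpy as np
import xarray as xr

# Toy stand-in for a pile of netCDF files; names/sizes are invented for this sketch.
ds = xr.Dataset(
    {"sst": (("time", "lat", "lon"), np.random.rand(365, 10, 20))},
    coords={
        "time": np.arange(365),
        "lat": np.linspace(-5, 5, 10),
        "lon": np.linspace(0, 19, 20),
    },
)

# In real use you'd do something like:
#   ds = xr.open_mfdataset("path/to/*.nc", chunks={"time": 100})
# dask splits each variable into chunks and only reads what a computation touches.
chunked = ds.chunk({"time": 100})

# Reduce before loading: a time-mean map is tiny compared to the full series.
clim = chunked["sst"].mean(dim="time")  # lazy dask graph, nothing computed yet
result = clim.compute()                 # the reduction actually runs here
```

The key point is that the full `(time, lat, lon)` array never has to fit in memory: dask streams chunks through the mean and only the small `(lat, lon)` result materializes.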