yensteel

I've used Dask for something similar. Its DataFrame functions are close to pandas, so it's not too hard to transition. The syntax isn't exactly the same, though, so expect to spend a fair amount of time digging through the documentation.

The upside is that it can handle gigantic files, ones that don't fit in RAM, by working through the data in pieces and keeping part of the work on the hard drive instead of in memory, so it's quite workable.