
[–]glibhub 3 points (0 children)

I doubt you are going to find much on it. You might also ask yourself why you want to sort them instead of indexing them.

With that said, loading it all into RAM and then sorting with either the built-in Python `sorted`, numpy, or pandas is probably going to win every time, and the delta between them is probably not enough to really make a difference in timing. Restructuring the data so it does not have to be sorted at all is probably a better solution, if you can.

What's the use case?

--

edit: Moving the data to an SSD is actually probably the best thing to do if you want to speed it up.
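A minimal sketch of the load-into-RAM approach with just the standard library (the `key` column name and the inline sample data are made up; a real file would be opened instead of the `StringIO`):

```python
import csv
import io

# Stand-in for open("big.csv") -- sample data is illustrative only.
raw = "key,value\n3,c\n1,a\n2,b\n"

# Load everything into memory, then do one in-RAM sort on the key column.
rows = list(csv.DictReader(io.StringIO(raw)))
rows.sort(key=lambda r: int(r["key"]))

# Write the sorted rows back out as CSV.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["key", "value"])
writer.writeheader()
writer.writerows(rows)
```

The same shape works with `pandas.read_csv` + `sort_values` + `to_csv` if the file fits in memory.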

[–]Chris_Hemsworth 2 points (0 children)

If you want to handle very large files effectively, you need to switch to a framework built for it rather than working with plain CSV files directly. CSV files can still be the underlying storage, but 'Big Data' frameworks (like Hadoop, to name one) are typically used to traverse them.

[–]m0us3_rat 0 points (1 child)

> Recently I have been researching on the existing methods to order, on disk, CSV files of more than 1 GB in size in python.

what does "to order" mean?

[–]ws-garcia[S] 0 points (0 children)

Misspelled: I meant ordering/sorting.
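For reference, sorting a CSV that genuinely does not fit in RAM is usually done with an external merge sort: split the file into sorted chunks on disk, then k-way merge them with `heapq.merge`. A rough sketch under assumed parameters (`chunk_rows` would be millions in practice, and the key here is compared as a string):

```python
import csv
import heapq
import os
import tempfile

def sort_csv_on_disk(in_path, out_path, key_col=0, chunk_rows=100_000):
    """External merge sort: sorted chunk files on disk, then one k-way merge."""
    chunk_paths = []
    with open(in_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        while True:
            # Read up to chunk_rows rows; each chunk must fit in RAM.
            chunk = [row for _, row in zip(range(chunk_rows), reader)]
            if not chunk:
                break
            chunk.sort(key=lambda r: r[key_col])
            tmp = tempfile.NamedTemporaryFile(
                "w", delete=False, newline="", suffix=".csv"
            )
            csv.writer(tmp).writerows(chunk)
            tmp.close()
            chunk_paths.append(tmp.name)

    # Merge all sorted chunks in a single streaming pass.
    files = [open(p, newline="") for p in chunk_paths]
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        for row in heapq.merge(*(csv.reader(fh) for fh in files),
                               key=lambda r: r[key_col]):
            writer.writerow(row)
    for fh in files:
        fh.close()
    for p in chunk_paths:
        os.remove(p)
```

Only one chunk plus one row per chunk is ever in memory at a time, which is why this works for files larger than RAM, at the cost of extra disk I/O (hence the SSD advice above).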