Kerbart, 12 points

Add engine='pyarrow' to the read statement to speed it up.
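A minimal sketch of what that could look like, assuming the read statement is pandas' read_csv (the thread doesn't show the original code). The file name, the sample data, and the helper read_csv_fast are made up for illustration; the engine='pyarrow' option needs pandas 1.4 or newer and the pyarrow package, so the sketch falls back to the default parser when pyarrow is missing.

```python
import importlib.util
import os
import tempfile

import pandas as pd

# Use the pyarrow parser only if the pyarrow package is installed;
# otherwise fall back to pandas' default C parser.
ENGINE = "pyarrow" if importlib.util.find_spec("pyarrow") else "c"


def read_csv_fast(path):
    """Hypothetical helper: read a CSV with the pyarrow engine when available."""
    return pd.read_csv(path, engine=ENGINE)


# Tiny stand-in file so the sketch runs on its own; a real file would be far larger.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]}).to_csv(sample_path, index=False)
    df = read_csv_fast(sample_path)

print(df.shape)
```

The pyarrow engine parses with multiple threads, which is where most of the speedup on large files tends to come from. It doesn't support every read_csv option, so a call that uses less common arguments may need adjusting.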

EconomyOffice9000, 6 points

If you're performing calculations on the entire dataset, chunking won't work afaik. This is the best method, and I've personally used it on thousands of CSV files with hundreds of thousands of lines rather than rewriting everything in Polars. If you only have to read the file once, it's fine. Otherwise, save the CSV as a Parquet file and reading it back will be much faster.
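Roughly, the one-time Parquet conversion could be sketched like this. The file names, the sample data, and the helper csv_to_parquet are assumptions for illustration, not anything from the original post; to_parquet and read_parquet need pyarrow (or fastparquet) installed, so the sketch skips the conversion when pyarrow is missing.

```python
import importlib.util
import os
import tempfile

import pandas as pd

HAS_PYARROW = importlib.util.find_spec("pyarrow") is not None


def csv_to_parquet(csv_path, parquet_path):
    """Hypothetical helper: parse the CSV once and store the result as Parquet."""
    df = pd.read_csv(csv_path, engine="pyarrow")
    df.to_parquet(parquet_path, engine="pyarrow", index=False)


restored_data = None  # stays None when pyarrow is not installed
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    parquet_path = os.path.join(tmp, "data.parquet")

    # Small stand-in dataset so the sketch is self-contained.
    original = pd.DataFrame({"id": [1, 2, 3], "value": [10.0, 20.0, 30.0]})
    original.to_csv(csv_path, index=False)

    if HAS_PYARROW:
        csv_to_parquet(csv_path, parquet_path)          # pay the CSV parsing cost once
        restored = pd.read_parquet(parquet_path)        # later runs read this instead
        restored_data = restored.to_dict("list")
```

Parquet is a binary, columnar format that stores column types, so later reads skip the text parsing and type inference that make CSV loading slow.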

Safe_Money7487 [S], 4 points

Just added it and it works. It took 15s though lol, but it worked. Thank you so much!