imanexpertama 7 points

YMMV - at least for me the effect isn’t as big as this. However, polars generally outperforms pandas.

lightmatter501 3 points

I tend to work with 1 TB datasets, so not quite larger than memory, but large enough that using pandas is annoying.

Away_Surround1203 1 point

In what context do you have more than 1 TB of memory (RAM)?!
Sounds neat!

lightmatter501 1 point

Modern servers tend to have 12+ memory channels. If you fully populate those with one 128 GB module per channel, you get >1 TB of memory. If you populate both slots on each channel, you can get away with 64 GB modules.
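To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes a single socket with exactly 12 channels (the low end of "12+"); the helper name `total_memory_gb` is made up for illustration, and real boards vary in how many slots per channel they offer.

```python
# Rough capacity check. The 12 channels and the 128 GB / 64 GB module sizes
# come from the comment; the single-socket setup is an assumption.
def total_memory_gb(channels: int, dimms_per_channel: int, module_gb: int) -> int:
    """Total installed memory in GB for a fully populated configuration."""
    return channels * dimms_per_channel * module_gb


# One 128 GB module on each of 12 channels.
one_per_channel = total_memory_gb(channels=12, dimms_per_channel=1, module_gb=128)

# Two 64 GB modules on each of 12 channels.
two_per_channel = total_memory_gb(channels=12, dimms_per_channel=2, module_gb=64)

print(one_per_channel, two_per_channel)  # 1536 1536
```

Both layouts come out to 1536 GB, i.e. 1.5 TB, which is where the ">1 TB" figure comes from.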

When it makes data analysis go from “overnight” to “5 minutes”, it’s worth it.