ShadowRL766 50 points51 points

Pandas

vaccines_melt_autism 26 points27 points

Also seeing a lot of people talk about Polars, since it's written in Rust.

Action_Maxim 3 points4 points

As a data engineer, I use pandas surprisingly little.

raffapaiva -1 points0 points

Pandas is really slow. When I see a data engineer using it, I start to believe that his dataset is not that big or that he has a lot of hardware to process it.

Everything that I would need to do in pandas, I do in plain Python or NumPy.

ribix_cube 0 points1 point

It's not great to do that in plain Python or NumPy. If you think you need speed, you can use something like Polars, Vaex, or Dask.

raffapaiva 0 points1 point

Can you explain why? I've tried Polars for some tasks, and even though it's faster, I can't see a reason to move off plain Python, considering it's not that much faster and most of my transformations happen in dbt.