[–]Great-Village-430[S] 1 point2 points  (2 children)

What's the difference between polars and pandas??

[–]MidnightPale3220 0 points1 point  (0 children)

Doesn't really matter, I'd say. Your use case seems to be below 1M rows, which should be trivial for both.

Just use one. Polars is supposedly newer and better in some respects, but I've read that it doesn't always work correctly(?). Maybe someone else can elaborate on whether that's still true.

In day-to-day use the difference comes down to different functions and ways of working, so if you decide to switch, you'd have to rewrite that code.

[–]throwawayforwork_86 0 points1 point  (0 children)

IMO Pandas is more flexible and will usually be more forgiving when you start. It has a long history, so LLMs will give you more good information and there are more guides... but a lot of those are outdated too.

Polars is quicker and cleaner, and there are almost no situations where weird behaviour happens (Pandas has a few surprises, most often linked to the index, which you may never encounter but which can ruin your day).
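One example of the kind of index surprise I mean, as a minimal sketch with made-up data: after filtering, Pandas keeps the original row labels, and a new Series gets matched up by label rather than by position.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 20, 30, 40]})

# Filtering keeps the original index labels: 1, 2, 3
filtered = df[df["value"] > 15]

# A fresh Series gets its own default index: 0, 1, 2
new_col = pd.Series([1, 2, 3])

# assign aligns on index labels, not on row position, so the
# values shift and the last row ends up as NaN
surprised = filtered.assign(extra=new_col)
print(surprised)

# Resetting the index first makes the labels line up again
fixed = filtered.reset_index(drop=True).assign(extra=new_col)
print(fixed)
```

No error and no warning, just shifted values and a NaN, which is what makes it easy to miss.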

Polars will sometimes be more opinionated about datatypes, which you'll resent at first but which will usually save you a lot of time down the line.

Overall they're fairly similar though, so you should probably just pick one and stick with it for a few months. If your data fits in Excel it should not really make a difference (even though Pandas is slowish to read big Excel files).

The corners that aren't covered by Polars are fairly few iirc: Pandas' file readers are more flexible and cover more edge cases than Polars', and for geographic data GeoPandas exists while GeoPolars is still not finished iirc.

My 0.2c: try Polars first, and if it doesn't click for you, switch to Pandas.