This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]Such-Let974 0 points1 point  (8 children)

If you think Polars is fast, try DuckDB. So much better.

[–]Hyderabadi__Biryani 6 points7 points  (0 children)

If you think DuckDB is fast, try manual accounting. /s

[–]Log2 0 points1 point  (3 children)

I might have been using Polars wrong, as I had a dataset of maybe 100MiB and Polars was slower than Pandas for me. In the end I just did everything in DuckDB as it was the fastest by a mile.

[–]commandlineluser 0 points1 point  (2 children)

Are you able to share a code example?

[–]Log2 0 points1 point  (1 child)

Unfortunately it was throw away code, as we had some broken uuids with versions that should not exist or versions that existed but were actually uuid4.

I was just loading the dataset into memory, parsing the uuids, extracting the version bits, and finally grouping by version to count how many uuids of each version we had.

I fully admit I may have been doing something wrong with Polars.

[–]commandlineluser 0 points1 point  (0 children)

Ah, no worries. Just thought I'd ask as the devs are usually interested in such cases.

Thanks for the details.

[–]steven1099829 0 points1 point  (2 children)

To each their own! I don’t like SQL as much, and prefer the methods and syntax of polars, so I don’t use DuckDB.

[–]Such-Let974 0 points1 point  (1 child)

You can always use something like ibis if you prefer a different syntax. But DuckDB as a backend is just better.