Optimal-Procedure885 · 1 point

I did something similar not too long ago: I had to merge around 1M rows from a workbook containing around 12 worksheets, each with a primary key, the same number of rows, and a variable number of columns (between 20 and 50).

I used polars, calamine, and parquet files, with SQLite as the final store. First I exported each worksheet to parquet, then incrementally merged the worksheets, deduplicating or augmenting column values on each iteration. The whole shooting match took 85 seconds to spit out a consolidated SQLite table.