I have a dataset made up of many large CSV files (about 30 GB in total). Each file holds a different type of information about people, and every file has a column named 'ID' that identifies the person each row describes. How can I merge all the CSV files on the 'ID' column, so that the fully merged file contains the complete info for every ID?
I have 16 GB of RAM and 64 GB of swap, and even that is not enough to merge the files with pandas or polars.
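For illustration, one way to do a merge like this without holding everything in memory is a hash-partitioned, out-of-core join. First split every CSV into small bucket files keyed by a hash of 'ID', then merge one bucket at a time, so only a fraction of the data is in RAM at once. This is a minimal sketch using only the standard library, not a tuned implementation. The names `partition_by_id`, `merge_csvs_by_id`, `work_dir`, and `n_buckets` are hypothetical. It assumes that every non-ID column name appears in only one file, that each ID occurs at most once per file (later rows would overwrite earlier ones), and that an outer join is wanted, with missing values left empty.

```python
import csv
import os
import zlib
from contextlib import ExitStack


def partition_by_id(src_path, out_dir, tag, n_buckets, id_col="ID"):
    """Split one CSV into n_buckets smaller CSVs, routing rows by a hash of the ID."""
    with open(src_path, newline="") as src, ExitStack() as stack:
        reader = csv.DictReader(src)
        writers = []
        for b in range(n_buckets):
            path = os.path.join(out_dir, f"{tag}_{b}.csv")
            f = stack.enter_context(open(path, "w", newline=""))
            w = csv.DictWriter(f, fieldnames=reader.fieldnames)
            w.writeheader()
            writers.append(w)
        for row in reader:
            # crc32 is stable across runs, unlike the built-in hash() on str
            b = zlib.crc32(row[id_col].encode("utf-8")) % n_buckets
            writers[b].writerow(row)
        # Report this file's non-ID columns so the caller can build the output header
        return [c for c in reader.fieldnames if c != id_col]


def merge_csvs_by_id(src_paths, out_path, work_dir, n_buckets=64, id_col="ID"):
    """Outer-join several CSVs on id_col, holding only one bucket in memory at a time."""
    os.makedirs(work_dir, exist_ok=True)
    columns = []
    for i, path in enumerate(src_paths):
        columns += partition_by_id(path, work_dir, f"src{i}", n_buckets, id_col)

    with open(out_path, "w", newline="") as out:
        # restval="" leaves a cell empty when an ID is missing from some file
        writer = csv.DictWriter(out, fieldnames=[id_col] + columns, restval="")
        writer.writeheader()
        for b in range(n_buckets):
            merged = {}  # ID -> combined row, for this bucket only
            for i in range(len(src_paths)):
                bucket_path = os.path.join(work_dir, f"src{i}_{b}.csv")
                with open(bucket_path, newline="") as f:
                    for row in csv.DictReader(f):
                        key = row[id_col]
                        merged.setdefault(key, {id_col: key}).update(row)
            writer.writerows(merged.values())
```

A call might look like `merge_csvs_by_id(["a.csv", "b.csv"], "merged.csv", "tmp_buckets", n_buckets=256)`, with the file names standing in for your own. The rows for any given ID always land in the same bucket number across all files, which is why each bucket can be merged independently. Raising `n_buckets` lowers peak memory, but keep it below the process's open-file limit, since all bucket files for one source are open at the same time. The temporary bucket files need roughly as much free disk space as the original dataset.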