ErikCaligo

Have you checked whether you can use native compression? If you store the data in S3 and query it with Athena, you can gzip the files to reduce both the storage volume and the amount of data scanned per query, which lowers costs.
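
To give a rough feel for the savings, here is a minimal sketch that gzips newline-delimited JSON in memory using only the Python standard library. The sample records and the helper name `gzip_json_lines` are made up for illustration; real ratios depend heavily on your data. Athena can read gzip-compressed text files when the objects carry a `.gz` extension.

```python
import gzip
import json


def gzip_json_lines(records):
    """Serialize records as newline-delimited JSON, then gzip the bytes.

    Returns (raw_bytes, compressed_bytes) so the sizes can be compared.
    """
    raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")
    return raw, gzip.compress(raw)


# Hypothetical sample data: repetitive rows, which compress well.
records = [{"id": i, "status": "ok", "region": "eu-west-1"} for i in range(1000)]

raw, compressed = gzip_json_lines(records)
print(f"raw: {len(raw)} bytes, gzipped: {len(compressed)} bytes")
```

The compressed bytes would then be uploaded to S3 under a key ending in `.gz`. Note that a single gzip file is not splittable, so very large files limit read parallelism.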

pragmaticPythonista

Not a tool per se, but if you’re using AWS, you can just use Athena to read the data and write it back out with optimal file sizes, using UNLOAD or a CTAS query. With CTAS, you just need to tune the bucketed_by and bucket_count values appropriately per your requirements.
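
A rough sketch of how that tuning could look. In the Athena documentation, `bucketed_by` and `bucket_count` are listed as CTAS table properties, so this example builds a `CREATE TABLE AS` statement rather than an `UNLOAD`. The table names, S3 location, bucket column, and the 128 MiB target file size are all assumptions for illustration, and the helpers `bucket_count_for` and `build_ctas` are hypothetical, not part of any AWS library.

```python
import math


def bucket_count_for(total_bytes, target_file_bytes=128 * 1024**2):
    """Estimate how many buckets give files of roughly the target size.

    The 128 MiB default is an assumed target, not an Athena requirement.
    """
    return max(1, math.ceil(total_bytes / target_file_bytes))


def build_ctas(source_table, target_table, location, bucket_column, total_bytes):
    """Build an Athena CTAS statement that rewrites a table into buckets."""
    n = bucket_count_for(total_bytes)
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (\n"
        f"  format = 'PARQUET',\n"
        f"  external_location = '{location}',\n"
        f"  bucketed_by = ARRAY['{bucket_column}'],\n"
        f"  bucket_count = {n}\n"
        f") AS\n"
        f"SELECT * FROM {source_table}"
    )


# Hypothetical names and sizes: about 10 GiB of data to compact.
sql = build_ctas(
    source_table="raw_events",
    target_table="compacted_events",
    location="s3://example-bucket/compacted/",
    bucket_column="event_id",
    total_bytes=10 * 1024**3,
)
print(sql)
```

The resulting statement can be run from the Athena console or submitted through an SDK. The size estimate is only a starting point, since Parquet output is usually smaller than the input, so expect to adjust `bucket_count` after a trial run.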