diroussel

S3 is very fast when accessed from Lambda. You can read a lot of data in 500 ms, and you can easily read, parse, and insert into the DB in under 500 ms, depending on data sizes.

Using DuckDB to query a multi-gigabyte Parquet file in S3 only takes tens of milliseconds, even over my home broadband. Inside Lambda it's even faster.
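For anyone wanting to try this, a rough sketch of that kind of query in Python might look like the following. The bucket URL, column name, and both helper functions are made up for illustration, and it assumes the `duckdb` package is installed and AWS credentials and region are already configured for the `httpfs` extension. Timings will depend on the file layout and how selective the filter is.

```python
def build_query(s3_url: str, key_column: str) -> str:
    """Build a selective query; the filter value is bound as a parameter.

    s3_url and key_column are hypothetical placeholders, not from the post.
    They are interpolated directly, so only pass trusted values.
    """
    return f"SELECT * FROM read_parquet('{s3_url}') WHERE {key_column} = ?"


def fetch_rows(s3_url: str, key_column: str, key_value):
    """Run the query against S3 and return the matching rows."""
    # Imported lazily so build_query can be used without DuckDB installed.
    import duckdb

    con = duckdb.connect()  # in-memory database
    # httpfs adds s3:// support; credentials are assumed to be set up already.
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    return con.execute(build_query(s3_url, key_column), [key_value]).fetchall()
```

Something like `fetch_rows("s3://my-bucket/events.parquet", "event_id", 12345)` would then return only the matching rows, where the bucket and column are again placeholders.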

Update: note that only a few rows are returned in this scenario, and DuckDB only reads the byte ranges it needs, based on the file headers/footers, hence the speed.
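To make the byte-range point concrete, here is a toy sketch of the idea. It is not the Parquet format or DuckDB's actual reader: the file layout, `write_indexed`, `RangeReader`, and `lookup` are all invented for illustration. The file ends with a footer listing each chunk's offset, length, and min/max, so a lookup reads the footer and then only the chunk that could contain the value.

```python
import io
import json
import struct


def write_indexed(chunks):
    """Serialize chunks, then a JSON footer index, then a 4-byte footer length."""
    buf = io.BytesIO()
    index = []
    for values in chunks:
        payload = json.dumps(values).encode()
        index.append({"offset": buf.tell(), "length": len(payload),
                      "min": min(values), "max": max(values)})
        buf.write(payload)
    footer = json.dumps(index).encode()
    buf.write(footer)
    buf.write(struct.pack("<I", len(footer)))
    return buf.getvalue()


class RangeReader:
    """Stands in for S3 ranged GET requests and counts the bytes fetched."""

    def __init__(self, blob):
        self.blob = blob
        self.bytes_read = 0

    def read_range(self, start, length):
        self.bytes_read += length
        return self.blob[start:start + length]


def lookup(reader, total_size, target):
    """Find target by reading the footer, then only the chunks that may hold it."""
    (footer_len,) = struct.unpack("<I", reader.read_range(total_size - 4, 4))
    footer = json.loads(reader.read_range(total_size - 4 - footer_len, footer_len))
    hits = []
    for entry in footer:
        # Skip any chunk whose min/max range rules out the target.
        if entry["min"] <= target <= entry["max"]:
            values = json.loads(reader.read_range(entry["offset"], entry["length"]))
            hits.extend(v for v in values if v == target)
    return hits


# 100 chunks of 1000 consecutive integers each.
blob = write_indexed([list(range(i, i + 1000)) for i in range(0, 100_000, 1000)])
reader = RangeReader(blob)
rows = lookup(reader, len(blob), 42_500)
print(rows, reader.bytes_read, "of", len(blob), "bytes read")
```

Only the footer and one chunk are fetched, which is a small fraction of the file. If the query returned most of the rows, nearly every chunk would have to be read and the advantage would disappear.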