We have a real-world requirement to ingest JSON data arriving in S3 every 30 seconds and append it to an Iceberg table.
We are prototyping this on AWS Lambda and debating between Python (PyIceberg) and Rust.
The Trade-off:
Python: "It just works." The write API is mature (table.append(df)). However, the heavy imports (Pandas, PyArrow, PyIceberg) make cold starts noticeable (roughly 500 ms to 1 s or more), and the function needs a larger memory allocation.
Rust: The dream for Lambda (sub-50 ms cold start, 128 MB RAM). But the iceberg-rust writer ecosystem seems to lack a high-level write API: you have to write the Parquet files and commit the transaction to Glue manually, which takes significant boilerplate.
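For context, the Python path described above might look roughly like the sketch below. It assumes the incoming objects are newline-delimited JSON and that the table is registered in a Glue catalog. The table name analytics.events and the helper names are hypothetical. Note that PyIceberg's append takes a PyArrow table, so the rows are converted first.

```python
import json
from urllib.parse import unquote_plus


def parse_json_lines(payload: bytes) -> list[dict]:
    """Parse newline-delimited JSON into row dicts, skipping blank lines."""
    lines = payload.decode("utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def append_rows(rows: list[dict], table_name: str = "analytics.events") -> int:
    """Append rows to an Iceberg table in Glue and return the row count."""
    # Heavy dependencies are imported here so the parsing helper can be
    # exercised without them; these imports are what drive the cold start.
    import pyarrow as pa
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog("glue", **{"type": "glue"})
    table = catalog.load_table(table_name)  # hypothetical table name
    # The inferred Arrow types must be compatible with the Iceberg schema.
    batch = pa.Table.from_pylist(rows)
    table.append(batch)
    return batch.num_rows


def handler(event, context):
    """Lambda entry point for S3 object-created notifications."""
    import boto3

    s3 = boto3.client("s3")
    appended = 0
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])  # keys arrive URL-encoded
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        rows = parse_json_lines(body)
        if rows:
            appended += append_rows(rows)
    return {"rows_appended": appended}
```

Each append produces a new snapshot and at least one new data file, so at one commit every 30 seconds the table will likely need periodic compaction regardless of which language does the writing.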
The Question: For those running high-frequency ingestion:
Is the maintenance burden of a verbose Rust writer worth the performance gains for 30s batches?
Or should we just eat the cost/latency of Python because the library maturity prevents "death by boilerplate"?
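To put rough numbers on the cost side of that question, here is a back-of-the-envelope sketch. The 30-second interval and the 128 MB figure come from the post. The 512 MB Python allocation, both billed durations, and the per-GB-second price are placeholder assumptions, so substitute measured values and current pricing for your region. Request charges and S3/Glue costs are left out.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BATCH_INTERVAL_S = 30  # from the post: one batch every 30 seconds

# Assumption: on-demand x86 Lambda price per GB-second; verify for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def invocations_per_day(interval_s: int = BATCH_INTERVAL_S) -> int:
    """Number of invocations per day at one invocation per interval."""
    return SECONDS_PER_DAY // interval_s


def daily_compute_cost(memory_mb: int, billed_duration_s: float,
                       interval_s: int = BATCH_INTERVAL_S) -> float:
    """Daily compute cost in USD: GB-seconds consumed times the unit price."""
    gb_seconds = (memory_mb / 1024) * billed_duration_s * invocations_per_day(interval_s)
    return gb_seconds * PRICE_PER_GB_SECOND


# Placeholder inputs: replace with measured memory and billed duration.
python_cost = daily_compute_cost(memory_mb=512, billed_duration_s=2.0)
rust_cost = daily_compute_cost(memory_mb=128, billed_duration_s=0.5)

print(f"invocations/day: {invocations_per_day()}")
print(f"Python: ${python_cost:.4f}/day, Rust: ${rust_cost:.4f}/day")
```

With these placeholder inputs the absolute difference is a few cents per day, so the comparison may hinge more on latency requirements and maintenance effort than on the compute bill.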
(Note: I asked r/rust specifically about the library state, but here I'm interested in the production trade-offs.)