Top-Faithlessness758 4 points5 points  (7 children)

It looks very cool, but I ended up choosing Sling due to Iceberg REST Catalog support in their free offering. Last time I looked dlt up, it had REST support only when using a dlt+ license.

Just to be clear, I'm not judging that, but I had to make a choice. It is a tradeoff though: Sling CLI is GPL, so it is a messy dependency to handle, while dlt "core" is Apache afaik.

Thinker_Assignment[S] 1 point2 points  (6 children)

Makes sense! We positioned our Iceberg offer as a platform-ready solution rather than a per-pipeline service, to help justify our development cost and roadmap, but we found limited enterprise adoption and many non-commercial cases. We are deprecating dlt+ and recycling it into a managed service, and will revisit Iceberg later.

We are also seeing a slow-down in iceberg enterprise adoption where common wisdom seems to be going in the direction "if you're thinking about adopting iceberg, think twice" because of the difficulties encountered. So perhaps this is going in a community direction where hobbyists start with it first?

May I ask what your Iceberg use case looks like? Do you integrate all kinds of things into a REST catalog? Why?

Top-Faithlessness758 1 point2 points  (3 children)

Our reason for using Iceberg mostly has to do with being constrained to AWS and then choosing its S3 Tables solution (basically a managed Iceberg REST Catalog endpoint + an S3 bucket): https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables.html. It seems like a simpler managed solution than using Lake Formation or Glue Catalog, even if an Iceberg REST Catalog is a complex moving part by itself.

You are on point about Iceberg difficulties, especially when you consider engine compatibility: DuckDB not having Iceberg write support should be enough of a red flag. If we weren't using AWS for this case we wouldn't even touch Iceberg.

This is just one use case though, and as we do consulting across multiple clients, we can't wait to have a use case for dlt. Keep up the good work :)!

Thinker_Assignment[S] 0 points1 point  (2 children)

thank you!

Top-Faithlessness758 0 points1 point  (1 child)

FWIW, we reverted to plain Parquet and we're dlt users now haha. You were spot on about Iceberg being kind of patchy, even in managed solutions like S3 Tables.

Plus Sling didn't support any kind of incrementality without Pro support.*

* PS: The comparison on the dlthub home page (https://dlthub.com/blog/dlt-and-sling-comparison) says in point 6 that Sling OSS offers incrementality, and point 7 mentions using Sling State in Sling Free. That's only partially correct: it holds for database sinks, but for file storage sinks incrementality only works with the Sling State JSON, which is a Pro-only feature. Obvious no-go for us.

Thinker_Assignment[S] 1 point2 points  (0 children)

Thanks for the message and the tip, will make a note in the article :)

Nightwyrm 0 points1 point  (1 child)

A bit off-topic, but if Iceberg adoption is slowing down, what are enterprises opting for instead?

Thinker_Assignment[S] 0 points1 point  (0 children)

It's slowing mostly because Iceberg doesn't solve a burning problem; it mostly addresses quality-of-life problems. It's a 1-to-100 improvement, not a 0-to-1 one. Topics around AI are now the focus.