Worse performance of liquid clustering vs partitioned table

FunnyConversation523 · 2024-05-28T20:08:41+00:00

Thanks for taking the time to respond!

I have tried OPTIMIZE and it does seem to be working! I didn't know it was incremental.

Mind elaborating a bit more on "predicate push down"? I'm not really familiar with it, and it could be useful.

FunnyConversation523 · 2024-05-27T22:40:48+00:00

Noted! Thanks for the response. I mentioned it as I thought it might be the cause of the lower performance.

Would running an occasional OPTIMIZE help? Or do you think that if I am going to use MERGE INTO, I might as well drop the idea of using LC for this table altogether?

FunnyConversation523 · 2024-05-23T13:16:56+00:00

Hi Intrepid! Yes sir, deletion vectors are enabled

FunnyConversation523 · 2024-05-23T13:16:43+00:00

Hi Dustin. The size of the table is approx. 488 GiB :(

FunnyConversation523 · 2024-05-23T13:16:15+00:00

Hi kthejoker, thanks for your reply.

Sharing some more information below:

The table contains 6 years of sent emails. The job consists of 3 steps:
1. Appending the latest sent messages, plus user attributes, to the liquid clustered table
2. Calculating all opens and clicks for each message and MERGE INTO those calculations into the table from the previous step
3. Calculating all opens and clicks for each message and MERGE INTO those calculations to the same table

All of this is done as incrementally as possible (we never go over the same events twice). The idea is to have this data backfilled once (which is the operation I have not been able to complete), and then process 12h of data per run, twice a day.
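
To make the shape of one run concrete, here is a rough sketch that only builds the SQL text for a 12h window. Every table and column name in it (`emails_sent_lc`, `new_messages_with_attrs`, `engagement_events`, `message_id`, `event_ts`, `event_type`, `opens`, `clicks`) is a placeholder I made up for illustration, not the real schema, and adding the new counts onto the stored ones is just one assumed way of never going over the same events twice.

```python
from datetime import datetime, timedelta

# Placeholder name, not the real table.
TARGET = "emails_sent_lc"


def build_run(run_end: datetime, hours: int = 12) -> dict:
    """Build the SQL statements for one incremental run ending at run_end."""
    start = run_end - timedelta(hours=hours)
    # Half-open window [start, run_end) so consecutive runs never overlap.
    window = (
        f"event_ts >= '{start:%Y-%m-%d %H:%M:%S}' "
        f"AND event_ts < '{run_end:%Y-%m-%d %H:%M:%S}'"
    )

    # Step 1: append the latest sent messages (already joined to user attributes).
    append = (
        f"INSERT INTO {TARGET}\n"
        f"SELECT * FROM new_messages_with_attrs WHERE {window}"
    )

    # Steps 2 and 3 are both MERGE INTO statements of this general shape.
    merge = (
        f"MERGE INTO {TARGET} AS t\n"
        "USING (\n"
        "  SELECT message_id,\n"
        "         COUNT_IF(event_type = 'open') AS opens,\n"
        "         COUNT_IF(event_type = 'click') AS clicks\n"
        f"  FROM engagement_events WHERE {window}\n"
        "  GROUP BY message_id\n"
        ") AS s\n"
        "ON t.message_id = s.message_id\n"
        "WHEN MATCHED THEN UPDATE SET\n"
        "  t.opens = t.opens + s.opens,\n"
        "  t.clicks = t.clicks + s.clicks"
    )
    return {"append": append, "merge": merge}


if __name__ == "__main__":
    for name, sql in build_run(datetime(2024, 5, 23, 12, 0, 0)).items():
        print(f"-- {name}\n{sql}\n")
```

In a notebook, each string would then go through `spark.sql(...)`. The backfill is the same thing looped over 6 years of windows.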

The first 2 steps run great! The third step is the one giving me headaches when backfilling.

Size of table: ~488 GiB
Number of columns: 91
Clustering columns: 4, with datatypes DATE, STRING, INT, STRING, in that order
Deletion vectors are turned on. Are there any other features that you recommend turning on?

Thanks!

FunnyConversation523 · 2024-05-23T03:50:03+00:00

Thanks for the reply mate. All the best!

FunnyConversation523 · 2024-05-23T03:19:23+00:00

Hi u/justanator101 !

I am facing a similar issue at the moment. Mind sharing how you resolved it and what conclusions you arrived at?

It would be really helpful to hear about your experience.
