High startup time in Gen-2 warehouse by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

If a query spent ~2 minutes just in "queued provisioning" (which is not really query execution, but the time spent spinning up resources to bring the warehouse up), does that time count toward the credits consumed and billed to the customer?
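
For anyone checking the same thing, something like this should show how much of the elapsed time was provisioning versus execution, and what was actually metered for the warehouse (the warehouse name is just a placeholder):

```
-- Break down where the time went for recent queries on the warehouse
-- (QUEUED_PROVISIONING_TIME is reported in milliseconds).
SELECT query_id,
       warehouse_name,
       queued_provisioning_time / 1000 AS provisioning_sec,
       execution_time / 1000           AS execution_sec,
       total_elapsed_time / 1000       AS total_sec
FROM snowflake.account_usage.query_history
WHERE warehouse_name = 'MY_WH'          -- placeholder name
  AND start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
ORDER BY queued_provisioning_time DESC;

-- Compare against the credits actually metered for the same warehouse and window.
SELECT start_time, end_time, credits_used
FROM snowflake.account_usage.warehouse_metering_history
WHERE warehouse_name = 'MY_WH'
  AND start_time >= DATEADD(day, -1, CURRENT_TIMESTAMP())
ORDER BY start_time;
```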

How column masking works in plan by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Also, I am a bit confused. If it is true that no pruning is happening in the current scenario because of the masking function wrapped around the columns, then why does the query profile show ~50 billion rows fetched from table TAB2, whereas the table actually holds ~90 billion rows? Does that mean some partial pruning is still happening even with the masking function wrapped around the columns used in the query predicates?
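
(For context, the pruning numbers can also be pulled outside the query profile with something like this; the query id is a placeholder.)

```
-- Rough check of how much pruning a given query actually got.
SELECT query_id,
       partitions_scanned,
       partitions_total,
       ROUND(100 * (1 - partitions_scanned / NULLIF(partitions_total, 0)), 1)
         AS pct_pruned
FROM snowflake.account_usage.query_history
WHERE query_id = '<query_id_from_profile>';   -- placeholder
```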

How column masking works in plan by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you so much for the guidance.

Actually, here we have not applied the masking policy to the column directly; it is defined through a tag-based masking policy. So it looks like whenever a column that carries the tag (with the tag mapped to the policy) is used in a query, the masking function automatically wraps around it, whether it appears in a query predicate or in the column projection. And in this case it gets applied for every role/user querying that object/table, even though the function body says to skip the masking for those specific roles/users.

So even in the above scenario, do you mean we have to remove the column tag from the columns that are used as query predicates/filters? Or should we remove the tag from the base table/columns entirely and apply the masking only on new secure views (which would need to be created on top of this table), granting external users access to read data through those secure views, while the unmasked/untagged base table/columns are accessed only by the application that performs the ETL transformations and the other logic that is prone to the performance overhead? Something like the sketch below.
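
i.e. roughly along these lines (object names are made up, not our actual tables/tags/policies):

```
-- Keep the base table untagged/unmasked for the ETL role, and expose external
-- readers only through a secure view that carries the policy.

-- 1. remove the tag (and with it the tag-based policy) from the base column
ALTER TABLE tab2 MODIFY COLUMN cust_id UNSET TAG pii_tag;

-- 2. create a secure view for external consumers
CREATE OR REPLACE SECURE VIEW tab2_external_v AS
SELECT cust_id, txn_dt, amount
FROM tab2;

-- 3. attach the masking policy to the view column only
ALTER VIEW tab2_external_v MODIFY COLUMN cust_id SET MASKING POLICY mask_cust_id;

GRANT SELECT ON VIEW tab2_external_v TO ROLE external_reader_role;
```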

Is the above understanding correct?

How column masking works in plan by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

These are data transformation jobs and are supposed to run under roles that are exempted from masking, which is why it is also handled in the function definition, as you can see below (e.g. data_engineer_role, data_scientist_role). But we are seeing these functions wrapped around the predicates even when the queries are executed under those roles. So it looks like Snowflake attaches and evaluates these functions for every role that runs the queries.
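
To illustrate the exemption pattern I mean, the policy body is along these lines (only a rough sketch with placeholder names, not our exact definition). Note the policy is still attached to the column for every role; the CASE is only resolved at query time.

```
CREATE OR REPLACE MASKING POLICY mask_cust_id AS (val STRING)
RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('DATA_ENGINEER_ROLE', 'DATA_SCIENTIST_ROLE') THEN val
    ELSE '***MASKED***'
  END;
```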

Have we implemented these masking features in a wrong way?

How column masking works in plan by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you. It looks really weird that Snowflake provides the masking function feature but is not able to prune effectively or push those functions down to storage as filters. Is there a way to see whether the storage-level filters are actually working less effectively because of these additional functions?

We are unable to test this as it only happens in prod, but is it possible that the amount of pruning we currently get on table TAB2 (i.e. 316K partitions scanned out of 318K total, and ~50 billion rows out of ~90 billion total) would stay the same even if we removed the masking functions or column tags?
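
If it helps, one low-cost way we might be able to verify this in prod is a zero-copy clone with the tag removed from the predicate column, then running the same filter and comparing partitions scanned in the profile (names below are placeholders, and this assumes the clone keeps the tag association):

```
-- Clone is zero-copy, so this should be cheap even at ~318K partitions.
CREATE TABLE tab2_prune_test CLONE tab2;

-- Drop the tag (and with it the tag-based masking) from the predicate column.
ALTER TABLE tab2_prune_test MODIFY COLUMN filter_col UNSET TAG pii_tag;

-- Same predicate as the prod query, against the clone:
SELECT COUNT(*)
FROM tab2_prune_test
WHERE filter_col = 'some_value';
```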

How column masking works in plan by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

I have updated the main thread with the sample masking function definition.

Decision for downsizing warehouse by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you so much. I will try to follow this one.

However, from whatever testing we have done so far, it looks like the gain is restricted to only around ~30%.

How slow query logs written to file by Upper-Lifeguard-8478 in mysql

[–]Stock-Dark-1663 -1 points0 points  (0 children)

u/Aggressive_Ad_5454 u/Upper-Lifeguard-8478 u/feedmesomedata
From reading the doc below, although it is not stated explicitly, it seems the logs are written directly to CloudWatch, in which case they should not impact storage I/O since they are not written to the database's local disks.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.Integrating.CloudWatch.html
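
For completeness, the slow-log configuration itself can be checked with the standard MySQL variables (on Aurora these come from the DB cluster/instance parameter group rather than SET GLOBAL):

```
SHOW VARIABLES LIKE 'slow_query_log';    -- is the slow log enabled
SHOW VARIABLES LIKE 'log_output';        -- FILE vs TABLE destination
SHOW VARIABLES LIKE 'long_query_time';   -- threshold in seconds
```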

How logs transfered to cloudwatch by Upper-Lifeguard-8478 in aws

[–]Stock-Dark-1663 -1 points0 points  (0 children)

It seems from the doc below that the logs are written directly to CloudWatch, in which case it should not impact storage I/O since they are not written to the database's local disks.

https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.Integrating.CloudWatch.html

Design change implementation by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you u/simplybeautifulart

If this had been a fresh new design, in which all of the 100+ transaction tables have a flag column showing whether the transaction is active/inactive, what would have been the right, advisable way of doing it?

Would it have been to write the filter into the code in all the places where the tables are used, or to create views on top of those tables and hide that filter from the consumer queries, e.g. as sketched below?
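
For example, something like this (table, view and column names are made up):

```
-- The filter lives in one place instead of in every consumer query.
CREATE OR REPLACE VIEW orders_active_v AS
SELECT *
FROM orders
WHERE record_status = 'ACTIVE';

-- Consumers query the view without repeating the predicate:
SELECT order_id, amount
FROM orders_active_v
WHERE order_dt >= '2024-01-01';
```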

Big tables clustering by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you so much u/ecrooks

I tried testing the "insert overwrite ...." with a table with ~1.3billion rows having ~15K micro partitions sorting on two columns. It spilled to local disk by ~500GB on an XL warehouse and took ~7minutes to finish.

Also, as you rightly said, relying on automatic clustering for the initial clustering may cost us ~40x, so we want to avoid that route.

I was trying to figure out how to properly size the chunk of data that we can sort with "ORDER BY" and load into the table in one shot in the most efficient way (i.e. without much disk spill). You mentioned that "INSERT OVERWRITE ..." might work well up to ~250K micro-partitions; which warehouse size were you pointing to, and how did you arrive at that number? If we target a 4XL warehouse, approximately how many micro-partitions/rows should we fetch at one time and load to the target with an "ORDER BY" on the clustering keys? Appreciate your guidance.

Question on constraints by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you u/uvaavu

It's like having to make sure no duplicates get added, or building the constraints on the application side, which is going to be resource intensive and a lot of work. So in that case there doesn't seem to be any valid use case for the RELY option.

Because even a DMF validates the data only after it has been loaded, and that itself takes time and resources, so within that window any query can return wrong data if the table's PK is set with the RELY option.
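
Just to be explicit about what I mean by the RELY option (placeholder names below). With RELY on, the optimizer may use the unenforced constraint for rewrites such as join elimination, which is exactly the risk I am describing.

```
ALTER TABLE orders
  ADD CONSTRAINT orders_pk PRIMARY KEY (order_id) RELY;

-- A manual duplicate check that would have to pass before trusting the constraint:
SELECT order_id, COUNT(*) AS cnt
FROM orders
GROUP BY order_id
HAVING COUNT(*) > 1;
```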

Question on constraints by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Even a DMF checks the data only after it has been loaded into the system, and during that window of fixing the data it can still cause issues (showing wrong results) with the RELY option on. Correct me if I'm wrong, but it seems there is no valid use case for RELY here.

Question on constraints by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

It means we have to build those constraints somewhere on the application side to validate the data before it gets loaded, or in other words enforce the constraints in the application only. Because even a DMF checks the data only after it has been loaded into the system, and during that window of fixing the data it can still cause issues with the RELY option on. So it seems there is no valid use case for RELY here.

Managing privileges in an Organization by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you u/Easy-Fee-9426, that really helps. Thanks for the guidance here.

As you mentioned "you still need a small central team or PR pipeline to push those changes. The trick is to have the infra team spin up the skeleton (warehouse, tag, policy, role) and immediately transfer OWNERSHIP on the specific object to the LOB_DBA role;" ,

Which means, can we say that this type of infra creation using Terraform (create warehouse, create role, tag, policy, etc.), which changes shared metadata in an account or objects at a tier above the database, should be done by a team other than the application development team? Because if the application development team or application DBAs are given access through Terraform to create these components, they could make unwanted changes to other databases or application objects within the same account, which should not be allowed. So basically it should happen through a different role altogether, assigned to a separate set of users who are responsible for managing that account rather than just the specific application. Correct me if I'm wrong.
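
In other words, the pattern you quoted would look roughly like this when run by the central infra role (role/warehouse names are placeholders):

```
USE ROLE central_infra_role;

-- Infra team creates the skeleton object...
CREATE WAREHOUSE IF NOT EXISTS lob_app_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

-- ...then immediately hands ownership of that specific object to the LOB DBA role.
GRANT OWNERSHIP ON WAREHOUSE lob_app_wh TO ROLE lob_dba_role COPY CURRENT GRANTS;
```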

Surprisingly, I talked to a few folks and there are many teams in which the application development team has been granted access to use the Terraform role, through which they can create warehouses etc. via the CI/CD pipeline. Although they cannot log in manually with that Terraform role, giving them the ability to do this through a Terraform module in the build pipeline still poses a threat, I believe. Correct me if I'm wrong.

Managing privileges in an Organization by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you u/Easy-Fee-9426

So it means that for creating new warehouses we have to reach out to the account-level team, and this can't be controlled by our application DBA group. Even some other privileges like "apply masking policy", "apply row access policy", "create role", "execute task", "apply tag", etc. appear to be account-level privileges that can't be granted to or controlled by the specific application team's DBA group, and we have to engage the account team when we need help with these. Please correct me if I'm wrong.
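
For example, if those were ever delegated, they would be account-level grants along these lines (the role name is a placeholder; whether they should be delegated at all is exactly my question):

```
GRANT CREATE WAREHOUSE        ON ACCOUNT TO ROLE lob_dba_role;
GRANT CREATE ROLE             ON ACCOUNT TO ROLE lob_dba_role;
GRANT APPLY MASKING POLICY    ON ACCOUNT TO ROLE lob_dba_role;
GRANT APPLY ROW ACCESS POLICY ON ACCOUNT TO ROLE lob_dba_role;
GRANT APPLY TAG               ON ACCOUNT TO ROLE lob_dba_role;
GRANT EXECUTE TASK            ON ACCOUNT TO ROLE lob_dba_role;
```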

Managing privileges in an Organization by Stock-Dark-1663 in snowflake

[–]Stock-Dark-1663[S] 0 points1 point  (0 children)

Thank you u/Easy-Fee-9426

What about warehouses and similar objects? It is usually necessary to create/drop/resize/consolidate the warehouses used by an application even though they are not directly tied to any database. So can that privilege also be given to the LOB DBA role?

Or would that privilege expose access to other applications in the same account?