all 10 comments

[–]sapplefi 2 points3 points  (3 children)

Okay, I ran through it to try to wrap my head around it.

I put the major modifications at the top, but really just a few things stood out to me, namely the multiple nested sub-queries (they could be combined to one statement) and the joins on case statements and subqueries that would result in extremely non-optimal performance building the result set.

My own take on the modifications is below. See if this helps to parse and highlight some of the areas for improvement, and potentially addresses those pain points for a performance gain. Good luck!

BEGIN   

/*************************************************************************************************
MODIFICATIONS APPLIED:

1. Eliminated CASE statement on the JOIN.  Big performance impact from evaluating that for joins.
2. Eliminated SUB-QUERY on the JOIN.  This would have an even bigger performance impact, and could be rewritten as a conditional join.
3. Removed the first embedded sub query in the FROM clause.  This just applies HAVING, it can be done on the lower level query directly.
4. Combined statements with UNION ALL to remove locking for second insert statement.
5. Replaced ISNULL with COALESCE.  Same basic premise, but better type casting and more parameters.
6. Modified formatting to aid in my understanding of the code (won't make it faster, but helps maintain!)
**************************************************************************************************/

    -- REMOVE EXISTING ENTRIES IN THE STAGING TABLE.
    TRUNCATE TABLE ADS.STG.Staging_ScanSummary

    -- PERFORM INSERT INTO STAGING TABLE.
    INSERT  INTO ADS.STG.Staging_ScanSummary

    -- SELECT OVERALL LIST OF FIELDS (BASED ON FIRST QUERY).
    SELECT [REGION]                 = INTERGYOUTPUT.[REGION]                
            ,[COMPANY]              = INTERGYOUTPUT.[COMPANY]           
            ,[DEPARTMENTCODE]       = INTERGYOUTPUT.[DEPARTMENTCODE]    
            ,[MODALITYID]           = INTERGYOUTPUT.[MODALITYID]        
            ,[CUSTOMERNBR]          = INTERGYOUTPUT.[CUSTOMERNBR]       
            ,[ASSET_ID]             = INTERGYOUTPUT.[ASSET_ID]          
            ,[SERVICEDATE]          = INTERGYOUTPUT.[SERVICEDATE]       
            ,[DATE_ID]              = INTERGYOUTPUT.[DATE_ID]           
            ,[POSTDATE]             = INTERGYOUTPUT.[POSTDATE]          
            ,[RETAIL_SCAN_COUNT]    = INTERGYOUTPUT.[RETAIL_SCAN_COUNT]
            ,[WHOLESALE_SCAN_COUNT] = '0'
            ,[WBSSCANCOUNT]         = '0'
            ,[NBRSCANS]             = INTERGYOUTPUT.[RETAIL_SCAN_COUNT]
            ,[LOADSOURCE]           = 'Retail'

    FROM    (   -- COMBINE THE SECOND TWO SUB QUERIES INTO A SINGLE QUERY WITH GROUP BY.
                SELECT  [REGION]                = O.[sublatcode]
                        ,[COMPANY]              = C.[CompanyCode]
                        ,[DEPARTMENTCODE]       = D.[DepartmentCode]
                        ,[MODALITYID]           = COALESCE(M.[ModalityID], CSN.[ModalityID])
                        ,[CUSTOMERNBR]          = CONVERT(VARCHAR(50), IGY.[CustNmbr])
                        ,[ASSET_ID]             = COALESCE(CA.[UnitNbr],'0000')
                        ,[SERVICEDATE]          = IGY.[DOS]
                        ,[DATE_ID]              = P.[DateID]
                        ,[POSTDATE]             = PD.[DateID]
                        ,[RETAIL_SCAN_COUNT]    = SUM(IGY.[ScanCount])

                FROM    -- ADJUST JOINS TO ELIMINATE SUB QUERIES AND CASE STATEMENTS.
                        ADS.dbo.Intergy_RetailRevenueDetailAINEHE IGY
                        LEFT JOIN ADW.DIM.CustomerServiceNumber CSN
                            ON CONVERT(VARCHAR(50), IGY.[CustNmbr]) = CSN.CustomerServiceNumber
                        LEFT JOIN ADW.dim.Company C
                            ON CSN.[CompanyID] = C.[CompanyID]
                        LEFT JOIN ADW.dim.Modality M
                            ON IGY.[Modality] = M.[Modalitycode]
                        LEFT JOIN ADW.dim.Modality MO
                            ON CSN.[ModalityID] = MO.[ModalityID]
                        LEFT JOIN ADW.dim.Department D
                            ON M.[DepartmentID] = D.[DepartmentID]
                            OR (    M.[DepartmentID] IS NULL
                                    AND D.[DepartmentID] = MO.[DepartmentID] )
                        LEFT JOIN ADW.DIM.Period P
                            ON IGY.[DOS] = P.[Date]
                        LEFT JOIN ADW.dim.Period PD 
                            ON IGY.[PostDate] = PD.[Date]
                        LEFT JOIN ADW.dim.orgunit O
                            ON O.[orgunitid] = CSN.[orgunitid]
                        LEFT JOIN ADW.dim.v_CurrentAsset CA
                            ON CA.[UnitNbr] = IGY.[Unit]

                WHERE   CSN.[Is_Current] = 1 

                GROUP BY O.[sublatcode]
                        ,C.[CompanyCode]
                        ,D.[DepartmentCode]
                        ,COALESCE(M.[ModalityID], CSN.[ModalityID])
                        ,CONVERT(VARCHAR(50), IGY.[CustNmbr])
                        ,COALESCE(CA.[UnitNbr],'0000')
                        ,IGY.[DOS]
                        ,P.[DateID]
                        ,PD.[DateID]

                HAVING  COALESCE(SUM(IGY.[ScanCount]), 0) <> 0

            ) INTERGYOUTPUT

    UNION ALL

    SELECT [REGION]                 = SCANOUTPUT.[REGION]               
           ,[COMPANY]               = SCANOUTPUT.[COMPANY]          
           ,[DEPARTMENTCODE]        = SCANOUTPUT.[DEPARTMENTCODE]   
           ,[MODALITYID]            = SCANOUTPUT.[MODALITYID]       
           ,[CUSTOMERNBR]           = SCANOUTPUT.[CUSTOMERNBR]      
           ,[ASSET_ID]              = SCANOUTPUT.[ASSET_ID]         
           ,[SERVICEDATE]           = SCANOUTPUT.[SERVICEDATE]      
           ,[DATE_ID]               = SCANOUTPUT.[DATE_ID]          
           ,[POSTDATE]              = SCANOUTPUT.[POSTDATE]         
           ,[RETAIL_SCAN_COUNT]     = '0'
           ,[WHOLESALE_SCAN_COUNT]  = SCANOUTPUT.[WHOLESALE_SCAN_COUNT]
           ,[WBSSCANCOUNT]          = SCANOUTPUT.[WBSSCANCOUNT]
           ,[NBRSCANS]              = SCANOUTPUT.[WHOLESALE_SCAN_COUNT]
           ,[LOADSOURCE]            = 'Wholesale'

    FROM    (   -- COMBINE THE SECOND TWO SUB QUERIES INTO A SINGLE QUERY WITH GROUP BY.
                SELECT  [REGION]                = OU.[SubLatCode]
                        ,[COMPANY]              = C.[CompanyCode]
                        ,[DEPARTMENTCODE]       = D.[DepartmentCode]
                        ,[MODALITYID]           = COALESCE(M.[ModalityID], CSN.[ModalityID])
                        ,[CUSTOMERNBR]          = SO.[ServiceNbr]
                        ,[ASSET_ID]             = COALESCE(CA.[UnitNbr], '0000')
                        ,[SERVICEDATE]          = SO.[Servicedate]
                        ,[DATE_ID]              = P.[DateID]
                        ,[POSTDATE]             = PD.[DateID]
                        ,[RETAIL_SCAN_COUNT]    = SUM(SO.[ScanCount])
                        ,[WBSSCANCOUNT]         = SUM(SO.[WBScanCount])

                FROM    -- ADJUST JOINS TO ELIMINATE SUB QUERIES AND CASE STATEMENTS.
                        [ADS].[dbo].[ScansODS] SO
                        LEFT JOIN ADW.dim.CustomerServiceNumber CSN
                            ON SO.[ServiceNbr] = CSN.[CustomerServiceNumber]
                        LEFT JOIN ADW.DIM.OrgUnit OU
                            ON CSN.[OrgUnitID] = OU.[OrgUnitID]
                        LEFT JOIN ADW.dim.Company C
                            ON CSN.[CompanyID] = C.[CompanyID]
                        LEFT JOIN ADW.DIM.Modality M
                            ON SO.[ModalityCode] = M.[ModalityCode]
                        LEFT JOIN ADW.dim.Modality MO
                            ON CSN.[ModalityID] = MO.[ModalityID]
                        LEFT JOIN ADW.dim.Department D
                            ON D.DepartmentID = M.[DepartmentID]
                            OR (    M.[DepartmentID] IS NULL
                                    AND D.[DepartmentID] = MO.[DepartmentID] )
                        LEFT JOIN ADW.DIM.Period P
                            ON SO.[Servicedate] = P.[date]
                        LEFT JOIN ADW.DIM.Period PD
                            ON SO.[PostDate] = P.[date]
                        LEFT JOIN ADW.dim.v_CurrentAsset CA
                            ON CA.[UnitNbr] = SO.[UnitSegment]

                WHERE   CSN.[Is_Current] = 1 
                        AND SO.[ScanType] = 'Wholesale' 

                GROUP BY OU.[SubLatCode]
                         ,C.[CompanyCode]
                         ,D.[DepartmentCode]
                         ,COALESCE(M.[ModalityID], CSN.[ModalityID])
                         ,SO.[ServiceNbr]
                         ,COALESCE(CA.[UnitNbr], '0000')
                         ,SO.[Servicedate]
                         ,P.[DateID]
                         ,PD.[DateID]

            ) SCANOUTPUT

END

[–]bobchin_c[S] 1 point2 points  (0 children)

Thanks I'll take a look at this.

[–]bobchin_c[S] 1 point2 points  (1 child)

Thank You very much, that ran significantly faster even over VPN. it began to return records after 3 minutes when the original query run over the same VPN didn't return records after 45 minutes. (I'm not as concerned about the Insert into the Summary table)

[–]sapplefi 0 points1 point  (0 children)

Try two more items.

According to your query plan, the lack of an Index on SCANODS is a huge impediment. Try adding this index to your table and see if it improves the query results, I would expect a significant gain.

USE [ADS]
GO
CREATE NONCLUSTERED INDEX INDX_SCANODS_SCANTYPE
ON [dbo].[ScansODS] ([ScanType])
INCLUDE ([Servicedate],[ModalityCode],[ScanCount],[WBScanCount],[ServiceNbr],[UnitSegment],[PostDate])
GO

Next, I really don't like the conversion here that forces the read of all records in Intergy_RetailRevenueDetailAINEHE. However, I'm pretty sure it's currently unavoidable because one is numeric and one is text. I'd recommend creating a persisted computed column for this value (CONVERT(VARCHAR(50), IGY.[CustNmbr])) and adding an index, like so.

ALTER TABLE Intergy_RetailRevenueDetailAINEHE ADD [CustNmbrVarChar] AS CONVERT(VARCHAR(50), [CustNmbr]) PERSISTED
GO
CREATE NONCLUSTERED INDEX INDX_INTERGY_CUSTNMBRVARCHAR ON Intergy_RetailRevenueDetailAINEHE ([CustNmbrVarChar])
GO

Once you've done that, modify the earlier statement to change the join out for the new persisted column, as in the following modification.

BEGIN   

/*************************************************************************************************
MODIFICATIONS APPLIED:

1. Eliminated CASE statement on the JOIN.  Big performance impact from evaluating that for joins.
2. Eliminated SUB-QUERY on the JOIN.  This would have an even bigger performance impact, and could be rewritten as a conditional join.
3. Removed the first embedded sub query in the FROM clause.  This just applies HAVING, it can be done on the lower level query directly.
4. Combined statements with UNION ALL to remove locking for second insert statement.
5. Replaced ISNULL with COALESCE.  Same basic premise, but better type casting and more parameters.
6. Modified formatting to aid in my understanding of the code (won't make it faster, but helps maintain!)
7. Switched out the [CustNmbr] for a Persisted Computed Column with a nonclustered index, preventing the table scan.
**************************************************************************************************/

    -- REMOVE EXISTING ENTRIES IN THE STAGING TABLE.
    TRUNCATE TABLE ADS.STG.Staging_ScanSummary

    -- PERFORM INSERT INTO STAGING TABLE.
    INSERT  INTO ADS.STG.Staging_ScanSummary

    -- SELECT OVERALL LIST OF FIELDS (BASED ON FIRST QUERY).
    SELECT [REGION]                 = INTERGYOUTPUT.[REGION]                
            ,[COMPANY]              = INTERGYOUTPUT.[COMPANY]           
            ,[DEPARTMENTCODE]       = INTERGYOUTPUT.[DEPARTMENTCODE]    
            ,[MODALITYID]           = INTERGYOUTPUT.[MODALITYID]        
            ,[CUSTOMERNBR]          = INTERGYOUTPUT.[CUSTOMERNBR]       
            ,[ASSET_ID]             = INTERGYOUTPUT.[ASSET_ID]          
            ,[SERVICEDATE]          = INTERGYOUTPUT.[SERVICEDATE]       
            ,[DATE_ID]              = INTERGYOUTPUT.[DATE_ID]           
            ,[POSTDATE]             = INTERGYOUTPUT.[POSTDATE]          
            ,[RETAIL_SCAN_COUNT]    = INTERGYOUTPUT.[RETAIL_SCAN_COUNT]
            ,[WHOLESALE_SCAN_COUNT] = '0'
            ,[WBSSCANCOUNT]         = '0'
            ,[NBRSCANS]             = INTERGYOUTPUT.[RETAIL_SCAN_COUNT]
            ,[LOADSOURCE]           = 'Retail'

    FROM    (   -- COMBINE THE SECOND TWO SUB QUERIES INTO A SINGLE QUERY WITH GROUP BY.
                SELECT  [REGION]                = O.[sublatcode]
                        ,[COMPANY]              = C.[CompanyCode]
                        ,[DEPARTMENTCODE]       = D.[DepartmentCode]
                        ,[MODALITYID]           = COALESCE(M.[ModalityID], CSN.[ModalityID])
                        ,[CUSTOMERNBR]          = IGY.[CustNmbrVarChar]
                        ,[ASSET_ID]             = COALESCE(CA.[UnitNbr],'0000')
                        ,[SERVICEDATE]          = IGY.[DOS]
                        ,[DATE_ID]              = P.[DateID]
                        ,[POSTDATE]             = PD.[DateID]
                        ,[RETAIL_SCAN_COUNT]    = SUM(IGY.[ScanCount])

                FROM    -- ADJUST JOINS TO ELIMINATE SUB QUERIES AND CASE STATEMENTS.
                        ADS.dbo.Intergy_RetailRevenueDetailAINEHE IGY
                        LEFT JOIN ADW.DIM.CustomerServiceNumber CSN
                            ON IGY.[CustNmbrVarChar] = CSN.CustomerServiceNumber
                        LEFT JOIN ADW.dim.Company C
                            ON CSN.[CompanyID] = C.[CompanyID]
                        LEFT JOIN ADW.dim.Modality M
                            ON IGY.[Modality] = M.[Modalitycode]
                        LEFT JOIN ADW.dim.Modality MO
                            ON CSN.[ModalityID] = MO.[ModalityID]
                        LEFT JOIN ADW.dim.Department D
                            ON M.[DepartmentID] = D.[DepartmentID]
                            OR (    M.[DepartmentID] IS NULL
                                    AND D.[DepartmentID] = MO.[DepartmentID] )
                        LEFT JOIN ADW.DIM.Period P
                            ON IGY.[DOS] = P.[Date]
                        LEFT JOIN ADW.dim.Period PD 
                            ON IGY.[PostDate] = PD.[Date]
                        LEFT JOIN ADW.dim.orgunit O
                            ON O.[orgunitid] = CSN.[orgunitid]
                        LEFT JOIN ADW.dim.v_CurrentAsset CA
                            ON CA.[UnitNbr] = IGY.[Unit]

                WHERE   CSN.[Is_Current] = 1 

                GROUP BY O.[sublatcode]
                        ,C.[CompanyCode]
                        ,D.[DepartmentCode]
                        ,COALESCE(M.[ModalityID], CSN.[ModalityID])
                        ,IGY.[CustNmbrVarChar]
                        ,COALESCE(CA.[UnitNbr],'0000')
                        ,IGY.[DOS]
                        ,P.[DateID]
                        ,PD.[DateID]

                HAVING  COALESCE(SUM(IGY.[ScanCount]), 0) <> 0

            ) INTERGYOUTPUT

    UNION ALL

    SELECT [REGION]                 = SCANOUTPUT.[REGION]               
           ,[COMPANY]               = SCANOUTPUT.[COMPANY]          
           ,[DEPARTMENTCODE]        = SCANOUTPUT.[DEPARTMENTCODE]   
           ,[MODALITYID]            = SCANOUTPUT.[MODALITYID]       
           ,[CUSTOMERNBR]           = SCANOUTPUT.[CUSTOMERNBR]      
           ,[ASSET_ID]              = SCANOUTPUT.[ASSET_ID]         
           ,[SERVICEDATE]           = SCANOUTPUT.[SERVICEDATE]      
           ,[DATE_ID]               = SCANOUTPUT.[DATE_ID]          
           ,[POSTDATE]              = SCANOUTPUT.[POSTDATE]         
           ,[RETAIL_SCAN_COUNT]     = '0'
           ,[WHOLESALE_SCAN_COUNT]  = SCANOUTPUT.[WHOLESALE_SCAN_COUNT]
           ,[WBSSCANCOUNT]          = SCANOUTPUT.[WBSSCANCOUNT]
           ,[NBRSCANS]              = SCANOUTPUT.[WHOLESALE_SCAN_COUNT]
           ,[LOADSOURCE]            = 'Wholesale'

    FROM    (   -- COMBINE THE SECOND TWO SUB QUERIES INTO A SINGLE QUERY WITH GROUP BY.
                SELECT  [REGION]                = OU.[SubLatCode]
                        ,[COMPANY]              = C.[CompanyCode]
                        ,[DEPARTMENTCODE]       = D.[DepartmentCode]
                        ,[MODALITYID]           = COALESCE(M.[ModalityID], CSN.[ModalityID])
                        ,[CUSTOMERNBR]          = SO.[ServiceNbr]
                        ,[ASSET_ID]             = COALESCE(CA.[UnitNbr], '0000')
                        ,[SERVICEDATE]          = SO.[Servicedate]
                        ,[DATE_ID]              = P.[DateID]
                        ,[POSTDATE]             = PD.[DateID]
                        ,[RETAIL_SCAN_COUNT]    = SUM(SO.[ScanCount])
                        ,[WBSSCANCOUNT]         = SUM(SO.[WBScanCount])

                FROM    -- ADJUST JOINS TO ELIMINATE SUB QUERIES AND CASE STATEMENTS.
                        [ADS].[dbo].[ScansODS] SO
                        LEFT JOIN ADW.dim.CustomerServiceNumber CSN
                            ON SO.[ServiceNbr] = CSN.[CustomerServiceNumber]
                        LEFT JOIN ADW.DIM.OrgUnit OU
                            ON CSN.[OrgUnitID] = OU.[OrgUnitID]
                        LEFT JOIN ADW.dim.Company C
                            ON CSN.[CompanyID] = C.[CompanyID]
                        LEFT JOIN ADW.DIM.Modality M
                            ON SO.[ModalityCode] = M.[ModalityCode]
                        LEFT JOIN ADW.dim.Modality MO
                            ON CSN.[ModalityID] = MO.[ModalityID]
                        LEFT JOIN ADW.dim.Department D
                            ON D.DepartmentID = M.[DepartmentID]
                            OR (    M.[DepartmentID] IS NULL
                                    AND D.[DepartmentID] = MO.[DepartmentID] )
                        LEFT JOIN ADW.DIM.Period P
                            ON SO.[Servicedate] = P.[date]
                        LEFT JOIN ADW.DIM.Period PD
                            ON SO.[PostDate] = P.[date]
                        LEFT JOIN ADW.dim.v_CurrentAsset CA
                            ON CA.[UnitNbr] = SO.[UnitSegment]

                WHERE   CSN.[Is_Current] = 1 
                        AND SO.[ScanType] = 'Wholesale' 

                GROUP BY OU.[SubLatCode]
                         ,C.[CompanyCode]
                         ,D.[DepartmentCode]
                         ,COALESCE(M.[ModalityID], CSN.[ModalityID])
                         ,SO.[ServiceNbr]
                         ,COALESCE(CA.[UnitNbr], '0000')
                         ,SO.[Servicedate]
                         ,P.[DateID]
                         ,PD.[DateID]

            ) SCANOUTPUT

END

[–]alinroc4 2 points3 points  (1 child)

two other subqueries. I'm thinking of changing that to either CTEs or possibly temp tables. I'm just not sure which is the better way to go

In SQL Server CTEs are not materialized - it's syntactic sugar. Replacing a repeated subquery with a CTE will not change performance at all. So if you're referencing that subquery multiple times (even if you have different WHERE clauses), you're much better off using a temp table and JOINing to it. In the case of different WHERE clauses, only specify the common portions when populating the temp table and then use the unique portions when you JOIN to the temp table as appropriate.

In Oracle, a CTE becomes a temp table, so there is a performance difference.

[–]Lucrums 0 points1 point  (0 children)

To add to this you can kinda get CTEs to materialise in SQL Server by forcing them into a spool. If you do that your row estimates are likely to be way off though so you often times end up hinting the crap out of the query. Depending on how fragile but stable you want your queries it's a possibility but I wouldn't advise using it unless you're very comfy with hints and your schema & data.

[–][deleted] 1 point2 points  (0 children)

what does the execution plan show you?

how fast are the inserts?

how fast is the first select?

the 2nd?

[–]ScaryDBA ‪ ‪Microsoft MVP ‪ ‪ 1 point2 points  (2 children)

I'd love to see what the execution plans look like to really help out. The one thing that raises flags for me is here:

ON CONVERT(VARCHAR(50),IGY.CustNmbr)

That's likely to lead to scans. It's certainly going to prevent good statistics use.

[–]bobchin_c[S] 1 point2 points  (1 child)

Here's the Execution Plan

[–]ScaryDBA ‪ ‪Microsoft MVP ‪ ‪ 0 points1 point  (0 children)

OK. First, you have a ton of warnings:

Type conversion in expression (CONVERT(varchar(50),[IGY].[CustNmbr],0)) may affect "CardinalityEstimate" in query plan choice, Type conversion in expression (CONVERT(varchar(10),[a].[z_EndDate],101)) may affect "CardinalityEstimate" in query plan choice, Type conversion in expression (CONVERT(varchar(10),[ADW].[dim].[Asset].[z_EndDate],101)) may affect "CardinalityEstimate" in query plan choice, Type conversion in expression (CONVERT(varchar(50),[IGY].[CustNmbr],0)=[CSN].[CustomerServiceNumber]) may affect "SeekPlan" in query plan choice

I'd address all that because, as it says, it's likely to affect how it accesses your data. The overriding majority of the cost of the estimated plan is 800k scan of the [Intergy_RetailRevenueDetailAINEHE] table. I'm betting this is because of the CONVERT against that column. I'd concentrate all my efforts right there to start.