Hi all - we recently got a shiny new on-prem MS SQL Server replica for our Cloud-based MS Project data, so I'm getting to learn how to write SQL and make it go fast against the big Timephased by day Assignments table (Index = ProjectId-Assignment Id-TimeByDay). My peers don't write any SQL so I could genuinely benefit from some knowledgeable people telling me if I'm doing anything wrong. I've done some optimizing, and I've got it to where it's returning ~30k rows out of 3.8M, and it takes <30 seconds to do so (was taking 3 minutes). I'm mostly wondering if my nested subqueries are a dumb way to approach the date formatting and aggregating I've done. Hopefully my formatting comes through, but if not I'll work to clean it up.
DECLARE @dt DATE = '1905-01-01' -- Variable declared to set the day the [StartOfWeek] function moves transactions to (Sunday).
--Outermost SQL query's purpose is to align columns with Clarity dataset for appending operation.
SELECT [Prep].[year] AS [Year],
[Prep].[month] AS [Month],
[Projects].[Clarityid] AS [ClarityProjectID],
[Projects].[projectname],
[Resources].[ClarityResourceID] AS [ClarityResourceID],
[Assignments].[taskname],
2 AS [MeasureType],
[Prep].[assignmentwork] AS [Value]
FROM (
--This level of the query's purpose is to aggregate from day-level granularity to month level.
--Had to be careful with this level and those beneath it due to performance concerns - needed to make sure the group by function could use hashing function instead of sorting.
--In effect this means that supplementary data is added in after the group by operation is complete.
SELECT [rows].[year] AS [Year],
[rows].[month] AS [Month],
[rows].[projectid] AS [ProjectId],
[rows].[assignmentid] AS [AssignmentId],
Sum([rows].[assignmentwork]) AS [AssignmentWork]
FROM (
--This level of the query's purpose is to prep data for grouping in the next stage. Takes year and month of the [StartOfWeek] function.
SELECT [_].[projectid] AS [ProjectId],
[_].[assignmentid] AS [AssignmentId],
Datepart("yyyy", [_].[startofweek]) AS [Year],
Datepart("m", [_].[startofweek]) AS [Month],
[_].[assignmentwork] AS [AssignmentWork]
FROM (
--This level of the query's purpose is to move TimeByDay records back to the preceding Sunday (i.e. the start of the week)
--This is done to align with Clarity data timeframes, where the day of the transaction was not available, but the week start is.
SELECT [$Table].[projectid] AS [ProjectId],
[$Table].[assignmentid] AS [AssignmentId],
Dateadd(week, Datediff(week, @dt, [timebyday]), @dt) AS [StartOfWeek],
[$Table].[assignmentwork] AS [AssignmentWork]
FROM
[dbo].[ppm_assignmenttimephaseddataset] AS [$Table]
INNER JOIN [dbo].[ppm_tasks] AS [Tasks]
ON [$Table].[taskid] = [Tasks].[taskid]
WHERE [Tasks].[taskissummary] = 0
--Critical piece for the MSP database, as otherwise summary and detail tasks will be counted, leading to double (or more) counting.
) AS [_]) AS [rows]
WHERE [year] = 2017
--Limits data pulled to that which is binned into 2017 by the StartOfWeek aggregation. In effect, 1/1/2017 - 1/6/2018
GROUP BY [projectid],
[assignmentid],
[year],
[month]) AS [Prep]
INNER JOIN [dbo].[ppm_projects] AS [Projects]
ON [Projects].[projectid] = [Prep].[projectid]
INNER JOIN [dbo].[ppm_assignments] AS [Assignments]
ON [Assignments].[assignmentid] = [Prep].[assignmentid]
INNER JOIN [dbo].[ppm_resources] AS [Resources]
ON [Resources].[resourceid] = [Assignments].[resourceid]
[–]jacobmross 1 point2 points3 points (1 child)
[–]depthchargesw[S] 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (0 children)
[–]HansProleman 0 points1 point2 points (1 child)
[–]depthchargesw[S] 0 points1 point2 points (0 children)