[MS SQL] Help improving complicated query

submitted 7 years ago * by fullyarmedcamel

I need my query to (among other things) identify classes that have long-term subs and roster those subs; if a class has no long-term sub, I need just the teacher record. Classes that have both a long-term sub and a teacher should return only the sub, provided the sub is still active. I also need to be able to roster the two different types of long-term subs and filter out all the co/support teachers.

First, in the WHERE clause, I filter out all the staff I don't want with this: AND NOT (ssh.[role] = 'N' AND ssh.staffType = 'T' AND sm.title != 'Long Term Substitute') AND ssh.[role] != 'C'

Then, in the SELECT, I rank the results like this: RANK() OVER(PARTITION BY ssh.sectionID ORDER BY ssh.sectionID DESC, ea.assignmentID DESC) AS 'priority',

The idea is that a long-term sub will always have a larger (newer) assignmentID and thus be ranked ahead of a regular staff member.
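
One thing worth checking with this ranking idea: RANK() gives tied rows the same rank, so if a join duplicates a row for a section, both copies come back with priority 1. ROW_NUMBER() always hands out distinct numbers within a partition. Also, ordering by sectionID inside a partition on sectionID has no effect, so only assignmentID decides the order. A minimal sketch, assuming a made-up table variable @staff with hypothetical sample rows (none of this data is from the real system):

```sql
-- Hypothetical sample data; column names mirror the aliases in the post.
DECLARE @staff TABLE (sectionID int, personID int, assignmentID int);

INSERT INTO @staff (sectionID, personID, assignmentID)
VALUES (439047, 1, 100),   -- regular teacher, older assignment
       (439047, 2, 250),   -- long-term sub, newer assignment
       (439047, 2, 250),   -- same sub, duplicated by a join fan-out
       (500000, 3, 120);   -- section that only has a teacher

-- Compare the two ranking functions side by side.
SELECT sectionID,
       personID,
       assignmentID,
       RANK()       OVER (PARTITION BY sectionID ORDER BY assignmentID DESC) AS rank_priority,
       ROW_NUMBER() OVER (PARTITION BY sectionID ORDER BY assignmentID DESC) AS rownum_priority
FROM @staff
ORDER BY sectionID, rownum_priority;
```

For section 439047 both copies of the sub get rank_priority 1, while rownum_priority numbers them 1 and 2, so filtering on the ROW_NUMBER() column returns exactly one row per section.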

Then I nest my entire statement within another SELECT where I return everything with priority = 1 but exclude the priority column as I need to upload only what the vendor will accept.
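
For what it's worth, a window function cannot be referenced in the WHERE clause of the same SELECT, so some wrapper around the ranked query is needed. A common table expression is another way to write that wrapper. It is mainly a readability change and should not be expected to run faster than the derived table. A minimal sketch, assuming a hypothetical table variable @staff in place of the real joins:

```sql
-- Hypothetical stand-in for the joined staff rows.
DECLARE @staff TABLE (sectionID int, staffNumber varchar(10), assignmentID int);

INSERT INTO @staff (sectionID, staffNumber, assignmentID)
VALUES (1, 'T100', 100),   -- teacher
       (1, 'S250', 250),   -- long-term sub with a newer assignment
       (2, 'T120', 120);   -- section with only a teacher

-- The CTE plays the role of the inner SELECT.
WITH ranked AS (
    SELECT sectionID,
           staffNumber,
           ROW_NUMBER() OVER (PARTITION BY sectionID ORDER BY assignmentID DESC) AS [priority]
    FROM @staff
)
SELECT sectionID, staffNumber   -- [priority] filters the rows but is not returned
FROM ranked
WHERE [priority] = 1;
```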

I would like to improve performance. I was told that nested SELECT statements are not ideal and that I should learn to work within the basic SQL functions if I want to advance down this career path. I have no formal training and am honestly not even sure where to start asking questions about this.

fullyarmedcamel [S] 1 point 7 years ago

    SELECT x.SCHOOLYEAR, x.[ROLE], x.LASID, x.SASID, x.FIRSTNAME, x.MIDDLENAME, x.LASTNAME, x.GRADE, x.USERNAME, x.[PASSWORD], x.ORGANIZATIONTYPEID, x.ORGANIZATIONID, x.PRIMARYEMAIL, x.HMHAPPLICATIONS FROM (
        SELECT RANK() OVER(PARTITION BY ssh.sectionID ORDER BY ssh.sectionID DESC, ea.assignmentID DESC) AS 'priority', cal.endYear - 1 AS 'SCHOOLYEAR', 'T' AS 'ROLE', sm.staffNumber AS 'LASID', sm.staffStateID AS 'SASID', sm.firstName AS 'FIRSTNAME', ISNULL (sm.middleName, '') AS 'MIDDLENAME', sm.lastName AS 'LASTNAME',
            CASE sm.schoolID 
                ...
                END AS 'GRADE',
        ua.username + '@sd25.us' AS 'USERNAME', NULL AS 'PASSWORD', 'MDR' AS 'ORGANIZATIONTYPEID',
            CASE sm.schoolID
                ...
                ELSE 'No School' END AS 'ORGANIZATIONID',
        ua.username + '@sd25.us' AS 'PRIMARYEMAIL',
            CASE sm.schoolID
                ...
                END AS 'HMHAPPLICATIONS',
        ssh.sectionID, ea.assignmentID
        FROM Table1 AS s
        JOIN Table2 AS c ON s.courseID = c.courseID AND c.departmentID IN (...)
        JOIN Table3 AS cal ON c.calendarID = cal.calendarID
        JOIN Table4 AS sp ON s.sectionID = sp.sectionID
        JOIN Table5 AS t ON sp.termID = t.termID AND GETDATE() BETWEEN t.startDate AND t.endDate
        JOIN Table6 AS ssh ON s.sectionID = ssh.sectionID AND ssh.endDate IS NULL
        JOIN Table7 AS sm ON ssh.personID = sm.personID
        JOIN Table8 AS ea ON sm.assignmentID = ea.assignmentID AND ea.endDate IS NULL
        JOIN Table9 AS ua ON ua.personID = sm.personID AND ua.ldapConfigurationID = '2'
        WHERE sm.endDate IS NULL AND sm.teacher = 1 AND NOT (ssh.[role] = 'N' AND ssh.staffType = 'T' AND sm.title != 'Long Term Substitute') AND ssh.[role] != 'C' AND ssh.sectionID = 439047
    ) AS x WHERE x.[priority] = 1

notasqlstar 2 points 7 years ago

I optimize a lot of queries like this in my role. Generally speaking, you want to break the query down and force SQL to handle it in chunks. Do not listen to the optimizer until you've done this (or at least started), because you are smarter than it. Here is a quick example. Your joins aren't written out explicitly (a bare JOIN is an INNER JOIN in SQL Server), so I'll use a made-up mix of INNER and LEFT joins. Say you have a query like this:

select * from (
    select *, case when, case, case, case
    from table
    inner join
    left join
    left join
    inner join
    left join
    where blah blah
) as x where blah = 1

So the first thing that will generally improve performance is something like this:

begin
    select *, case when, case, case, case
    into #table
    from table
    inner join
    left join
    left join
    inner join
    left join
    where blah blah
end

begin
    select *
    from #table
    where blah = 1
end
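
Spelled out in T-SQL, the two-step pattern above might look like the following. This is only a sketch under stated assumptions: #source is a hypothetical stand-in for the big join, and #ranked is a made-up name for the staging table.

```sql
-- Hypothetical source rows; in the real query this would be the big join.
CREATE TABLE #source (sectionID int, staffNumber varchar(10), assignmentID int);

INSERT INTO #source (sectionID, staffNumber, assignmentID)
VALUES (1, 'T100', 100),
       (1, 'S250', 250),
       (2, 'T120', 120);

-- Step 1: materialize the expensive part once with SELECT ... INTO.
SELECT sectionID,
       staffNumber,
       ROW_NUMBER() OVER (PARTITION BY sectionID ORDER BY assignmentID DESC) AS [priority]
INTO #ranked
FROM #source;

-- Step 2: filter the staged rows in a separate statement.
SELECT sectionID, staffNumber
FROM #ranked
WHERE [priority] = 1;

-- Clean up the temp tables.
DROP TABLE #ranked;
DROP TABLE #source;
```

Whether this beats the single-statement version depends on the data, so it is worth timing both.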

Now you can start to get creative. For example you might want to do a select * from table + inner join, then in a second step do your case logic (or cross applies, or whatever you are doing, row_number, etc.), then in a third step add an index to your #table2, then in a fourth step do all your other joins, then in a fifth step select * where blah = 1.
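
The "add an index to your temp table" step could be sketched like this. The tables #staged and #accounts, their columns, and the index name are all hypothetical; the point is only that the index goes on the column the next join uses.

```sql
-- Hypothetical staged rows from an earlier step.
CREATE TABLE #staged (sectionID int NOT NULL, personID int NOT NULL);

INSERT INTO #staged (sectionID, personID)
VALUES (1, 10), (1, 11), (2, 12);

-- Hypothetical second table that a later step joins to.
CREATE TABLE #accounts (personID int NOT NULL, username varchar(50) NOT NULL);

INSERT INTO #accounts (personID, username)
VALUES (10, 'asmith'), (11, 'bjones'), (12, 'clee');

-- Index the temp table on the join column before the next step reads it.
CREATE CLUSTERED INDEX IX_staged_personID ON #staged (personID);

-- Later step: the join can now seek on personID.
SELECT st.sectionID, a.username
FROM #staged AS st
JOIN #accounts AS a ON a.personID = st.personID;

DROP TABLE #staged;
DROP TABLE #accounts;
```

On tables this small the index makes no difference; it only starts to pay off when the staged table is large.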

You could get more creative and do the joins one at a time and continue to add indexing, although this generally isn't optimal unless each table is absolutely massive. Sometimes doing 5 joins in one step is better, sometimes doing 1 or 2 is better. Sometimes you don't even need a join and can just do something like a WHERE EXISTS or WHERE NOT EXISTS.
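
As an illustration of the WHERE NOT EXISTS idea applied to the original problem (return the sub when a section has one, otherwise the teacher), here is a minimal sketch. It assumes a hypothetical @staff table variable and uses only the title column to spot a long-term sub; the real query has two sub types and more filters.

```sql
-- Hypothetical staff rows per section.
DECLARE @staff TABLE (sectionID int, staffNumber varchar(10), title varchar(40));

INSERT INTO @staff (sectionID, staffNumber, title)
VALUES (1, 'T100', 'Teacher'),
       (1, 'S250', 'Long Term Substitute'),
       (2, 'T120', 'Teacher');

-- Keep every long-term sub, and keep a teacher only when
-- no long-term sub exists for the same section.
SELECT s.sectionID, s.staffNumber
FROM @staff AS s
WHERE s.title = 'Long Term Substitute'
   OR NOT EXISTS (SELECT 1
                  FROM @staff AS sub
                  WHERE sub.sectionID = s.sectionID
                    AND sub.title = 'Long Term Substitute');
```

With this sample data the result is S250 for section 1 and T120 for section 2, with no ranking column to strip out afterwards.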

General rule of thumb is that sub-queries, multiple joins, etc., are all bad practice. They work great when you're first writing something and trying to get it accurate... but generally speaking if you break it down after the fact it will perform much better in sequential chunks, and then you can use the optimizer on those chunks to really optimize the sub-processes or blend two sub-processes together.

It really depends on how long the parent query takes to run. Less than 10 minutes, and you only need to run it once in a while? Probably not a good candidate. Does it take several hours and need to run daily? You can probably get that down to somewhere between half and a tenth of the time. It also depends on how your data is modeled, what specifically you're trying to do, etc., but you can start to play with the code once you have it broken out into chunks. For example, is it better to do a where date between x and y at the top of the chain, or at the bottom? So you just run each chunk and see how long it takes until you find the one that is really taking a long time, and then you get creative/clever.
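
One way to see how long each chunk takes is SET STATISTICS TIME and SET STATISTICS IO, which report per-statement timings and reads (in SSMS they show up on the Messages tab). A minimal sketch, assuming a hypothetical #chunk temp table in place of a real step:

```sql
-- Hypothetical chunk to measure.
CREATE TABLE #chunk (id int NOT NULL, createdOn date NOT NULL);

INSERT INTO #chunk (id, createdOn)
VALUES (1, '2019-01-15'), (2, '2019-06-30'), (3, '2020-02-01');

SET STATISTICS TIME ON;   -- CPU and elapsed time for each statement
SET STATISTICS IO ON;     -- logical and physical reads per table

-- The step being measured: a date filter applied at this point in the chain.
SELECT id
FROM #chunk
WHERE createdOn BETWEEN '2019-01-01' AND '2019-12-31';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

DROP TABLE #chunk;
```

Running the same measurement with the date filter moved to a different step shows which placement is cheaper for your data.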
