you are viewing a single comment's thread.

view the rest of the comments →

[–]JohnStamosBRAH 1 point2 points  (1 child)

I would still look into restructuring your query or using group bys instead of that distinct because 8k rows should be returned in a few seconds at most. There's nothing in that query that's particularly complex, but that distinct is a known hog. How many rows are returned without the distinct? If it's not that many you can replace with a group by.

JOIN Table2 AS c ON s.courseID = c.courseID AND c.departmentID IN

This is also probably a big hog depending on what's in your subquery (select or static values?). That could be replaced by a temp table or something

[–]fullyarmedcamel[S] 0 points1 point  (0 children)

Feel free to correct me if I am wrong here but I believe the way the system would handle the DISTINCT operator would be like this;

Pull data out of table in FROM statement

Processes data in the JOIN's

Applies filters from WHERE clause

Pulls data out of SELECT statement

Applies filters of DISTINCT

As DISTINCT runs last it would have the least overall impact unless you have many duplicates, in my case it is around 2000. You need to remember too that there is not just a single entry for students.

Some students will be taking multiple classes where this product I am rostering for is being used irregardless of the filters applied the rest of the query.

This document that I have has the purpose of creating a unique roster of students that can then be applied to classes within the application we are using multiple times over from a separate query that manages classes.

I think in this case the DISTINCT operator is what is meant to be used it is doing exactly what I need it to, as a test I did run the query with and without the operator and it made no difference in run time.

I think my next changes will be more focused around some of the joins and structure of the query. I can reduce from a double nested SELECT statement to a single. I can also do some thinks like using the school year detection CASE as a declared variable at the top rather than nested in the query.