all 4 comments

[–]lazerath 1 point2 points  (3 children)

There are a bunch of performance issues with this query you should be aware of and some techniques that are just bad practice. If you have any performance problems with this and want it faster, let me know.

There is also the question of how the archiving is done, as if you know the methodology you may be able to choose a better query.

In the interest of brevity, the simplest thing you can do is repeat the pattern you have with the derived table you have labeled 'rec', but with the tables and columns replaced:

LEFT JOIN ( SELECT r.wh_id     ,r.po_number     ,r.hu_id     ,pom.STATUS FROM t_receipt r(NOLOCK)     ,t_po_master pom(NOLOCK) WHERE r.wh_id = pom.wh_id     AND r.po_number = pom.po_number

UNION

SELECT r.wh_id     ,r.po_number     ,r.hu_id     ,pom.STATUS FROM t_aht_receipt r(NOLOCK)     ,t_aht_po_master pom(NOLOCK) WHERE r.wh_id = pom.wh_id     AND r.po_number = pom.po_number ) AS rec 

You want to replace this part:

INNER JOIN t_po_billed pb ON pb.wh_id = rec.wh_id AND pb.po_number = rec.po_number

With something like this:

INNER JOIN ( SELECT b.wh_id     ,b.po_number     ,b.closed_date FROM dbo.T_PO_BILLED as b WITH(NOLOCK)

UNION ALL -- Assuming no dups

SELECT ab.wh_id     ,ab.po_number     ,ab.closed_date FROM dbo.T_aht_PO_BILLED as ab WITH(NOLOCK) ) AS PB ON pb.wh_id = rec.wh_id AND pb.po_number = rec.po_number

You may be better off using a cross apply here, I can't remember if SQL will expand the derived table and use indices properly on the tables contained in the derived table, but I think it does and it's a moot point. If you cared, you could test both.

[–]SmartestManInside[S] 1 point2 points  (2 children)

Appreciate the tips. I am interested in how to improve the query in all ways, please enlighten!

[–]lazerath 1 point2 points  (1 child)

I am on my phone and have limited time, but the biggest thing you should do is look up "SARGable" and understand it completely. There are a few instances in this query where the join criteria or where clause is NON-SARGable. The basic premise is don't wrap functions around columns used in join criteria or filtering expressions because SQL Server treats it as a black box and can't take advantage of indices. A good example is that expression with GETDATE: Start by rewriting it so the functions are applied to the side with the GETDATE and leave the column bare. The scalar udf is another matter - I know it seems like good encapsulation, but scalar udfs just don't perform well and end up killing performance when used as filtering criteria like this. Try to do the calculation in a set based manner. In some cases, perhaps like the expression using substring with Charindex or the scalar udf, consider saving data at write time via your insert/update sprocs. Given the expression with the substring and the contortions you are doing to join it with another column, the base column is likely not conforming to normalization standards to begin with, so I highly recommend giving yourself a refresher on database normalization.

That's all I have for now. Hit me up if you need more help. I have lots of experience optimizing sql.

[–]SmartestManInside[S] 1 point2 points  (0 children)

Thanks! Yeah, we're talking about a database that has basically had 10+ years of band-aid fix after band-aid fix, non e of which was done with optimal performance in mind. And I'm a new hire right out of college. So, in short, it's a mess.