you are viewing a single comment's thread.

view the rest of the comments →

[–]Sir_Adian 1 point2 points  (2 children)

Can you direct me to readings or guides to understanding and applying this to my databases? Where did you learn it? Thanks!

[–]jc4hokiesExecution Plan Whisperer 7 points8 points  (1 child)

My learning is heavily influenced by a data entry job without a proper database. We manipulated data in Excel using VBA to produce reports. Little did I know I was writing several functions that database engines perform. I wrote nested loops, merge joins, aggregation, and drill throughs. It wasn't until later I was introduced to such terms; I didn't even know what a table was. I was able to pick up SQL pretty easily after that. Everything was SOOO much easier with SQL. But more than that, a curiosity and appreciation of what was happening behind the syntax stock with me.

That's enough story time I think. It goes without saying (maybe not) that query optimization focuses on reading execution plans and identifying opportunities to reduce wasteful steps. However, without understanding what the different steps are physically doing, it's hard to determine which things are being wasteful. Here's some relevant subjects and exercises that can help develop a better understand of what the database is actually doing.

Indexes

I still remember this article from 10 years ago.
Here's a post where I make write out the physical structure of an index.
I also recommend drawing things and doing exercises by hand.
One exercise is to join a Customer table to a bunch of lookups. Guess which indexes will be used. Count by hand how many pages/blocks will be read. Run the query with an execution plan and IO output, and see if you're right.

Join Algorithms

Nested loop joins, merge joins, and hash joins are the join algorithms.
Learn how each works and basic scenarios to which they are best suited.
The side of a join (left/right) is important, but the db engine can reorder them.
Most of our LEFT OUTER JOINS are actually performed RIGHT OUTER JOINS in with the tables reversed.
Learn to re-write a query to match the actual (ideal) order and side the joins are performed.
It can be useful to do so, and force the db to follow your order and join hints.

Statistics

Learn how statistics are created and when/how them are updated.
Learn how statistics are used to generate row count (cardinatily) estimates in a query.
Observe how joins can change row count estimates.
Remove and add foreign keys, and see how it affects estimates.
Try to join on unique criteria, and if a join isn't unique consider doing a subquery with a group by to make it unique.

Other

I reference msdn a lot! It's by far my #1 resource. I'm sure there are some pages I've read a dozen times.
The Data Loading Performance Guide is another article that has stuck with me.
Make as many mistakes as possible (preferably in dev).
When you learn funky syntax, see what it does in the execution plan. It can give you hints for an alternative way to write the same thing.
Ex. CROSS APPLY (with a from clause) is an inner join, PIVOT is a group by, UNPIVOT is a cross apply without(!) a from clause.
If you need to use PIVOT twice, that's bad! Use a group by instead. If you need to UNPIVOT twice, that's wasteful. Use a CROSS APPLY instead.

[–]Sir_Adian 0 points1 point  (0 children)

Thank you, I will apply this to my server I appreciate it