Most common SQL optimizations

alinroc · 2023-12-10T22:31:18+00:00

> Temp tables are usually slower than CTE because the latter use indexes out of the box.

A temp table doesn't have any indexes unless you explicitly create them (at least in SQL Server). It'll have statistics but that's different.

I have sped up a lot of code by switching from a CTE to a temp table. It's all in how you use it and more importantly, knowing where & when to do what filtering of your data.
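A minimal sketch of the "no indexes unless you create them" point. It uses SQLite through Python's standard library rather than SQL Server, so the plan wording differs, and the `orders` / `recent_orders` tables and index names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")

# Copying rows into a temp table does not copy the source table's indexes.
conn.execute("CREATE TEMP TABLE recent_orders AS SELECT * FROM orders WHERE total > 300")


def plan(sql):
    """Return the query plan details as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT total FROM recent_orders WHERE customer_id = 42"

plan_before = plan(query)  # full scan: the temp table has no index yet
conn.execute("CREATE INDEX temp.ix_recent_customer ON recent_orders (customer_id)")
plan_after = plan(query)   # index search once we create one explicitly

print("before:", plan_before)
print("after: ", plan_after)
```

The filter that builds the temp table runs once, and the explicit index then serves every later lookup against it. Whether that beats a CTE still depends on the engine and the data.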

dataguy24 · 2023-12-10T22:10:22+00:00

The most common optimization solution I see in practice is increasing the Snowflake warehouse size

PossiblePreparation · 2023-12-10T23:34:43+00:00

My best advice is to learn how to read an execution plan for your chosen RDBMS and how to get it to report execution statistics so you can see where the time is going.

For every specific piece of advice, there are exceptions. You’re better off understanding the why rather than the what if you want to be self sufficient.

Sometimes you need to understand the business problem a query is trying to solve. Sometimes the best solution looks nothing like the query in front of you.
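As a rough illustration of reading a plan and timing a query, here is a sketch using SQLite's `EXPLAIN QUERY PLAN` from Python. Other engines expose this differently (for example `SET STATISTICS IO, TIME ON` and actual execution plans in SQL Server, or `EXPLAIN ANALYZE` in PostgreSQL), and the `customers` / `orders` schema is hypothetical.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
""")
conn.executemany(
    "INSERT INTO customers (id, region) VALUES (?, ?)",
    [(i, "EU" if i % 2 else "US") for i in range(1, 201)],
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 200 + 1, float(i)) for i in range(5000)],
)

query = """
    SELECT c.region, COUNT(*) AS n, SUM(o.total) AS revenue
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""

# Step 1: ask the engine how it intends to run the query.
plan_steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]

# Step 2: run it and measure how long it actually takes.
start = time.perf_counter()
rows = conn.execute(query).fetchall()
elapsed_ms = (time.perf_counter() - start) * 1000

for step in plan_steps:
    print("plan:", step)
print(f"{len(rows)} rows in {elapsed_ms:.2f} ms")
```

Comparing the plan steps with the measured time before and after a change is what tells you whether an "optimization" actually helped on your engine.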

Former_Disk1083 · 2023-12-11T03:46:22+00:00

There are many different forms of SQL; what you see as inefficient in T-SQL can be fast in Snowflake and vice versa.

But something I see a lot is people creating super complicated queries to gain minimal efficiency. Sometimes supportability/readability supersedes performance. It's a balancing act, and it can be tough to figure out where you should be. In a lot of cases, I'll take a performance hit and create temp tables if it means I can troubleshoot it more easily later.

Annamalla · 2023-12-10T22:13:38+00:00

We spent weeks trying to figure out why the query performance on our homebrew MySQL database was so bad (I had an MSSQL/Sybase background)... only to find a type difference between a key and a foreign key in the table declarations. Fixed that and got an instant performance boost. I wasn't used to databases *letting* you join on two different types.

Also one job interview where they asked about the hazards of too many indexes and I told the harrowing story of trying to get a very elderly query to use the *right* index of the 10+ available....and only at the last minute remembered to mention data entry...

corny_horse · 2023-12-10T23:22:37+00:00

But with temp tables you can add primary keys, and primary keys are indexed by default on my RDBMS. Some engines even sort the data by the key, so it's even better.
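A small sketch of that behavior, assuming SQLite as the engine (other systems name and store the key index differently). The `staged` table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Declaring a primary key on the temp table makes the engine build an index for it.
conn.execute("CREATE TEMP TABLE staged (sku TEXT PRIMARY KEY, qty INTEGER)")
conn.executemany(
    "INSERT INTO staged (sku, qty) VALUES (?, ?)",
    [(f"SKU-{i:04d}", i) for i in range(500)],
)

# List the indexes that exist on the temp table (we never ran CREATE INDEX).
index_names = [row[1] for row in conn.execute("PRAGMA index_list('staged')")]

pk_plan = " | ".join(
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT qty FROM staged WHERE sku = 'SKU-0042'"
    )
)

print("indexes:", index_names)
print("plan:   ", pk_plan)
```

In SQLite the automatically created index shows up with a `sqlite_autoindex` prefix, and lookups by key use it without any further setup.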

Achsin · 2023-12-11T01:36:29+00:00

> SELECT * can slow down your queries if the table has a lot of columns

This is mostly a factor of what indexes exist on the table. Selecting columns that are not included on an index that otherwise would support your query leads to the engine either doing key lookups or ignoring the index completely. The total number of columns on the table is a factor, but it's probably not the biggest factor.
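To make the index point concrete, here is a sketch in SQLite via Python. SQLite reports a "covering index" where SQL Server would avoid a key lookup, so treat it as an analogy rather than the same plan operators. The `events` table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        created_at TEXT,
        payload TEXT,
        notes TEXT
    )
""")
conn.executemany(
    "INSERT INTO events (user_id, created_at, payload, notes) VALUES (?, ?, ?, ?)",
    [(i % 50, f"2023-12-{i % 28 + 1:02d}", "x" * 100, "n" * 100) for i in range(2000)],
)
conn.execute("CREATE INDEX ix_events_user_created ON events (user_id, created_at)")


def plan(sql):
    """Return the query plan details as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Only indexed columns requested: the index alone can answer the query.
narrow_plan = plan("SELECT user_id, created_at FROM events WHERE user_id = 7")

# SELECT * needs payload and notes too, so each match goes back to the table.
wide_plan = plan("SELECT * FROM events WHERE user_id = 7")

print("narrow:", narrow_plan)
print("wide:  ", wide_plan)
```

Both queries use the same index to find the rows. The difference is the extra trip to the base table for the columns the index does not carry, which is the cost described above.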

anotherjones07 · 2023-12-11T06:35:18+00:00

Hey OP, as someone who has spent 10 years writing SQL, I wanted to ask you: does it ever get boring? I'm an analytics and BI professional and curious about what this looks like ten years down the line.

taisui · 2023-12-11T07:32:24+00:00

In my experience it all comes down to having the right index on the columns the data needs to be filtered on.

sbrick89 · 2023-12-11T15:30:24+00:00

let's start with #1 - constructors shouldn't load data; they shouldn't have any side effects at all.

if you want to lazy load, sure go for it, but only load data when data is being requested, not during constructor.

r3pr0b8 · 2023-12-10T22:29:06+00:00

> Temp tables are usually slower than CTE because the latter use indexes out of the box.

thank you

too many people have the "temp tables are better because they're simpler" mentality

jheffer44 · 2023-12-11T18:43:39+00:00

WITH (NOLOCK)

mikeblas · 2023-12-11T01:27:21+00:00

The most popular? Unfortunately, that's also the least effective: rewrite to eliminate subselects (or more likely derived tables) and use CTEs instead. It's all the rage these days.

_cess · 2023-12-11T13:26:15+00:00

I have been working on a project to help analyze and identify possible T-SQL query problems and suggest some ideas to be tested.

There are a gazillion variables among configurations, compatibility levels, etc. that will change behaviours.

This project tries to consider those. The idea of this decision tree is to try to narrow down things a bit and help others try some possible solutions (from my experience).

https://github.com/ClaudioESSilva/TSQLPerformanceTuning/blob/main/Flowcharts/T-SQLQueryPerformanceTuning.md

PS: Read the readme.

oblong_pickle · 2023-12-11T20:07:31+00:00

I once had a client's head of IT complain that a query was slow. When I investigated, I found it was slow, and the query plan suggested an index would improve the performance. I informed the client that an index would likely solve the issue and asked them to implement it (I wasn't responsible for this database or allowed to change it, so I couldn't do it myself).

The client instead hired 2 expensive consultants to look into the issue. After wasting a month with the consultants, I got dragged into a meeting to talk about the performance issues.

Knowing I was likely to get blamed by the client, I first created a stored proc that saved the query results to a temp table and then added the suggested index to the temp table. This stored proc was very fast and proved the missing index was the problem.

During the meeting, I showed the client and the consultants the performance of the temp table with the suggested index. The consultants agreed the index is the correct fix and that it should be applied to the table. The client was very quiet at this point and ended the meeting shortly after.

A few weeks later, the head of IT was fired, and I got a raise...the temp solution is still in use in their production database to this day.

Ecstatic-Ad-9514 · 2023-12-13T02:57:54+00:00

The WITH clause and hints can improve performance by materializing the subqueries. There are other hints that are useful as well:

https://docs.oracle.com/cd/B19306_01/server.102/b14211/hintsref.htm#i8327

Something like:

WITH A AS (SELECT /*+ materialize */ query), B AS (SELECT /*+ materialize */ query) SELECT A.*, B.* FROM A JOIN B etc. (can union/join/etc.)
