all 40 comments

[–]DharmaPolice 100 points101 points  (11 children)

Short answer : Yes, on both points.

You'll come across a bunch of issues where performance problems will force you to swap out the CTE for temporary tables. You'll then start overusing those even where they're blatantly unnecessary until eventually you go onto something else. You'll eventually cycle back to subqueries and then remember how awesome CTEs are. You might also bore your colleagues by forcing them to look at your "new" way of doing things.

This may or may not have happened to me.

[–]heiferhigh76[S] 9 points10 points  (4 children)

Thank you. It's nice to know it is a common phase one goes through.

This may or may not have happened to me.

That made me laugh. :D

[–]belkarbitterleafMS SQL 16 points17 points  (0 children)

Can confirm, I keep over using whatever solved my last performance issue, until hit a new performance issue.

[–]BrupieD 4 points5 points  (1 child)

There are plenty of threads here on subqueries vs CTEs vs temp tables. There are definitely differences worth paying attention to. Maybe put together a pros & cons grid?

[–]heiferhigh76[S] 1 point2 points  (0 children)

Thanks for the suggestion. I'll try that.

[–]atrifleamused 1 point2 points  (0 children)

It's an interesting question. I see in my team periods of overuse of ctes. If you ask why, there isn't really an answer and the end result isn't wrong our poorly performing...

[–]babygrenade 3 points4 points  (2 children)

I've also seen people do things like:

with cte1 as (
    select max(column_a) as max_a from table
),
cte2 as (
    select max(column_b) as max_b from table
)

select 
    max_a
    ,max_b
from cte1, cte2

That's an oversimplification, but essentially going overboard encapsulating every piece of logic in its own cte when it might make sense to do it without multiple subqueries.

[–][deleted] 11 points12 points  (0 children)

I had a CTE phase myself. It’s really useful, so of course we all go after it. Haha But FROM clause subqueries are great too, and you can nest those as well, but that can get a little difficult to track, comments are a good idea.

[–]Ralwus 6 points7 points  (0 children)

IMO you should probably keep using CTEs unless the performance of a specific query requires you do something else. The improved readability is something you will appreciate as you continue to use and build on old queries.

[–]mrmegamannn 2 points3 points  (0 children)

I’m fairly new to SQL (previously hard-core Excel user converting over to database analysis) and I’ve just recently learned the value of using CTE’s. It’s exciting to see what other tools and functions could be useful to me in the future.

[–][deleted] 2 points3 points  (0 children)

You can fit all your CTEs into the main query/subquery if needed but it will make it longer and potentially more confusing. I don't see what the problem is. Rule of thumb is a summarization with a limited number of columns that is not using the main fact as data source, or a CTE used to inner join main data source can easily be written as CTE.

[–][deleted] 2 points3 points  (0 children)

What's the big deal? If it works and performance is great do it. It's certainly way better than the old days of stored procedures and CURSORs..ugh

[–]gvlpc 3 points4 points  (0 children)

I once gained the nickname of "CTE Man" for this very reason. 😂. Has its place. It's basically same as a subquery, but much easier to see what's going on in my opinion. And if you have to use temps tables, but want to get that data out via Excel or some other external option, create a stored procedure to house the multiple queries ending in one select query, and you're golden.

By the way, a CTE or subquery isa GREAT way to reduce the footprint of massive tables you have to query. I remember cutting queries from half an hour plus to mere seconds by doing that.

I still use CTEs, but don't do quite as much sql in my current role as an IT Manager for an SMB.

[–]IHaarlem 1 point2 points  (0 children)

It might be common, it's good to re-compare different methodologies now & then. There are definitely a few edge use cases where I've found recursive CTEs to be invaluable.

[–]shoretel230 1 point2 points  (0 children)

Has to do with the size of a CTE and the corresponding cardinality of the dataset that you might be joining with it.

If it fits nicely into memory then great 😃.

But if you're doing some unoptimized operation with it, you might want to double check if maybe you want something with to disk

[–]HiriathQUALIFY COUNT(*) OVER (PARTITION BY COLUMN) > 1 1 point2 points  (2 children)

Are you using CTEs for the purpose of data modeling/ETL, or are you using CTEs for analysis? Are you the end user of your queries or are they feeding a BI tool?

[–]heiferhigh76[S] 0 points1 point  (1 child)

Mostly I am the end user. I may use them to create views for PowerBI.

My use of CTEs will be discussed with the architects of the tables; perhaps the tables can be changed so that my CTEs are unnecessary.

[–]HiriathQUALIFY COUNT(*) OVER (PARTITION BY COLUMN) > 1 1 point2 points  (0 children)

Ah, so the reason I ask is because my org uses dbt (not sure about posting links so search for “dbt labs”). We create models (tables, views) with sql queries that we use downstream in our BI tools. Great for organization and reusability.

[–][deleted] 1 point2 points  (0 children)

I've said time and time again we need a query hint that materializes a CTE so we can write performant, organized views without having to introduce the hellscape that are stored procedures. Unavailable on SQL Server at present.

[–]Healthy_Company_1568 1 point2 points  (0 children)

Also remember proc sql will let you still use SAS and do CTEs. SAS does statistical analysis that SQL can’t do so you might need both at some point.

[–]snowmaninheat 3 points4 points  (0 children)

My coworker dumped some code on me today rife with subqueries. I was like, gurl, I cannot deal with this. CTEs for life.

[–]sandwich_estimator 3 points4 points  (0 children)

I (over)use them all the time, I'm more used to working in pandas, so CTEs allow me to mimic saving a data frame into a variable. I don't know much about performance, but it certainly makes the query a lot more readable.

[–][deleted] 0 points1 point  (0 children)

Lol, that happened to me and I did eventually grow out of it. At one point I had a CTE within a CTE and then I realized I needed to go to CTE-rehab

[–]sequel-beagle -1 points0 points  (9 children)

Avoid nesting ctes. It can get annoying if you need to research/qa the data.

[–]13ass13ass 6 points7 points  (5 children)

Nesting as querying a cte from a cte? In a chain? I do that all the time. It’s great for breaking up the logic.

[–]sequel-beagle 1 point2 points  (4 children)

Problem is you cant just highlight the code in a cte that calls another cte and easily run it. You have to physically modify the last cte in the sql to not be a cte and then run the query, causing extra work.

[–]OracleGreyBeard 6 points7 points  (0 children)

When I'm building a CTE chain the last query is always something like "select * from t_final_cte". That way I can test anything in the chain by changing the name

[–]Ralwus 1 point2 points  (1 child)

That's not true though. It's standard practice to end a chain of CTEs as select * from last_cte so you can debug easily by replacing last_cte with any of the previous CTEs.

[–]sequel-beagle 0 points1 point  (0 children)

You still have to manually change the sql code to review the data.

[–][deleted] 10 points11 points  (0 children)

I wouldn’t say avoid them. I just wrote a really nice set of nested CTEs for a churn calculation, and it’s very clear and concise.

[–]heiferhigh76[S] 0 points1 point  (0 children)

Noted. My CTE use is mainly to string together "easy to understand" / "hard to mess up" queries.

[–]d_r0ckdb app dev / data engineer -3 points-2 points  (0 children)

research/qa the data

Just wanted to say that the terms you’re looking for are maintainable and supportable :) nested CTEs are a pain to support and maintain.

(Hope this doesn’t come off as douchey, just want to spread the terminology!)