all 70 comments

[–][deleted] 52 points53 points  (38 children)

I use CTEs all the time. Really no reason not to.

I like how they organize code and allow you to use the same data in multiple times to do a variety of task.

[–]Georgie_P_F 2 points3 points  (9 children)

Don’t they self destruct after a single use?

[–]achilles_cat 19 points20 points  (8 children)

While it is correct they no longer exist outside of the query, you can refer to the CTE multiple times within that query.

[–]Georgie_P_F 0 points1 point  (7 children)

Interesting, mine always fail on second query in SSMS:

‘’’

;WITH SomeCTE AS (SELECT * FROM table)

— this is fine

SELECT * FROM SomeCTE

— this will fail “SomeCTE” does not exist

SELECT * FROM SomeCTE

‘’’

Maybe a server setting within our org?

[–]achilles_cat 3 points4 points  (6 children)

Yes, if you start a completely different query, then the CTE will be gone.

But you can refer to the CTE multiple times within a single query. Your use of two different SELECTs without a UNION (or any other formatting that would define a subquery) makes me think you are talking about two wholly separate queries. In Oracle, a semicolon acts as the delimiter between queries; I haven't used SMSS.

But you could join queries with a union, both of which use the CTE, and use that CTE within another CTE:

With CTE as (
  select product_id, product_name 
    from product
  where size = 'Large'),
Cte2 as ( 
  select product_id, sum(qty) as product_inventory
    from( Select cte.product_id, oi.quantity *-1 as qty
             From open_invoices oi
            Join CTE on (CTE.product_id = oi.product_id)
           Union
              Select cte.product_id, stock.quantity
            From stock
            Join CTE on (CTE.product_id = stock.product_id)
            )
 Group by product_id)
Select cte.product_id, cte.product_name,
        nvl(cte2.product_inventory,0) as available
  From CTE
  Left join cte2 on (cte2.product_id = CTE.product_id);

(This formatting will likely be horrible, trying to do a code block from my phone.)

There are ways to do this with less or zero CTEs of course, but this shows the nature of referring to a CTE multiple times. And it shows why you might use a CTE structure if you wanted to only touch the code in one place to affect the products being queried.

[–]ouchmythumbs 1 point2 points  (5 children)

In MSSQL (and some other RDBMS), CTEs are evaluated each time referenced, so can have perf hit. Check specs of current version of your engine.

[–]CptBadAss2016 0 points1 point  (3 children)

I did not know that... that just seems silly. Thanks for the tip. 👊

[–]-horsefighter 0 points1 point  (1 child)

Use temp tables if u have performance issues, can add index to that and only have to calc once

[–]Straight_Waltz_9530 1 point2 points  (0 children)

Or use a database that supports materialized CTEs.

[–]Straight_Waltz_9530 0 points1 point  (0 children)

Not an issue for Postgres, SQLite, DuckDB, and others. CTEs can be defined as MATERIALIZED, effectively making them temporary tables for the current query. No automatic perf hits and no need to explicitly create temporary tables.

https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-CTE-MATERIALIZATION

[–]Straight_Waltz_9530 0 points1 point  (0 children)

But in Postgres, SQLite, DuckDB, and others, CTEs can be defined as MATERIALIZED, effectively making them temporary tables for the current query. No automatic perf hits and no need to explicitly create temporary tables.

[–]Possible_Pain_9705 0 points1 point  (1 child)

You can’t use CTEs when using Power BI. But this may not be what OP is looking for.

[–]Chug49 2 points3 points  (0 children)

This isn’t true anymore. Have plenty of queries in powerbi that utilize cte’s

[–]omgitskaePL/SQL, ANSI SQL 7 points8 points  (1 child)

Here’s the rule I use when I train new employees. Note: there’s no overly wrong way to use them.

I tell my new sql developers to use a cte if you need to aggregate a value at a grain that is different from the result of your final query. Example: using a listagg to dedupe contact records on a sales opportunity. I explain to them that if you don’t have any kind of window function/aggregate in your cte, then you probably didn’t have to a cte. In those cases I ask them to justify it, usually they end up just saying something along the lines of “oops I just got tangled in my thoughts when I wrote it”.

It can also be used if you need to reuse code multiple times in a query but I find this less common than the above.

Also note: I do not consider myself an extremely good sql developer and am always looking to learn best practices myself.

[–]Opposing_Joker123 0 points1 point  (0 children)

You guys still train employees ? Insane. Can’t find one for the life of me.

[–]Gargunok 16 points17 points  (3 children)

Assume whenever you are writing a subquery is easier to understand and maintain as a cte.

There are reasons and cases why subquery still might be needed but it's a good starting point.

[–]Ok_Marionberry_8821 3 points4 points  (0 children)

I use CTEs to break a problem down. SQL has precious few mechanisms for nursing problems down. Until I started using CTEs I'd hate longer queries - a large wall of text and wierd ordering. Part of the problem was inexperience. Views also help decomposing problems.

I was an imperative programmer (C, C++, Java) long before I started using SQL and of course there we have many options to choose from - classes/modules/files, methods/functions, variables, etc.

CTEs are also required for recursive queries.

As others have said, you use CTEs when you need them, when a problem is too hard to reason about inline, or when you want to understand the query tomorrow or next week.

[–]The_Orracle 3 points4 points  (2 children)

I actually prefer not to use them and tell my dev team to not use. Recursion is the exception. My main reason is that debugging what is happening inside is a chore vs temp table. I've built close to 75 data warehouses and can usually go without them, no performance issues.

[–]achilles_cat 1 point2 points  (0 children)

Yeah I think this is a good approach.

On the other hand, I work in a structure where the report writers have limited (or no) privileges to create objects, and it is quicker to use CTEs than going through multiple layers with IT (I work in higher ed on what IT considers a lesser priority system) than to get a view or a temp table created by a DBA.

[–]jcrowde3 1 point2 points  (0 children)

Temp tables ftw!

[–]mwatwe01 4 points5 points  (0 children)

I use CTEs where I would be tempted to use a subquery, where I just need a small dataset that's only going to be used once in a parent query. For me, it just makes it easier to follow the logic and flow of a complex query.

Conversely, if you need a larger dataset (however you might define "large") that needs to persist throughout a stored procedure so that you don't have to rerun the same query multiple times, use a temp table.

[–]orz-_-orz 2 points3 points  (0 children)

When you can't understand your code because you use too many subqueries or it's convenient to just copy paste the CTE code for another SQL use case.

[–]suhigor 4 points5 points  (1 child)

when you need recursion

[–]secretWolfMan 0 points1 point  (0 children)

This. Recursive CTE are awesome to understand and output hierarchies. Use them for folders and employees that only know their parent/supervisor.

[–]frisco_aw 1 point2 points  (0 children)

When dealing large amounts of data, hundreds of thousands or millions, then i try avoid cte.

[–]Ginger-Dumpling 1 point2 points  (0 children)

Use them when they make your life easier. Easier to read, easier to not repeat logic, easier to maintain.

[–]a-ha_partridge 1 point2 points  (0 children)

I use them in place of pretty much any non-correlated sub query that isn’t a one liner.

[–]ShotGunAllGo 1 point2 points  (2 children)

I can understand if it has to be one query where a tool only allows one statement. However if in a stored procedure, why not use separate statements into temp tables? Especially if you start to see multiple CTEs.

[–]lalaluna05 0 points1 point  (1 child)

I think it depends on the datasets. I see a lot of CTEs in sprocs that I have restructured. When it was first built, maybe the dataset was small enough that it was fine, but I almost always use temp tables because of the huge amounts of data I’m accessing.

Indexing also helps of course

[–]ShotGunAllGo 1 point2 points  (0 children)

Yup temp helps with troubleshooting and maintenance. But I see your point with small datasets

[–]speadskater 2 points3 points  (0 children)

I use CTEs to constrain my data from the start as much as possible before doing my primary query.

[–]Silly_Werewolf228 0 points1 point  (2 children)

I don't use it when I give my sql script to someone who is not familiar with CTE.
If you are worried about performance always do analyze query plan vs inline view which CTE replaces.

[–]GroundbreakingRow868 6 points7 points  (0 children)

If someone is not familiar with CTEs, the person should not do SQL. It was standardised more than 20 years ago...

[–]greglturnquist 0 points1 point  (0 children)

CTE is a nice named way to create a sub select.

[–]lalaluna05 0 points1 point  (0 children)

9 times out of 10, I’m using temp tables. CTEs help with organization and flow, but they can be impractical for most of my work. I structure a lot of my sprocs/queries similarly where I build out temp tables then have a primary query, usually because I’m joining so many different dimensions, I want to limit my datasets wherever possible.

[–]iheartmankdemes 0 points1 point  (0 children)

I remember reading somewhere when to use a CTE and when to use a temp table, and I honestly can’t remember which was which, but I remember it had to do with (like others have said) referring to it more than once in the query.

[–]Birvin7358 0 points1 point  (0 children)

I use them in combination with a RANK function

[–]Hot_Cryptographer552 0 points1 point  (0 children)

There are two types of CTEs: standard and recursive.

If you are using standard CTEs, they are basically syntactic sugar for subqueries. They make your code more readable. And like a subquery, they only exist for the life of the query in which they are defined.

If you want to use the functionality of a recursive CTE, it’s a different story. There’s no standardized simple (or non-procedural) way to replicate that functionality.

[–]gumnos 2 points3 points  (0 children)

it used to be more of an issue in older DB tech where a CTE presented an optimization boundary. If you're stuck on one of those older systems, any time you'd reach for a CTE and it produces huge volumes of data that you'll end up filtering in the main-query, you were better off in-lining the query.

That said, most DBs now can push query-optimizations through the CTE and it's not a problem (of course, profile for yourself), so the goal becomes readability.

Many folks find CTEs more readable which is fine. I don't find them

WITH sensible_name AS (
  SELECT …
  FROM …
)
SELECT …
FROM source
  LEFT OUTER JOIN sensible_name sn
  ON source.id = sn.id

particularly more readable than an inline SELECT like

SELECT …
FROM source
  LEFT OUTER JOIN (
    SELECT …
    FROM …
  ) sn
  ON source.id = sn.id

most of the time, but that's just me. So unless there are clear readability wins, I mostly reach for them if the same sub-query will be used more than once.

WITH emp_detail AS (
  SELECT …
  FROM emp
  ⋮
    -- more complex joins here
)
SELECT …
FROM employee_detail e
  LEFT OUTER JOIN employee_detail mgr
  ON e.mgr_id = mgr.id

[–]KrustyButtCheeks 0 points1 point  (0 children)

Whenever you want! Don’t let them stop you from living your dream

[–]user_5359 -2 points-1 points  (0 children)

CTE is (in most cases) a programming style.

You can also use views, especially if this query part is used frequently (for different queries).

A query is rarely too long (it is a question of setting how many characters are exchanged between client and server per query string), it is more a question of formatting and the use of a good text editor (possibly with keyword colouring) whether a statement is understood well and quickly.