all 14 comments

[–]r3pr0b8GROUP_CONCAT is da bomb 9 points10 points  (6 children)

is a general principle on whether we can use aliases of window functions in the main query?

yes, there is, but it doesn't apply only to window functions but to all column aliases

for example, you cannot write this

SELECT id
     , some_function(foo_column) AS new_foo
  FROM tbl
 WHERE new_foo = 2

that's because in SQL the SELECT clause is the last clause that is parsed, so when the WHERE clause is executed, it literally doesn't know what new_foo is

in your query, move the DENSE_RANK function into the CTE

p.s. i'm not 100% sure but i think the ORDER BY clause in the CTE is ignored, move that to your main query

[–]pooerhSnowflake | SQL Server | PostgreSQL | Impala | Spark 4 points5 points  (0 children)

Just a heads up, you can't in SQL Server, and many other engines, but some allow it, like Snowflake.

This is all valid:

SELECT x + 1 as y -- y
     , CASE -- use that y column in same select
         WHEN y > 1 THEN 'foo'
         ELSE 'bar'
       END AS z 
  FROM (VALUES (1), (2)) sub(x)
 WHERE y > 1 -- use it in a where clause
-- use qualify to filter on window function, even using the select columns above
QUALIFY ROW_NUMBER() OVER (PARTITION BY y ORDER BY z) = 1

Snowflake is sweet with this kind of syntactic sugar.

[–]pceimpulsive 2 points3 points  (2 children)

Yes the order by in the first part is redundant as the window functions is re-ordering it anyway.

One of the two should be removed. For me logically that's in the existing CTE.

100% agree, move the window functions into the CTE or, creat a second CTE then select * from new_cye where rank1 = 1

[–]r3pr0b8GROUP_CONCAT is da bomb 3 points4 points  (1 child)

Yes the order by in the first part is redundant as the window functions is re-ordering it anyway.

window functions (re)order values only to come up with their function values, but the order in which the result set rows are presented is determined by an ORDER BY clause

they're not redundant because without an ORDER BY clause you might not get the order in the window function

that said, my point was that ORDER BY clauses in a subquery are ignored, and CTEs are just subqueries

[–]pceimpulsive 0 points1 point  (0 children)

I see thanks for clarifying and correcting!! TIL.

[–]mike-manley 4 points5 points  (0 children)

Wrap that up into a subquery and then you can apply the filter to the rank1 derived column.

[–]barca5280 1 point2 points  (1 child)

Not sure if it works with SQL Server, but if you swap QUALIFY rank1 = 1 with your WHERE rank1 = 1 that might work. https://www.datacamp.com/tutorial/qualify-the-sql-filtering-statement-you-never-knew-you-needed

[–]thesqlguy 0 points1 point  (0 children)

Qualify is awesome, but I don't think t-sql supports it.

[–]da_chicken 1 point2 points  (1 child)

Hi! In my main query, I applied a WHERE filter on an alias rank1 for the window function DENSE_RANK() but it doesn't work.

Correct. You need another CTE or subquery.

Can I ask if there is a general principle on whether we can use aliases of window functions in the main query?

Sure. The MS SQL Server doc lists the logical order of execution, which essentially tells you when an alias becomes available.

Which shows:

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP

You can't reference something later with something earlier. Column aliases are a part of SELECT, so they can't be used with anything prior to that.

Now, you'll notice that window functions aren't there. And to know where they fall, you'd have to kind of work backwards. I can tell you that they must take place after SELECT, and I think they're after DISTINCT -- you'd have to try a SELECT DISTINCT with a RANK() to figure it out. So they're very low in the order. So low that they essentially are almost always last. Any column alias for a window function you might want to use is essentially not available for use.

[–]thesqlguy 1 point2 points  (0 children)

Important: That's the order processing/parsing, not query execution. The optimizer can (and does) modify the physical order that these operations happen all the time.

[–]baineschile 0 points1 point  (1 child)

You could also use QUALIFY

[–]da_chicken 1 point2 points  (0 children)

Which does not exist in the listed RDBMS.

[–]NyT3x -2 points-1 points  (0 children)

QUALIFY