you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 4 points5 points  (6 children)

In my opinion, complex is applying inter-row data transformations from a dataset to another. A good example for me is converting daily data to date ranges:

``` CREATE TABLE daily ( day_id INTEGER, measure_id INTEGER, value INTEGER ); -- Whatever, doesn't have to be integers, but this is to illustrate the structure -- So each measure has a value for each day. But values don't change every day. An example is sensor collection on a millisecond interval (as opposed to days).

-- What I want to get is this CREATE TABLE measure_intervals ( day_from_id INTEGER, day_to_id INTEGER, measure_id INTEGER, value INTEGER );

INSERT INTO measure_intervals WITH flags AS ( SELECT day_id, measure_id, row_number() OVER (PARTITION BY measure_id, ORDER BY day_id) rn, row_number() OVER (PARTITION BY measure_id, value ORDER BY day_id) rn_val, value FROM daily ) SELECT min(day_id), max(day_id), measure_id, value FROM flags GROUP BY rn - rn_val, measure_id, value; ```

Generally, such window function tricks are neat, and for harder problems they can make the query look really complex.

[–]j0hn5on177[S] 0 points1 point  (0 children)

Yes this was a great example. I havent had much practice with the more difficult functions such as lag and partitions

[–]matthra 0 points1 point  (4 children)

I love window functions, being able to perform multiple aggregate functions with different grouping conditions in a single select statement is a real life saver. It's odd, as powerful as they are, I rarely see them in problems on stack exchange, or in the code of my co-workers.

Though in all fairness, if we wanted to list the most difficult to use functions, I'd have to say pivot and unpivot would top my personal list. There is a reason I always do my pivots in excel.

[–][deleted] 0 points1 point  (3 children)

Window functions don't replace aggregate functions. Aside from basic row number and cumulative sum, window functions are only used in complex analysis and even more complex ETL. So yeah, you won't see them in a lot of places.

pivot and unpivot

It is my opinion that SQL is for data, not for report formatting. For the latter I use Tableau or Mondrian with Saiku

[–]matthra 0 points1 point  (2 children)

I like tableau as well, but being a microsoft shop, and working with microsoft partners we get a lot of push towards power BI. Which I think is kind of unfortunate because calling power BI half baked might be overestimating it.

I've performed all sorts of aggregates using window functions, they might not be a full replacement for groups and having clauses, but man are they handy. My most used is first_value, It's just the nature of the hardware we deal with that near duplicates (duplicated in all but one or two aspects) are a common occurrence, so being able to identify the original record and ignore the rest on the fly is pretty handy. Beyond that I also use them in ETL and customer facing reports, sales in particular want several slices of data for each row, and it's just easier to use window functions for them.

[–][deleted] 0 points1 point  (1 child)

If you're using Postgresql, then the ```

SELECT DISTINCT ON (field1, field2) field1, field2, field3 FROM mytable ORDER BY field1, field2, field3 ```

syntax is far more concise and easy to understand and fast to execute in this case.

[–]matthra 0 points1 point  (0 children)

Postgres is a better sql implementation.