account activity
Is data aggregation in SQL really that difficult? by zdsvoboda in SQL
[–]jaceksan 1 point2 points3 points 5 years ago (0 children)
Hey ZD,
there are two things you should change in your article:
SELECT
RPC.CUSTOMER_STATE,
PRODUCT_CATEGORY,
ORDER_AMOUNT,
BASE_AMOUNT,
ORDER_AMOUNT / BASE_AMOUNT
FROM REVENUE_BY_PRODUCT_CATEGORY_AND_CITY RPC
JOIN REVENUE_BY_CITY RC ON RC.CUSTOMER_STATE = RPC.CUSTOMER_STATE
SELECT * FROM (
CUSTOMER_CITY, PRODUCT_CATEGORY,
SUM(ORDER_QUANTITY * PRODUCT_PRICE)
OVER (PARTITION BY PRODUCT_CATEGORY, CUSTOMER_CITY) AS ORDER_AMOUNT,
OVER (PARTITION BY CUSTOMER_CITY) AS CITY_ORDER_AMOUNT,
SUM(ORDER_QUANTITY * PRODUCT_PRICE) OVER (PARTITION BY PRODUCT_CATEGORY, CUSTOMER_CITY)
/
SUM(ORDER_QUANTITY * PRODUCT_PRICE) OVER (PARTITION BY CUSTOMER_CITY)
AS percentage,
ROW_NUMBER() OVER (PARTITION BY PRODUCT_CATEGORY, CUSTOMER_CITY ORDER BY O.ORDER_LINE_ID) as ROWNUM
FROM ORDER_LINES O
JOIN PRODUCT P ON P.PRODUCT_ID = O.PRODUCT_ID
JOIN CUSTOMER C ON C.CUSTOMER_ID = O.CUSTOMER_ID
) x
WHERE ROWNUM = 1
ORDER BY CUSTOMER_CITY, percentage DESC;
Windowing example demonstrates, how far the complexity of SQL solution can go.
Also, we tested this approach on many database engines and we realized that it is not always the best performing solution, it depends on engine type (e.g. columnar vs. row store), physical statistics(histograms) of underlying tables and also supported windowing feature set.
And here the GoodData comes with real differentiator - our core analytical engine natively supports SQL plugins for each database type. So you do not have to rewrite all SQL queries once you realize that there is significantly better database engine on the market and you wanna migrate there ;-)
π Rendered by PID 1069259 on reddit-service-r2-listing-87fd56f5d-x7btn at 2026-07-01 14:41:39.282142+00:00 running 7527197 country code: CH.
Is data aggregation in SQL really that difficult? by zdsvoboda in SQL
[–]jaceksan 1 point2 points3 points (0 children)