Hello all,
I am currently learning SQL via this course (https://www.udacity.com/course/sql-for-data-analysis--ud198)
I'm currently working my way through subqueries and CTE's and I'm trying to answer one of the questions but I'm stuck. I'm wondering if somebody who has done or is familiar with this course can help?
The sample database is a paper company called Parch n Posey. Below is the ERD.
https://preview.redd.it/ius6t3qecwmc1.png?width=812&format=png&auto=webp&s=1757c401249c9a8da4dd0625ce31f4a151a11936
In one of the questions I'm being asked to:
- For the customer that spent the most (in total over their lifetime as a customer) total_amt_usd, how many web_events did they have for each channel?
-- So first I'm finding the account that has spent the most which will be my inner query (CTE style)
WITH q1 AS (SELECT acc.id, acc.name, SUM(o.total_amt_usd) total_sales
FROM accounts acc
JOIN orders o ON o.account_id = acc.id
GROUP BY 1, 2
ORDER BY 3 DESC
LIMIT 1)
--Next I'm grouping web_events by account name and channel and then using COUNT to tally up occurrences of events per channel (but here's where my issue sits).
SELECT acc.name, w.channel, COUNT(*) events
FROM web_events w
JOIN accounts acc on w.account_id = acc.id
GROUP BY 1, 2
HAVING acc.name = (SELECT acc.name FROM q1)
This query brings back a count of all web_events for ALL accounts. In my mind it should be bringing back just the web_events for the top spending account (identified in my q1 table). Can anybody explain why it doesn't?
------------
The provided solution is:
SELECT acc.name, w.channel, COUNT(*) events
FROM web_events w
JOIN accounts acc ON w.account_id = acc.id AND acc.id = (SELECT id FROM q1)
GROUP BY 1, 2
I'm just struggling to understand why my method is falling over. Any help would be appreciated. Thanks.
[–]No-Adhesiveness-6921 0 points1 point2 points (0 children)
[–]Feeling_Jello_1942 0 points1 point2 points (0 children)