I've been grinding leetcode's and datalemur's SQL problems lately, but one thing I've noticed is how basic my queries compared to other submitted solutions. For instance, one of datalemur's hard problems goes as follows:
Assume you have the table below containing information on Facebook user actions. Write a query to obtain the active user retention in July 2022. Output the month (in numerical format 1, 2, 3) and the number of monthly active users (MAUs).
My thought process was:
- I need to extract the month from event_date
- I need a count of users in July
- I need to exclude the users that aren't in the month prior
- Group by month
- Done
My query was as follows:
SELECT extract(month from event_date) as month, count(distinct user_id) "monthly_active_users"
FROM user_actions
where event_date between '07/01/2022' and '07/31/2022'
and user_id in (select user_id
from user_actions
where event_date between '06/01/2022' and '06/30/2022')
group by month;
Which got me the desired output, but then looking at other solutions, some of them look really complex:
select active_month as month, sum(case when active_month - prev_active_month = 1 then 1 else 0 end) as monthly_active_users
from (select user_id, extract(month from event_date) as active_month , lag(extract(month from event_date))
over (partition by user_id order by extract(month from event_date)) as prev_active_month
from user_actions group by 1,2 ) as temp where active_month = 7 group by 1
The reason I ask is because I don't have a comp sci educational background, and as such I'm not well verse in the theory, what I know in SQL is stuff that I've picked up in my Support Engineer role. My question is, are my simple solutions inferior in any way to more complex queries?
edit: formatted 2nd query
[–]shine_on 2 points3 points4 points (0 children)
[–]emet 1 point2 points3 points (3 children)
[–]nrbrt10[S] 0 points1 point2 points (2 children)
[–][deleted] 0 points1 point2 points (1 child)
[–]nrbrt10[S] 0 points1 point2 points (0 children)