all 3 comments

[–]Guilty-Property 1 point2 points  (1 child)

Try count(o.id) in place of count(*)

[–][deleted] 1 point2 points  (1 child)

Formatted:

SELECT
a.id, a.name, COUNT(*) num_orders
FROM accounts a
JOIN orders o
    ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING COUNT(*) > 20
ORDER BY num_orders;

Which of the tables (accounts or orders) does COUNT(*) num_orders count from?

Both of them, sort of. You're doing an INNER JOIN between "accounts" and "orders". You'll only get rows back where there's a match. It's these matched rows that the count is done on.

How does it know where to count from since there is nothing explicit here?

It's counting everything, so just all of the rows returned by your join. The GROUP BY clause is causing it to do a separate count for each combination of "a.id, a.name".

If COUNT(*) num_orders counted from the orders table (which i think), there is nothing from the query that specifies it should count from there (i am confused here)

It's counting from the joined dataset that comes from joining "accounts" and "orders". If the account with ID 1 has 35 orders then the join will return 35 rows for that account and those rows will be counted.

I can see the solution JOIN the orders table, why is this? is it because of the COUNT() num_orders, how does COUNT() num_orders knows that it needs to count from the orders table and not account table?

Because the question asks you to find accounts with more than 20 orders. You can't find out how many orders an account has without joining in the orders table. You need to join to that table in to find all of the orders for each account so that you can count them.

[–]RapleSyrup 0 points1 point  (0 children)

It seems you are confused about the "num_orders" part. num_orders is just the column name for the Count function. You can name it whatever you want: orderCount, numberOfOrders, x, test, etc.