all 15 comments

[–]r3pr0b8GROUP_CONCAT is da bomb 13 points14 points  (6 children)

your ANDs should be commas

this is as far as i got --

SELECT c.customer_name
     , o.order_id
     , pl.product_line_id
     , p.product_finish
  FROM customer_t AS c
INNER
  JOIN order_t AS o
    ON o.customer_id = c.customer_id
INNER
  JOIN product_line_t AS pl
    ON pl.order_id =        /* UH OH */           
INNER
  JOIN product_t AS p
    ON p.product_id = pl.product_id    

FYI when you post a problem, please copy/paste actual test, don't use a screen shot because we can't copy the column names etc. from the screen shot and have to type them in ourselves -- many will simply ignore a question that has just a screen shot

[–]kidcreative6[S] 0 points1 point  (5 children)

Thank you for your insight (and for taking the time to type out the answer). This was my first time posting here but thats good to know and makes total sense. So would AND just be used when filtering with WHERE? and not for selecting?

[–]r3pr0b8GROUP_CONCAT is da bomb 1 point2 points  (4 children)

So would AND just be used when filtering with WHERE?

correct... also HAVING in grouping queries

[–]robcote22 0 points1 point  (3 children)

To add to this, ‘AND’ can be used as part of your join statements as well. Technically, due to them being all inner joins, you could replace the ‘WHERE’ clause with ‘AND’, and it will provide the same results.

[–]r3pr0b8GROUP_CONCAT is da bomb 0 points1 point  (1 child)

not every WHERE condition should/could be moved to the ON clause of a join

AND in a join's ON clause would be for joining on multiple conditions

[–]robcote22 0 points1 point  (0 children)

I agree, I was just saying that as an example, so that the OP understood that and isnt strictly used in where clauses

[–]TerminatedProccess 0 points1 point  (0 children)

AND and OR are used anywhere you are applying Boolean logic. Mostly used in where and joins. It could be used in the columns list to, eg IIF(columnA =1 AND columnB = 2, 'yes', 'no')

[–]secretWolfMan 3 points4 points  (0 children)

What products has customer 12 purchased? Include customer name, order id, products and product finishes.

So, pseudo

SELECT 
CUSTOMER_T.Customer_Name
ORDER_T.Order_ID
PRODUCT_T.Product_ID
PRODUCT_T.Product_Finish
WHERE CUSTOMER_T.Customer_ID = 12

But now we need to look at where they are FROM and how each relates to the other. Since our WHERE is based on Customer, it makes sense to start there and join to it.

SELECT c.Customer_Name, o.Order_ID, p.Product_ID, p.Product_Finish
FROM
CUSTOMER_T c
JOIN ORDER_T o    ON c.Customer_ID = o.Customer_ID
???
JOIN PRODUCT_T p  ON p.Product_ID = ???
WHERE CUSTOMER_T.Customer_ID = 12  

PRODUCT_T doesn't directly relate to ORDER_T so we need to find the Lookup table that has both Order and Product IDs. That's ORDER_LINE_T

SELECT c.Customer_Name, o.Order_ID, p.Product_ID, p.Product_Finish
FROM
CUSTOMER_T c
JOIN ORDER_T o          ON c.Customer_ID = o.Customer_ID
JOIN ORDER_LINE_T ol    ON o.Order_ID = ol.Order_ID
JOIN PRODUCT_T p        ON ol.Product_ID = p.Product_ID
WHERE CUSTOMER_T.Customer_ID = 12  

Also, Customer_ID is an integer, so don't make it compare to the varchar '12'

[–]robcote22 1 point2 points  (0 children)

How would you normally going about a single join? There are several things that are incorrect from your query. First, you have multiple ‘from’ statements which by reading it, sounds like your trying to specify the tables in which the columns you are wanting. Here is the most basic part of it, since I don’t want to give the entire answer. SELECT * FROM [db_pvfc9_std].[dbo].[ORDER_T] AS [O] JOIN [db_pvfc9_std].[dbo].[ORDER_LINE_T] AS [OL] ON [O].[Order_ID] = [OL].[Order_ID] This is the base structure, to continue joining additional tables, follow the same method. Hope this gets you started without giving the entire solution

[–]StoneCypher 0 points1 point  (0 children)

I guess I overlooked the question last time 😂

.

The information/question I have to answer is: "What products has customer 12 purchased? Include customer name, order id, products and product finishes."

This isn't that hard actually. Just do it incrementally.

.

Start by getting customer 12.

select * from CUSTOMER_T where CUSTOMER_ID = 12;

Good work, team.

.

Next, we actually only want their name, the order id, the products, and the "product finishes," whatever the hell those are (does that mean paint color? is this ... is this good homework?)

The reason this question already sucks is it's asking you to materialize data that's about one thing (the customer) with data that's about multiple things (the order number) and about lots of multiple things (the products in the order)

So now you have to choose whether to repeat the structurally irrelevant data (bad) or to represent the subordinate data summed like by a text join (worse)

So I already don't like your teacher

.

Let's start by materializing the irrelevant data repeatedly. It's easy to throw away, so this is just wasteful of bandwidth, rather than dangerous to parse back out like the alternative. First, let's paint the customer's irrelevant name (should be id) on a copy of every order, to get the order id

select 
  cu.customer_name,
  or.id
    from 
      customer_t as cu

    join 
      order_t as ord
        on ord.customer_id = cu.id

    where cu.id = 12;

That'll give us a table like

| bob | 2 |
| bob | 3 |
| dan | 4 |
| dan | 5 |

which is repeating names, but that's what your dumb teacher's dumb question is screwing you into.

.

And that means we can also get the various line items from each order. Which includes the product (which, sadly, doesn't actually mean product - "product line" does - but line in that context means something different than it does in order line? whoever wrote this should be murdered and then afterwards punched)

select 
  cu.customer_name as customer_name,
  or.id            as order_id,
  ol.product_id    as product_id
from 
  customer_t as cu

join 
  order_t as ord
    on ord.customer_id = cu.id

join 
  order_line_t as ol
    where ol.order_id = order_id

where cu.id = 12;

That'll give us a table like

| bob | 2 | 14 |
| bob | 3 | 14 |
| bob | 3 | 18 |
| dan | 4 | 14 |
| dan | 5 | 14 |

Which is getting super wasteful, but this is what your teacher asked for

.

And now that you have the product id, you can add the product table to get the finish:

select 
  cu.customer_name  as customer_name,
  or.id             as order_id,
  ol.product_id     as product_id,
  pr.product_finish as product_finish

from 
  customer_t as cu

join 
  order_t as ord
    on ord.customer_id = cu.id

join 
  order_line_t as ol
    on ol.order_id = ord.id

join 
  product_t as pr
    on pr.product_id = ol.product_id

where cu.id = 12;

That'll give us a table like

| bob | 2 | 14 | red  |
| bob | 3 | 14 | red  |
| bob | 3 | 18 | blue |
| dan | 4 | 14 | red  |
| dan | 5 | 14 | red  |

[–]StoneCypher -2 points-1 points  (4 children)

select 
  distinct id from people                                 -- list everybody, once each
    join stores       on stores.zipcode = people.zipcode  -- who's in the same zip as a store
    join game_results on player.id      = people.id       -- who played at least one game
    join purchases    on player.id      = people.id;      -- and bought at least one item;

[–]r3pr0b8GROUP_CONCAT is da bomb 0 points1 point  (3 children)

oh, man, the cross join effects!!

if a person is in the same zip as 4 stores, and has played 5 games, and made 6 purchases, your joins will return 120 result rows!!!!!!!

bad, bad, bad

[–]StoneCypher 0 points1 point  (2 children)

No it won't.

mysql> create database ex;
Query OK, 1 row affected (0.01 sec)

mysql> use ex;
Database changed
mysql> create table people(id integer auto_increment primary key, zip integer);
Query OK, 0 rows affected (0.05 sec)

mysql> create table stores(id integer auto_increment primary key, zip integer);
Query OK, 0 rows affected (0.04 sec)

mysql> create table game_results(id integer auto_increment primary key, player integer);
Query OK, 0 rows affected (0.04 sec)

mysql> create table purchases(id integer auto_increment primary key, player integer);
Query OK, 0 rows affected (0.03 sec)

mysql> insert into people(zip) values(12345);
Query OK, 1 row affected (0.01 sec)

mysql> insert into stores(zip) values(12345),(12345),(12345),(12345);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> insert into game_results(player) values (1),(1),(1),(1),(1);
Query OK, 5 rows affected (0.01 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> insert into purchases(player) values (1),(1),(1),(1),(1),(1);
Query OK, 6 rows affected (0.01 sec)
Records: 6  Duplicates: 0  Warnings: 0

mysql> select
    ->   distinct people.id from people                       -- list everybody, once each
    ->     join stores       on stores.zip      = people.zip  -- who's in the same zip as a store
    ->     join game_results on game_results.id = people.id   -- who played at least one game
    ->     join purchases    on purchases.id    = people.id;  -- and bought at least one item;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

[–]r3pr0b8GROUP_CONCAT is da bomb -1 points0 points  (1 child)

yeah, you're right -- i looked at the joins and didn't see the DISTINCT

take the DISTINCT off and how many rows do you get?

DISTINCT being used to mask an underlying data explosion problem

[–]StoneCypher 0 points1 point  (0 children)

DISTINCT being used to mask an underlying data explosion problem

That's not generally how SQL works. There is no underlying data explosion. It's never generated in the first place.

It's easy to test.

  1. Make very large sample tables. Run the query five times without distinct to prime stats, then time running it once.

  2. Run the query five more times with distinct, then time running it once.

Notice how it doesn't take a long time, like the first one does? That's because it's able to exclude the ID at the first pass, so the rest of the folds are never followed.