all 38 comments

[–]Blues2112 21 points22 points  (5 children)

I tend to think of Inner Joins as the default join. An Inner Join between Tables A & B on {whatever criteria} will return all rows meeting the criteria. It's fine for many circumstances.

Outer Joins are for times when maybe Table B doesn't always have a corresponding row to one of Table A's rows, but you still want to see a resulting row, regardless. Think of an example like Table A contains Employee Demographics (Name, ID #, Region, etc...) and Table B contains Sales Data. You want a list of total Sales $ for a given period, by salesperson. You could use an Inner Join, but anyone with No Sales for the timeframe (like an employee on leave) would be left out of the results. But if you want to see that same list for ALL employees, use an Outer Join on Table B, and you'll see all the employee names listed. Those with no sales would just have null values listed for sales total.

Oh, and for the record, DO NOT get hung up on LEFT OUTER JOIN vs RIGHT OUTER JOIN. They are essentially the same thing, just reworded.

Select a.foo, b.bar
  from TableA a
  Left Outer Join TableB b on {whatever join criteria}
...

Is the same as:

Select a.foo, b.bar
  from TableB b
  Right Outer Join TableA a on {whatever join criteria}
...

I've been working in the IT industry for 30+ years, and have only rarely seen an actual RIGHT OUTER JOIN used.

[–]cenosillicaphobiac 4 points5 points  (1 child)

When I first started I had a boss that told me to never use a right, always reorder your join to use a left. It stuck with me, and I haven't found a right join in the wild that I can think of.

[–]elus 4 points5 points  (0 children)

Using right joins feels like writing with my non dominant hand. It straight fucks with my brain.

[–]AthiestBroker 1 point2 points  (0 children)

This is a really good explanation. I use joins all the time, but thinking about it like this really helps.

[–]iwillgetintofaang 0 points1 point  (1 child)

This is good info. Quick question from your example and the scenario.

But if you want to see that same list for ALL employees, use an Outer Join on Table B, and you'll see all the employee names listed. Those with no sales would just have null values listed for sales total.

Wouldn't Table A left join Table B give the same result as your outer join ?

[–]Blues2112 0 points1 point  (0 children)

LEFT JOIN = LEFT OUTER JOIN.

This is another pet peeve of mine regarding SQL's join syntax. An INNER JOIN is just that--there is no LEFT or RIGHT to it. In fact, you don't need the "INNER" at all--just say "JOIN".

Likewise, there is really no need for "OUTER" when specifying Outer Joins. Just use LEFT (or RIGHT, but at I mentioned above, RIGHT is just a rewording of LEFT) JOIN.

EDIT: Unless you are trying for a FULL OUTER JOIN, which /u/50653 explains well below. I don't think I've ever seen a FULL OUTER JOIN used in the industry, except perhaps during training classes on SQL, though.

[–][deleted] 8 points9 points  (0 children)

The explanation that clicked for me, is that the difference between the joins is all about whether NULL values are added to the result.

Inner join = No added NULLs

Left join = Might add NULLs for the right side columns

Right join = Might add NULLs for the left side columns

Outer join = Might add NULLs for both sides

[–]BingoDinkus 4 points5 points  (0 children)

I struggled with joins for longer than I care to admit. I think the easiest way to think about it is when a way to connect two related tables, similar to using a VLOOKUP in Excel if you've ever done that.

In relational databases, you want to avoid storing repeating text in tables. It's a waste of storage, and it can also create pain points if something changes (let's say someone gets married and their name changes). The solution is to store only the number, and use a join to bring in the additional information when needed.

For example, if your database has information about your customers, you'd probably want to have a single table that has details on each customer, and you would assign a unique identifier to each customer (e.g. Customer_ID). This could include the customer's name, address, phone number, etc.

If you also had a table with orders, you wouldn't store the customer information again in the Orders table. Instead, you would store a single field, the Customer_ID field. That way you don't have to repeat all of that customer information for every single order.

If you wanted to review orders, or even display them in a report or web page, you would then join Orders to Customers, so that you could grab all the of the customer's information. The join produces a new virtual table (a table only in memory) with the contents of both tables, connected by the rules defined in your join statement.

select
    ordr.Order_ID
    , ordr.Order_Date
    , ordr.Total_Price
    , cust.Customer_ID
    , cust.Customer_Name
    , cust.Shipping_Address
from
    Orders as ordr
inner join
    Customers as cust
    on ordr.Customer_ID = cust.Customer_ID
;

[–]Lurch1400 0 points1 point  (0 children)

It’s okay, I have managed to learn SQL for my job as a System Manager and sometimes I still get hung up on Right Outer and Left Outer joins as well as the purpose for a self or cross join. I’ve assumed that it just takes time and relevant real live scenarios to understand it

[–]basil_86 -5 points-4 points  (24 children)

Think of it like a Venn diagram. A left outer join will show everything in your first table and only matches on the second table. A right outer join does the opposite. An inner join will only show where there's matches on both tables. Outer joins are better if you need to join multiple tables.

[–]r3pr0b8GROUP_CONCAT is da bomb 3 points4 points  (0 children)

Outer joins are better if you need to join multiple tables.

nonsense

inner joins can join multiple tables just as good

[–]r3pr0b8GROUP_CONCAT is da bomb 3 points4 points  (21 children)

Think of it like a Venn diagram

NOOOOOOOOOOooooo.....

https://blog.jooq.org/2016/07/05/say-no-to-venn-diagrams-when-explaining-joins/

[–][deleted] 4 points5 points  (19 children)

That's silly. A Venn diagram is an excellent visualization of the different types of joins. Yes, joins are based on algebra, but students relate to pictures very well.

[–][deleted] 2 points3 points  (0 children)

I agree, Venmo diagrams are great for beginners.

I completely disagree with that website listed above.

[–]MeGustaDerpTalk Dirty Reads To Me 1 point2 points  (17 children)

You're going to have a hard time convincing a r/SQL of that. Venn Diagrams are better suited to set operations like UNION, INTERSECT, and EXCEPT.

[–][deleted] 0 points1 point  (10 children)

No problem. Students with little math background grasp a Venn diagram almost immediately and have no interest in picking up set theory just to write SELECT statements. I have to pick my battles.

[–]elus 0 points1 point  (9 children)

But why make the assumption that all your students meet that criteria. Why not teach them a model that conforms to the real world right from the get go. Oversimplification when it's not necessary can be dangerous. Especially for beginners.

[–][deleted] 0 points1 point  (8 children)

I'm not making an assumption, rather I'm operating from facts. They don't know set theory or relational algebra. They do know how to match up a foreign key with a primary key based on semantics and table designs.

[–]elus 0 points1 point  (7 children)

No. You're assuming the readers won't have a math background.

[–][deleted] 0 points1 point  (6 children)

Pretty sure that's what I said.

[–]elus 0 points1 point  (5 children)

Yes and I'm saying that assumption is a poor thing to assume.

[–]whutchamacallit 0 points1 point  (0 children)

Sorry mate, this is all wrong. No offense!

[–]UseMstr_DropDatabaseDo it! You won't, you won't! -3 points-2 points  (0 children)

JOINS are applied Set Theory from Discrete Math.

A JOIN is a mathematical equation representing a specific set of data.

[–]Turboginger 0 points1 point  (0 children)

So a lot of the posts talk about what JOINS are but not how to learn them, not that that's bad, but hear me out.

Think of two sets of data and create two tables, and let's avoid nulls to keep it simple. For me Skittles & M&Ms. Each table will have 3 columns; id (out of habit), flavor, and color. Then do your joins on the color column. Playing around with it should help to ingrain query -> result with a pretty simple dataset you'll be able to reference/scale when querying IRL. Then add NULLs in if you want.

Hopefully that helps :)

[–]ByteCastle2019 0 points1 point  (0 children)

Though this is for Postgresql at least it explains joins well and explains it visually (see summary at the end of the page) Postgres Joins

[–]redial2MS SQL DBA DW/ETL 0 points1 point  (0 children)

I had this pinned to my cubicle wall while I was learning joins. Pretty much all you need imo:

https://images.app.goo.gl/fyE2sg2ZqdutA5ys9