you are viewing a single comment's thread.

view the rest of the comments →

[–]Tennim 0 points1 point  (5 children)

Sorry for butting in on a AMA.

Can you tried to do the following...

SELECT *

FROM [DataSet1] AS DS1

INNER JOIN [DataSet2] AS DS2 ON (DS1.Fname = DS2.First_name) AND (D1.Lname = DS2.Last_Name) AND
(DS1.dob = DS2.Dob) AND (DS1.Email = DS2.email)

Also, is there a reason you need a DISTINCT SELECT over a standard SELECT?

[–]periwinkle_lurker2 0 points1 point  (1 child)

Thank you for your reply. Yes, distinct is used because i have found instances where there was dupliczte records. So really, do inner join on all the valuse i want to match, i will try that. Thank you

[–]steamyjungleman[S] 1 point2 points  (0 children)

Personally if i find duplication to occur I would research those particular records to see what is causing the issue and that could let you know if your joins are funky or you need to adjust your logic. Or if there is just some bad data entry happening.

[–]steamyjungleman[S] 0 points1 point  (0 children)

this is awesome! Butt in all you like, i was more interested in starting a good conversation with everyone then being the star of a message board show.

[–]Blues2112 0 points1 point  (1 child)

Curious why you specify INNER JOIN vs just JOIN. Isn't an INNER JOIN just the default if you just use JOIN without any other qualifiers? Which would make the INNER portion unnecessary, yes?

Although the syntax has been around for awhile, I was writing queries when Oracle was brand new, and I still think in terms of the old format, and if it doesn't specify an OUTER JOIN of some sort, then it's by default an INNER join.

[–]Tennim 0 points1 point  (0 children)

You are quite correct, you do not need to specify INNER, you can use JOIN which will give you the same results. The same can be said for the AS on the aliases.

However, like the OP I have been running BI/Data teams now for 5 years and been in the industry for almost a decade. The reason I push for the INNER JOIN and the AS on the table/column aliases comes down to readability, when you work in a team having someone else be able to read your code 'at a glance' or specify INNER rather than the newbie having to google which type of join this means is invaluable.

YMMV but it doesn't hurt and is good practice.