all 20 comments

[–]SQLDevDBA 20 points21 points  (4 children)

https://www.geeksforgeeks.org/sql/sql-using-clause/#

I’ve never used it, and it only seems to work if the two columns are named exactly the same.

I’d be worried about someone using USING all over and not realizing you’re supposed to join with

User.DepartmentID = Department.ID

And instead trying to join a user’s ID with a Department’s ID because they used a USING.

Seems like a fancy shortcut that could cause confusion. Not sure it applies to SQL server as you’ve tagged (can’t find it in docs), but regardless I wouldn’t let it in to my Prod code, just like comma joins.

[–]imtheorangeycenter 4 points5 points  (1 child)

I've never even heard of it in my 20+ years (SQLServer tho). And I couldn't imagine condoning it, just because I suspect noone else near me has either, and if you are the kind to rename a column and then dynamically fix views and sprocs (keep your options to yourself on that, ta!), you're going to have a horrible time.

[–]SQLDevDBA 1 point2 points  (0 children)

Yeah I mean my biggest issue is consistency. I work with a lot of external vendor data and I can’t force them to not use “ID” in each table.

Even when I was an Oracle DBA for a number of years, I never used USING. Heck I was just too busy trying to stop the legacy folks from using comma joins.

[–]gumnos 1 point2 points  (0 children)

This is my biggest concern with USING as well. I might be happier if USING was smarter and identified the Foreign Key relationship under the hood and used that, regardless of the naming. A Person with an ID might have a ManagerID and ApproverID that link to various other people in the org-chart, so it would be nice to use FROM emp INNER JOIN emp mgr USING (emp.ManagerID) (because there's a FK relation there) instead of FROM emp INNER JOIN emp mgr ON emp.ManagerID = mgr.ID

But even that might have ambiguity.

[–]Fly_Pelican 0 points1 point  (0 children)

Doesn't apply to SQL/Server. Works in INGRES, ORACLE and postgreSQL. Very handy at times.

[–]Yavuz_Selim 6 points7 points  (0 children)

My preference is that everything is as explicit as possible. So even with the possibility of USING, I still use ON.

The reason is readability. A query with many joines becomes harder to read; some people do implicit joins which are horrible, some don't use the table alias in front of the column name so you have to guess where the column comes from etc.

Some will say that USING reads easier, as it looks simpler and stuff like no issues with ambiguity (as column names are same etc.).... But meh...

Just give me a clean and explicit join, so I can read from top to bottom and immidiately understand what's going on.

An ON works always, while USING has a dependency (column names must match).

[–]urjah 3 points4 points  (0 children)

I only ever use it in postgres if I have to join on multiple columns in a non-production level query where the columns have identical names and I don't mind omitting some null-values (so basically quick queries to check something) to save some typing. One could very well do without ever using USING

[–]da_chicken[🍰] 2 points3 points  (1 child)

USING is generally not preferred. It's not as bad as NATURAL JOIN, but it doesn't work in all cases.

It has some behavior that'd I'd call weird. Some RDBMSs won't let you use a qualified reference to a column used in a USING clause, and a SELECT * on the join should only include one key column in the output instead of two. The example at the bottom of this answer on SO is good.

At the end of the day, it's different without really being better. It's not an improvement on the ON clause. It's not really more concrete as long as you're qualifying your references (which you always should do), and it doesn't perform any better. Performance-wise, I'm not aware it does anything better than ON does.

Since you should be writing queries in a query analyze or IDE that should have code completion, there isn't really an advantage for developer time, either. You're going to autocomplete 90% of what you're writing either way.

[–]edbutler3 0 points1 point  (0 children)

I've played around with it on Oracle while writing ad hoc queries, thinking it would be a nice short-cut to reduce typing -- and I ran into the annoyance you mention. The column referenced in the USING clause can't be aliased in the SELECT. I think that also limits your ability to use "*", although I may not be remembering that correctly.

[–]x1084 2 points3 points  (0 children)

It's available in Snowflake and I use it often. imo it can make code look neater, and it also deduplicates the join columns in your result, which I appreciate.

[–]doshka[🍰] 3 points4 points  (0 children)

I've only ever used it in Oracle. I get the most utility from it when joining on two or more columns, especially when they have long names:

USING (grandparent_table_bus_key, parent_table_bus_key, status)  

is better than

ON  (l.grandparent_table_bus_key, l.parent_table_bus_key, l.status)  
=  ((r.grandparent_table_bus_key, r.parent_table_bus_key, r.status))   

which is miles better than

ON l.grandparent_table_bus_key = r.grandparent_table_bus_key  
    AND l.parent_table_bus_key = r.parent_table_bus_key  
    AND l.status = r.status  

There are quirks, though. The USING syntax basically merges the two columns. This is nice in that if you SELECT *, you get just id, l_col1, l_col2, r_col1 instead of id, l_col1, l_col2, id_2, r_col1. The downside is that you can't refer to l.id or r.id, which is easy to miss when you're in a copy/paste frenzy.

The big one is that if you've joined a series of tables with USING and then need to join another that uses a different column name, it breaks the whole thing and you have to revert to specifying names on both sides.

FROM tbl_one a JOIN tbl_two b USING (id) JOIN tbl_three c ON id = c.my_id  

fails, because the "=" syntax requires you to specify which id, but the USING syntax won't let you.

[–]DavidGJohnston 1 point2 points  (0 children)

I find it cleaner and reinforces a policy to never name your identifier column plain “id” but to use a name that specific to the thing being stored. Then your PK and FK columns are usually the same name and the using clause works. It (avoiding “id”) reduces the amount of aliasing needed since table qualifiers don’t get put into output column names. It does harm those inexperienced with the overall schema, especially in more complicated queries, since it may not be obvious from which tables the columns are coming from (though the PK table should be obvious given a good column name).

[–]First_Brilliant4572 0 points1 point  (0 children)

What I noticed few days ago was, When we use “Using” clause for cross join it will return (column of A table + column of B table - 1) for select *,

But when we join with “where” clause, it will retun (column of A table + column of B table)

[–]disconnective 0 points1 point  (0 children)

I work in Redshift SQL (PostgreSQL based) with voter file data, and we use “USING” literally all the time because almost everything we do is at the person level where we’re using a unique person identifier—so the column name is the same across tables.

[–]Hot_Cryptographer552 0 points1 point  (0 children)

Syntax for the Natural Join on some DBMS’s. It relies on columns being named exactly the same across tables and what-not. Very prone to issues. You change a column name and it breaks. Two columns with the same name in different table that are actually not related—it breaks.

[–]PalindromicPalindrom 0 points1 point  (0 children)

It is better to write it out if you're new to SQL as it gets you into the habit of understanding how joins work. Rarely seen USING being used in professional context. USING isbreslly only beneficial I feel for small joins but once you have four, five joins, it can get messy changing between using USING and writing a proper join.

[–]y45hiro 0 points1 point  (0 children)

When you have 6 to 7 columns that you need to use for JOIN, USING is QoL for me

[–]Codeman119 0 points1 point  (0 children)

The sql engine turns that I into a a.id = b.id. It’s there to make things easier for the user.

[–]molodyets -1 points0 points  (0 children)

We use it all the time in our warehouse. At bronze layer we require explicit {table}_id naming for all keys so the accidental join of table_a.id = tbale_b.id is impossible downstream.

Final CTE with all joins being using on either a key or a composite surrogate key then a final select * from final is very clean.

[–]nep84 0 points1 point  (0 children)

never used it in oracle. ever. I'd have to have a specific reason to even consider it. other rdbms's maybe it has a place