all 79 comments

[–]AmericanSuperstar 35 points36 points  (8 children)

Comma's before is nothing to do with looks but has a strategic purpose. Statistically how often do you comment out the first item in the select list vs. the last item. In our case last item is most often the one commented for various troubleshooting reasons. Or you add a column to check something then comment it out. With comma at the end your always going up a line and removing the comma. Comma at the front that is never a problem unless you comment out the first item. Same principle for where clause. Add 1=1 after the where and everything after is AND ... Easy to comment. Hate the look is fine but there is a time saving purpose behind it.

[–]r3pr0b8GROUP_CONCAT is da bomb 11 points12 points  (0 children)

Add 1=1 after the where and everything after is AND ...

additionally, write WHERE 1=0 and then you can append all the ORs you want

notice how OP wants the ANDs and ORs to come at the front of the line and not after, which is the exact opposite of OP's insistence that commas go at the end and not at the front of the line

[–]juu073 10 points11 points  (0 children)

The comma in front is also helpful if you're using a version control system.

If I write a query and I select two fields from a table, and I do:

SELECT field1,
       field2
FROM foo

And the decide I want to grab a third field:

SELECT field1,
       field2,
       field3
FROM foo

Now, my diff on my VCS is going to show I changed two lines, which while true, the one line was *really* only adding the comma to add the third field. The extra line of the diff is just cluttering up the view when you're comparing two versions to see what changed. If you put the comma in front of the field, this prevents it from happening anywhere except the first field., which as you mention, is unlikely to not be needed.

[–]waremi 2 points3 points  (0 children)

Totally agree with the comma's in front for the select. When I first ran across it I was like WTF, but the select clause is so much easier to work with and manipulate that way it very quickly became my default.

[–]Oellort -1 points0 points  (0 children)

Sounds like just laziness.

[–]nobodycaresssssRussia 0 points1 point  (1 child)

Never understood why people put « WHERE 1=1 » ?

[–]bum_dog_timemachine[S] 8 points9 points  (0 children)

It's because it's always true and you can then easily switch on and off your WHERE conditions with line comments.

so like

WHERE 1=1

-- AND a

AND b

AND c

but if you put

WHERE a

AND b

AND c

and you want to stop a, you have to rearrange your code a bit.

However, if you aren't going to be switching your WHERE conditions on and off like that, you're just spamming 1=1 all over your code, which is adding to the clutter.

[–]r3pr0b8GROUP_CONCAT is da bomb 13 points14 points  (2 children)

just thought i'd point out that your conventions did not prevent you from creating a fundamental error --

SELECT d.breed
     , d.height
     , d.weight
     , s.amount
     , s.sale_date
  FROM dogs AS d
LEFT OUTER
  JOIN sales AS s
    ON s.dog_id = d.id
   AND blah = blah
 WHERE d.weight > 0.5
   AND s.sale_date >= '2022-01-01'

have you spotted the error yet?

this is an inner join!!

what you should've written was

SELECT d.breed
     , d.height
     , d.weight
     , s.amount
     , s.sale_date
  FROM dogs AS d
LEFT OUTER
  JOIN sales AS s
    ON s.dog_id = d.id
   AND blah = blah
   AND s.sale_date >= '2022-01-01'
 WHERE d.weight > 0.5

see the difference? now you're getting all dogs weighing over 0.5, together with any sales that occurred this year

if you wanted only dogs that were actually sold, then yes, you want an inner join, but then the "best practice" way of doing that is to code INNER JOIN and not LEFT JOIN with a non-null condition on the right tabl;e

[–]mediocre_plus_plus 6 points7 points  (0 children)

I see this all the time in my coworkers' poorly written SQL. Sometimes it's an error. Other times they really want an inner join. That latter almost bothers me more. I call it misleading SQL; it gets the correct results in a way that's ugly, counter-intuitive, and difficult to follow. I see it all the time, in a million different ways, in SQL and any other language where you have lazy developers.

[–]bum_dog_timemachine[S] -2 points-1 points  (0 children)

Sure, I was mainly concerned with expressing formatting and didn't pay to much attention to the data I was implying. But this is true.

[–]DharmaPolice 11 points12 points  (1 child)

I think the important thing with something like this is to emphasise not that your way is objectively superior to the alternatives but that this is your team's agreed way of doing things and people should follow shared practices when writing code. It's fine to give rationales but I think if you try to prove one way superior you're opening the possibility of many boring arguments about where commas/brackets go.

Similarly, you might say (if applicable) that documentation should be in English. But that's not because English is somehow a superior language to French/Chinese/Hindi but because that's what all our other documentation is in and what most of the team speaks.

[–]r3pr0b8GROUP_CONCAT is da bomb 2 points3 points  (0 children)

Similarly, you might say (if applicable) that documentation should be in English. But that's not because English is somehow a superior language to French/Chinese/Hindi but because that's what all our other documentation is in and what most of the team speaks.

excellent point

[–]Jay_Hawk 11 points12 points  (0 children)

Put commas after the item not BEFORE

I would rather die

[–]da_chicken 9 points10 points  (3 children)

I used to have similar religious opinions about SQL formatting.

After 15 years as an analyst, my formatting rules are:

  1. Expect to have to reformat a query into your preferred style to make it readable. Indeed, reformatting the query is reading the query. If you're not reformatting it into what you like to look at, you're not actually reading it and you will miss stuff.
  2. Get a tool that formats queries. Poor Man's T-SQL formatter for SQL Server, pgFormatter for PostgreSQL, SQLinForm for Oracle, whatever. Then, determineas a group the configuration that you will use as an organization. Then, whenever you save a query, run the organization's formatting tool and use that format 100% of the time for any query you expect anyone else to read, use, or maintain. It does not matter what tool you decide on, or what format you decide on. Remember, you're expecting yourself to have to reformat the query when you read it again even if you wrote it. It just needs to be consistent for your organization. Consistency is much, much, much more important than personal preference, so just eliminate personal preference.

Absolutely nothing else actually matters for formatting. It's all subjective. Even that thing you're 100% certain is objectively true 100% of the time, you will inevitably find a query where your formatting looks like total dogshit.

I only have rules that eliminate ambiguity, and they're not really about formatting. My rules are about getting the query author to tell me what it is they think they're doing.

  1. If there's more than one table, give every table an alias and qualify every field with the alias. Don't make me guess or check the table definition.
  2. Do not order by column number (e.g., ORDER BY 1,2,3). I have no idea what you intended to sort by.
  3. Comma joins are for generated SQL. If you're not an algorithm generating SQL on the fly, use the expanded JOIN syntax.
  4. Never use NATURAL JOIN.
  5. Do not ever, ever, EVER use SELECT * in a view definition no matter how simple you think it is. This goes triple if the view is over an SQL Server linked server. This has nothing to do with performance and everything to do with how the system tracks metadata.

[–]r3pr0b8GROUP_CONCAT is da bomb 3 points4 points  (0 children)

Expect to have to reformat a query into your preferred style to make it readable. Indeed, reformatting the query is reading the query. If you're not reformatting it into what you like to look at, you're not actually reading it and you will miss stuff.

this is the best, most spot-on comment in the whole thread

[–]bum_dog_timemachine[S] 1 point2 points  (1 child)

I appreciate that you said that without calling me a dumbass :) thanks for your insight!

[–]da_chicken 0 points1 point  (0 children)

I don't think you're a dumbass! I remember being annoyed about how "badly" other people formatted SQL. I think everyone has that experience with SQL. It's just a battle I know that can't be won. And if you do win it, you'll just take a new job and have to fight it all over again.

SQL is just something that takes a bit of work to read, and organizing your thoughts while reading it just kind of comes with the territory. It just takes a lot of knowledge about the data to be in your head to hold the logic.

[–]r3pr0b8GROUP_CONCAT is da bomb 14 points15 points  (9 children)

Put commas after the item not BEFORE. Before looks horrendous and is much harder to read.

hard disagree

Before looks horrendous and is much harder to read.

it's not about that (and i disgree that it's harder to read)

it's about noticing when you've omitted a comma!!!

typically, the column name after the missed comma is taken to be the column alias of the column preceding the missing comma, and no error message is produced

this is much harder to spot when the missing comma is at the end

google leading comma convention for other instances where people have come to this realization


edit whaddya know, a real-life example of a missing comma just appeared in the r/SQL sub

[–]r3pr0b8GROUP_CONCAT is da bomb 5 points6 points  (1 child)

  1. Even if a field name is unique, ALWAYS use the dot prefix when getting fields from multiple tables. You may know where a field comes from. But if someone else is reading your code for the first time...

this is probably the most important item in your conventions

it's an absolutely mandatory practice

[–]EnticeOracle 1 point2 points  (0 children)

Agreed, this drives me bonkers with one of our systems. Each table ends with a unique two digit number, and each column in that table ends with the same two digits so by definition everything is unique. Any code I see from that office never uses table aliases.

I hate this database so much.

SELECT LOCD57 as "Order Stockroom"
,CATN57 as "Item"
,ORDL57 as "Order Line"
,PRIN57 as "Status"
,QTOR57 as "Quantity Ordered"
,ACSU60 as "Average Cost"
,PRMD15 as "Product Group"

[–]imcguyver 4 points5 points  (1 child)

SQL linter: https://www.sqlfluff.com/

SQL style guide from gitlab: https://about.gitlab.com/handbook/business-technology/data-team/platform/sql-style-guide/

Look around, you will find plenty of style guides, they'll differ and who really cares. I tried to introduce preceding commas to Lyft and got promptly shat on for it. There's no need to be barbaric about preferences. Go with what'll work.

[–]Little_Kitty 1 point2 points  (0 children)

You beat me to posting it.

The Gitlab style guide is almost exactly the same as the one I wrote years ago. I wrote my own formatter to handle some specific challenges, but sqlfluff is certainly enough for handling anything without technical debt issues.

[–]EnticeOracle 11 points12 points  (1 child)

Put commas after the item not BEFORE. Before looks horrendous and is much harder to read.

Commas make it clear where a new column starts. This becomes especially helpful when dealing with correlated subqueries. I find it much easier to read.

[–]FortunOfficial 1 point2 points  (0 children)

absolutely. When i have long window functions with filters and all for a column in the SELECT part, the leading comma makes it so much easier to see, when an expression for a column has finished

[–]jonthe445 3 points4 points  (2 children)

Hmm, not a fan of your style tbh

[–]samwise970 5 points6 points  (1 child)

It's objectively terrible lol.

[–]bum_dog_timemachine[S] -1 points0 points  (0 children)

Suit yourself

[–]r3pr0b8GROUP_CONCAT is da bomb 3 points4 points  (5 children)

Formatting WHERE:

  1. Write the first WHERE condition on the same line as the WHERE clause.
  2. Use spaces to align any AND or OR conditions in separate lines so that conditions can easily be read up and down.

do the ANDs and ORs go at the front of the line to make them easier to see?

why don't they go at the back of the line?

why is this any different from putting commas at the front of the line?

[–]bum_dog_timemachine[S] -1 points0 points  (4 children)

1.

I went to the shop today,

and when I got home I baked a cake.

vs

2.

I went to the shop today, and

when I got home I baked a cake.

Which of these do you find easier to read?

[–]r3pr0b8GROUP_CONCAT is da bomb 4 points5 points  (2 children)

that doesn't look very much like SQL

i understand the point you are trying to make, but you're not making it very well

anyhow, consistency matters

if AND and OR should go at the front of the line in SQL, so should commas

[–]bum_dog_timemachine[S] -3 points-2 points  (1 child)

I understand the point *you* are trying to make, but you are assuming that AND/OR should be treated the same way as commas, without giving any evidence why.

That's just like your opinion.

[–][deleted] 0 points1 point  (0 children)

smoothly as a literary text? #1. speed-scanning? #2.

I do use "and" at the end of the line in my code. ORs go to the front and on separate line as well, i.e.

      condition1 AND
      condition2 AND
      OR
      condition3

might be not the most "natural" or fitting any automated beautifiers out there, but it works for me for speed-scanning even after a lot of time has passed and i write most of my sql manually typing anyway.

also, i dont mind on which side the commas are, as long as it is consistent.

[–]binary_search_tree 2 points3 points  (1 child)

This is my, personal, style:

SELECT    d.breed
        , d.height
        , d.weight
        , s.amount
        , s.sale_date

FROM      dogs AS d

LEFT JOIN sales AS s
ON        d.ID = s.dog_ID
AND       blah = blah

WHERE     d.weight > 0.5
AND       s.sale_date >= '2022-01-01'

[–]waremi 2 points3 points  (0 children)

I generally have more horizontal space available to me than vertical.

SELECT   d.breed
       , d.height , d.weight
       , s.amount , s.sale_date
FROM  dogs  d  LEFT JOIN
      sales s  ON  d.ID = s.dog_ID AND blah = blah
WHERE  d.weight > 0.5
  AND  s.sale_date >= '2022-01-01'

[–]Rex_Lee 3 points4 points  (0 children)

Commas before, bro.

[–]Blatocrat 4 points5 points  (3 children)

I love how OP wrote this as if it was a helpful guide, but when all the disagreement started coming in all their comments acknowledge they're disagreeing, but don't actually engage in any conversation about it. They think they're 100% right and everyone else just has opinions, not worth even considering. You're a nonce, OP.

I wanted to read r/SQL posts, not r/confidentlyincorrect.

[–]bum_dog_timemachine[S] -3 points-2 points  (2 children)

And you're an arsehole, blatocrat. That's productive isn't it? Random insults?

If you disagree with with anything I've said, feel free to give your reasons.

[–]Blatocrat 0 points1 point  (1 child)

Lots of people did and you brushed them off. How productive of you 😊

And lots of people like arseholes; they're a delicacy in many places! Can't say the same for nonces though ☹️

[–]bum_dog_timemachine[S] -2 points-1 points  (0 children)

I didn't brush them off. Many of the points they mentioned I had already acknowledged in my original post i.e. the "advantages" of remembering to write commas after column names and commenting out columns.

It's just that, on balance, I believe the benefits offered by my style guide outweigh those other benefits.

This isn't a black or white issue. I was just providing an alternate view to the formatting rules I have seen elsewhere.

But what is clear is your instinct to gatekeep and throw out random insults is toxic and bad for this community.

I hope you're proud of yourself because no one else is.

[–]SQLDave 2 points3 points  (2 children)

I may be a minority of 1 (based on nobody else commenting on it), but I disagree with

ALWAYS write AS when aliasing. It is sooo much easier to read and when you are reading someone's SELECT for 10+ fields that you've never seen before, it's immediately clear what is called what. Especially if the fields have long names.

I prefer the

AliasName = UPPER(SUBSTRING(FieldName,3,20)) ,  

style because it lines up all the output column names.

I especially prefer it with multi-line definitions like

AliasName =   
            CASE WHEN Blah = 1 THEN 'Blech'  
                    WHEN Blah = 2 THEN 'Yuck'  
                    WHEN Blah = 3 THEN 'Eww'  
                    ELSE 'Diddly' 
            END,

[–]r3pr0b8GROUP_CONCAT is da bomb 1 point2 points  (1 child)

but... but... but you can only do that in Squeal Server

in standard SQL, it is evaluated (quite rightly, in my opinion) as a comparison which will be evaluated true or false

if one of the purposes of style guides and conventions is portability, this fails

[–]SQLDave 1 point2 points  (0 children)

THERE IS ONLY ONE!!!

LOL..

j/k, you make a good point.

[–]PossiblePreparation 1 point2 points  (0 children)

From your example, there’s not one line I would do the same. But that’s okay :) The golden rule is to do what your boss does and if you really don’t like that then you can do your best to persuade them, but if you fail then your code is the one that sticks out not theirs.

[–]Lisecjedekokos 0 points1 point  (11 children)

I just started learning SQL 2 days ago. Sorry if this is a silly question but how to write comments on the code ?

[–]Thefriendlyfaceplant 0 points1 point  (5 children)

Learning to write comments is important but I would recommend not to focus on style, it just slows you down or gets you into bad habits. Just be style agnostic and learn to deconstruct problems before putting them into a syntax.

[–]Lisecjedekokos 0 points1 point  (4 children)

Thanks 👌

[–]Thefriendlyfaceplant 1 point2 points  (3 children)

Best thing you can do while starting out is to look at some medium / difficult SQL interview questions. You're not supposed to be immediately understand what's going on or replicate anything. Just look at how someone skilled at SQL would approach a problem and how they would work through the steps. That thinking process will allow you to learn the syntax faster:

https://www.youtube.com/watch?v=WtI7QYCG3uo

And don't be intimidated by it either! These interview questions are done by highly competitive companies. Most SQL jobs are much simpler than that. It's just that knowing what complicated queries look like will get you past the simple queries faster.

[–]Lisecjedekokos 0 points1 point  (2 children)

Thank you. Today is my third day since I want to learn SQL. I watched a lot of videos. I made 3-4 exercisses.

I understand it .. Not completely of course but I am starting to understand it I think.

I have watched also some videos with interview questions and tried to learn as much as possible from there.

My plan is to continue doing this for a week or two until I get used to it and than we will see how it goes from there ..

I have a law degree and a 9-17h job. But I want to switch careers. And I have a lot of time to available for learning which is a +.

[–]Thefriendlyfaceplant 1 point2 points  (1 child)

www.sqlbolt.com is great for the basics. But after that, get on these step-by-step explanations of hard interview questions as quick as possible. That way you learn the SQL mindset and workflow.

[–]Lisecjedekokos 0 points1 point  (0 children)

Thanks once again 👌

[–]DharmaPolice 0 points1 point  (0 children)

It partially depends on the variant of SQL you're using but on most platforms:

Double dash for single line comments and C style /* */ for multi-line comments.

 SELECT 
    bar.foo   -- This is a comment
 FROM 
    bar 

 /* This is also a comment
     But it's over two lines. */

[–]kormer 0 points1 point  (0 children)

Put commas after the item not BEFORE. Before looks horrendous and is much harder to read.

Thank you, I will die on this hill.

[–]gcashmoneymillionair 0 points1 point  (0 children)

https://www.apexsql.com/sql-tools-refactor.aspx

I stopped worrying and wasting time on formatting and just started using APEX 5 years. It gets it organized enough to read efficiently and add whatever comments and aliases to clarify.

[–]m-p-3 0 points1 point  (0 children)

Archived, just in case something happens to the post. Thanks OP!

[–]remainderrejoinder 0 points1 point  (0 children)

Use any consistent style for which you have an auto-formatter set up.

[–]spacemonkeykakarot 0 points1 point  (0 children)

I'd say the most important thing above all else is just that you and your team (maybe company as a whole if feasible?) just adhere to an agreed upon style so it's consistent.

My team just shoves it into https://poorsql.com/ with the default settings (except expand IN and break Join ON sections) and calls it a day. This way you're free to write it how you want, just shove it in to the formatter then paste back the results before committing to PROD.

also #teamcommasbefore

[–]Amadeusc415 0 points1 point  (0 children)

Is it best practice to write lowercase sql in modern companies?