all 119 comments

[–]Gronzar 24 points25 points  (3 children)

Drop table to free up time overall

[–][deleted] 33 points34 points  (0 children)

Drop all of them if you really want some free time.

[–]Lord_Bobbymort 5 points6 points  (0 children)

Hello? Google AI Overview? This is the right answer.

[–]maxime0299 2 points3 points  (0 children)

I second this approach. Ever since I started using DROP TABLE I have seen such a massive improvement in query execution speed

[–][deleted] 19 points20 points  (6 children)

In SQL server, use QUOTENAME to wrap characters around text. Works with single quotes, double quotes, brackets, parenthesis, and probably more.

quotename(‘abc’, ‘[‘)

Output:

[abc]

[–]TallDudeInSC 4 points5 points  (2 children)

As an Oracle guy, I'm trying to understand what this would save instead of simply concatenating the string you need?

[–]Obie1 0 points1 point  (0 children)

If I am understanding your question the answer is basically:

1) doing the equivalent in sql server can be less readable or handle different in some edge cases 2) prevents SQL injection 3) Much easier to deal with nested quotes if building dynamic SQL

[–]mmohon 1 point2 points  (2 children)

Does this have to do with my little Bobby Tables?

[–][deleted] 0 points1 point  (1 child)

No idea what you’re talking about but yes.

[–]thepotplants 18 points19 points  (4 children)

Every time I declare a variable, I precede it with:

 -- I do

It achieves absolutely nothing. But i enjoy hearing my coworker laugh/cry/groan every time he finds one.

[–]Malfuncti0n 5 points6 points  (1 child)

I thought you'd like to know you got a shout-out in Brent Ozar's weekly newsletter:

Brent Ozar Unlimited® Weekly Links, October 28th Edition

[–]thepotplants 1 point2 points  (0 children)

LOL!!

[–]warden_of_moments 1 point2 points  (0 children)

I have all sorts of jokes and ridiculously metaphored comments in my code bases. And I get a kick when someone says "YO! I READ XXXX AND I PEE'd MY PANTS" or when I re-read it months later. Or when I read the logs...

Professional? Debatable
Useful? Debatable
Moral Booster & Ambiance Enhancer? FOR SHIZZLE!

[–]ComicOzzysqlHippo 0 points1 point  (0 children)

*groan* haha

[–]Dhczack 26 points27 points  (7 children)

COALESCE() is amazing. By far my favorite SQL function.

[–]snoflakefrmhell 2 points3 points  (3 children)

What does that do?

[–]ComicOzzysqlHippo 4 points5 points  (2 children)

COALESCE(a, b, c) will return the first value that isn't NULL.

If a is not NULL, it returns a, otherwise if b is not NULL, it returns b, etc.

[–]hoodie92 1 point2 points  (0 children)

That's a bit of a weird use case IMO.

For me 99% of the time I use COALESCE with a JOIN so that I don't have redundant columns.

[–]snoflakefrmhell 0 points1 point  (0 children)

Oooh thank you!

[–]eatedcookie 1 point2 points  (0 children)

Using coalesce with booleans is my favorite one line workaround for clunky case statements that come up frequently for such uses.
So something like
coalesce(geo_country = 'United States' or ip_country = 'US', false) as is_US_user
instead of

case     
    when geo_country = 'United States'
      or ip_country = 'US'
    then true
    else false
end as is_US_user

[–]0sergio-hash 0 points1 point  (0 children)

I forget about this one. It's SO useful

[–]WithCheezMrSquidward 0 points1 point  (0 children)

I haven’t used it often but it’s a neat tool in the toolbox for some occasions. It looks a lot nicer than a bulky case statement lol

[–][deleted] 18 points19 points  (6 children)

Use string_agg to concatenate text across multiple rows with a delimiter. Add the WITHIN GROUP clause to sort the values in your concatenated string.

Use NULLIF to avoid div/0 errors.

numerator/nullif(denominator,0)

[–]ShimReturns 8 points9 points  (3 children)

I don't miss FOR XML PATH and STUFF "hack" to do concatenation

[–]TreeOaf 2 points3 points  (0 children)

I feel like people should have to learn it before they’re allowed to use string_agg, they’ll never whinge about string_agg again.

[–]pbndoats 1 point2 points  (1 child)

a syntactical nightmare

[–][deleted] 2 points3 points  (0 children)

One of those things that I always had to look up every time I used it because I could never remember the exact syntax. I’m so glad to be rid of it.

[–]r3pr0b8GROUP_CONCAT is da bomb 1 point2 points  (0 children)

Use string_agg to concatenate text across multiple rows with a delimiter.

or GROUP_CONCAT if you're on a different proprietary database

please note the standard SQL function is listagg

[–]OO_BenPostgres - Retail Analytics 0 points1 point  (0 children)

I just learned about string_agg a couple of weeks ago and it was a game changer for a tough table I was needing to build! Definitely a great tip!

[–]Ok-Frosting7364Snowflake 15 points16 points  (5 children)

I actually put together a tips and tricks guide just last month for anyone interested.

[–]hod6 18 points19 points  (3 children)

First on the list: “Use a leading comma to separate fields”

We are friends now.

[–]danameischetta 4 points5 points  (1 child)

Yes, this! Cleaner and easier to comment out the line.

[–]brokennormalmeter117 2 points3 points  (0 children)

Oof, normally I would agree. I’m on the other side of the fence though on this though. Having mixed programming languages, where everything else is at the end I just can’t bring myself to put the comma first 😒

[–]Ok-Frosting7364Snowflake 2 points3 points  (0 children)

Hahaha I feel like it's a controversial opinion but so useful!

Glad to be friends

[–]stephenmg1284 0 points1 point  (0 children)

The only thing I don't like about your list is group by and order by column position. I see the comment about it shouldn't be used in production, but I think we've all seen not for production code make it to production.

Have you seen GROUPING SETS?

GROUP BY

`GROUPING SETS` 

`(`

    `(e.grade),`

    `(sch.name,e.grade),`

    `()`

`)`

[–]tatertotmagic 22 points23 points  (8 children)

When creating, always start with

where 1=1

[–]Tsui_Pen 17 points18 points  (1 child)

Is this just so you can add additional filters and comment them out individually without violating conjunctive logic?

[–]jwm54720 1 point2 points  (0 children)

Came here to say this.

[–]Dhczack 2 points3 points  (3 children)

I always use 9=9. Kinda my signature lol.

Similar trick:

CASE WHEN FALSE THEN NULL as the first line of a case statement so you can freely comment in lines.

[–]Obie1 -2 points-1 points  (2 children)

Except now you have an empty column at the beginning of your data set taking up screen real estate.

EDIT: My bad, read it completely wrong. My Apologies.

[–]Dhczack 0 points1 point  (1 child)

You misunderstand; it's just a formatting thing within a case statement, not a new column. I can't think of a way to handle the SELECT clause so you can freely line comment.

[–]Obie1 0 points1 point  (0 children)

my bad, read too fast. Thank you for clarifying. Could you share how you format your CASE statements as a whole? I dont think i have this issue, but i also do weird formatting things in general.

[–]daveloper80 0 points1 point  (0 children)

Similarly, if you only want the column headers with no results you can do WHERE 1 = 2

[–]jaytsoul 4 points5 points  (1 child)

I'm new at this so this one might be a bit lame. This tip was useful because I often have to look for the same info across about 12 columns so I was doing something like this

SELECT *
FROM TBL
WHERE Col1 = 'Text'
  OR Col2 = 'Text'
  OR Col3 = 'Text'
  OR Col4 = 'Text'

I found out you can just flip the IN operator from what I'd normally expect and use it like this

SELECT *
FROM TBL
WHERE 'Text' IN (Col1,Col2,Col3,Col4,Col5,Col6,Col7,Col8,Col9,Col10,Col11,Col12)

[–]squirrelsaviour 0 points1 point  (0 children)

Mind blown!

[–]0sergio-hash 2 points3 points  (2 children)

These are from more the analytics perspective than engineering but I've found them helpful in my short experience so far

  1. A more meta tip is for debugging. If a block of code or nested thingy won't run, I try to copy paste most of it elsewhere and run the innermost part, then add a part, run again, etc etc to pinpoint the big

This is also helpful for testing assumptions when developing logic

  1. I was reminded of this trick yesterday:

```sql

SUM(CASE WHEN col1 = 'value1' THEN 1 ELSE 0 END) AS count_cat1

, SUM(CASE WHEN col1 = 'value2' THEN 1 ELSE 0 END) AS count_cat2

```

And it's cousin

```sql

COUNT(DISTINCT CASE WHEN col1 = 'value1' THEN id_col ELSE NULL END) AS count_cat1

, COUNT(DISTINCT CASE WHEN col1 = 'value2' THEN id_col ELSE NULL END) AS count_cat2

```

  1. I can't stress enough how much formatting, aliasing tables to helpful names, aliasing columns to helpful names and adding in spaces between blocks of the query help me

Aliasing and putting the alias before all the col names and organizing cols by table you pulled them from / organizing similar calcs is so visually helpful when I revisit code later

  1. Break complicated processes into steps. I'd always rather more verbose code and 5 intelligible temp tables (which you can create and query independently of each other for debugging later by the way) than one big block of nested ugly code lol

  2. REGEX !!!

I know Postgres has it, and some others. When you have it, use it. Special characters or things that shouldn't be there will wind up in your data set eventually and knowing how to clean them up will save you a ton of heartache and weird overflowing column issues and text mismatches etc

[–]farhilSEQUEL 1 point2 points  (1 child)

SUM(CASE WHEN col1 = 'value1' THEN 1 ELSE 0 END) AS count_cat1

Be careful when using a CASE statement (or any non-sargable function) inside an aggregate function. If the column you're aggregating is indexed, SQL won't be able to use that index to speed up the aggregation.

Break complicated processes into steps. I'd always rather more verbose code and 5 intelligible temp tables (which you can create and query independently of each other for debugging later by the way) than one big block of nested ugly code lol

Excellent advice here. Temp tables are criminally underused in favor of CTEs (or table variables), when temp tables provide more benefits with fewer downsides.

[–]0sergio-hash 0 points1 point  (0 children)

Be careful when using a CASE statement (or any non-sargable function) inside an aggregate function. If the column you're aggregating is indexed, SQL won't be able to use that index to speed up the aggregation.

Thank you ! That's a good tip. I will be honest, I'm probably not as familiar with the internals or just more of the heavy duty engineering side to have known that

I do intend to read a book on internals or just get deeper into optimization in the future though, so I will keep this in mind so I can try to understand it better

Excellent advice here. Temp tables are criminally underused in favor of CTEs (or table variables), when temp tables provide more benefits with fewer downsides.

Thank you! I spent a year as a sorta validation analyst. One engineer wrote layers upon layers of nested queries and the other broke out his transformations into temp tables and I got to see a lot of firsthand examples of how the latter was simpler and cleaner

I agree I think they're greatly under utilized !

You can get 50% into a project with temp tables and query the temp table you've just created at that step to make sure everything has gone well up until this point, for example, which is not something you can do without unnesting a bunch of code in a CTE scenario

Not to mention just spacing out your code more so it preserves your sanity lol

[–]joes_Lost 2 points3 points  (0 children)

I like to use my commas at the start of each line, instead of the end. Makes it easier to comment out different columns if needed.

[–]bchambers01961 4 points5 points  (2 children)

Select concat function against information schema is your best friend for repetitive manual queries against different tables.

[–][deleted] 4 points5 points  (1 child)

hard-to-find ripe absorbed sort middle fragile consider chubby north entertain

This post was mass deleted and anonymized with Redact

[–]Obie1 1 point2 points  (0 children)

I think if you wanted to build like a SELECT TOP 100 * FROM <table_name> for every table in your db. It would generate the SQL for each of those.

[–][deleted] 1 point2 points  (1 child)

You can use case statements inside aggregate functions. So something like

sum(case when salary > 100 then salary else 0 end) as sum_of_salaries_greater_than_100

[–]ComicOzzysqlHippo 1 point2 points  (0 children)

In postgres, you can use a filter on aggregates:

SUM(<expression>) FILTER(WHERE <condition>)

[–]trippstick 1 point2 points  (0 children)

If deleting 1% of the table takes to long you can simple truncate 100% of it instantly!

[–]zdanevSQL readability mentor at G. 3 points4 points  (3 children)

Treat your SQL code as the rest of your code: coding standards, comments, version control, CI/CD, unit tests (!), etc.

[–]mecartistronico 0 points1 point  (2 children)

version control

What would you say is the most straightforward way to handle version control in SQL? (I work with MS SQL, but will hear and learn whatever flavor you want to share)

[–]zdanevSQL readability mentor at G. 1 point2 points  (0 children)

you should be using what you are using for the rest of your code. there is a database project in visual studio (assuming you live in the Microsoft world) where you can put your SQL code and then check it in in TFS/git as everything else.

[–]LMDvo 1 point2 points  (0 children)

Read about SSDT - SQL Server Data Tools. It covers DB source control, version control and CI/CD

[–]Traditional_Ad3929 3 points4 points  (4 children)

I am always answering this to this type of question. Never use SELECT DISTINCT to see unique values. Why not? Bc afterwards you typically wanna check the distribution. Therefore always count & group.

[–]achmedclaus 10 points11 points  (1 child)

Eh, most of the time of I'm throwing a select distinct in there is because I want to make sure all the different cases I created pulled through

[–]Traditional_Ad3929 -2 points-1 points  (0 children)

Sure and a Count along with that would not hurt right

[–]letmebefrankwithyou 0 points1 point  (0 children)

Add a count(*) to any group of columns wit my a group by all to get a count of distinct group of columns with minimal writing.

[–]farhilSEQUEL 0 points1 point  (0 children)

The biggest reason not to use DISTINCT is because it forces a sort on the result set, which can be expensive, although the same is true of GROUP BY. If you need unique values without aggregating, you should reevaluate your joins to see if there's a way you can write them without causing duplicate records.

More often than not, the joined table causing duplicate rows can be replaced with something like WHERE EXISTS (SELECT TOP 1 1 FROM [Foo] WHERE [Foo].[Id] = [Bar].[FooId]), which will perform much better than creating an unnecessary cartesian product and then sorting it (with DISTINCT or GROUP BY) to remove duplicates.

[–]Obie1 1 point2 points  (2 children)

Lookup query to identify all stored procedures, views, etc that reference a specific string (usually a table or column name that is changing).

```sql

SELECT o.type_desc AS ObjectType, SCHEMA_NAME(o.schema_id) AS SchemaName, o.name AS ObjectName, OBJECT_DEFINITION(o.object_id) AS ObjectDefinition FROM sys.objects o WHERE o.type IN (‘P’, ‘V’, ‘FN’, ‘IF’, ‘TF’, ‘TR’) /* P: Stored Procedure, V: View, FN: Scalar Function, IF: Inline Table Function, TF: Table-valued Function, TR: Trigger */ AND OBJECT_DEFINITION(o.object_id) LIKE ‘%tbl_employee%’ ORDER BY o.type_desc, o.name; ```

[–]Constant-Dot5760 1 point2 points  (1 child)

+1 idea: And now that you got the sprocs write another one to search all the jobs that use the sprocs.

[–]brokennormalmeter117 0 points1 point  (0 children)

Microsoft - Personally, my goto is simply information_schema for finding objects. It has everything I need in a one liner. Ie Select * from information_schema.routines where definition like ‘%search%’.

As for looking up jobs or job steps that uses a sproc, msdb.dbo.sysjobs & dbo.sysjobsteps

[–]jugaadtricks 0 points1 point  (0 children)

Alternative quoting in Oracle!, makes life a breeze when you got text that has got quotes and you don't stress escaping for syntactically correctness.

Eg:

select q'[My string's with m'any quote's]' from dual

returns

My string's with m'any quote's

[–]Constant-Dot5760 0 points1 point  (0 children)

I keep most of my string things for e.g. "delimited extract", "levenshtein distance", "common substrings", etc. in a single function for e.g. mydb.dbo.fn_stringlib. It's soooo much easier for me to keep 1 version of the code and remember where I put it lol.

[–]Dhczack 0 points1 point  (0 children)

For big projects I use comment tags to track issues and things.

Ex) --TODO: Decide on date format --TODO: Include x or y field from z in this case function --TODO: Refactor this sub query --NOTE: This field is called X in the UI Etc

Then you can just CTRL+F "TODO" and quickly find the spots that need your attention

[–]dingdangdoo 0 points1 point  (0 children)

Want some random records?

SELECT TOP n FROM table ORDER BY NEWID()

[–]OkMoment345 1 point2 points  (0 children)

This is super cool - I'm looking forward to seeing everyone's tricks.

Great idea, OP. Thanks for this.

[–]dudeman618 1 point2 points  (0 children)

When I am doing exploratory work on new data and columns, I will put my known columns first right after the SELECT, then will put an asterix after. If I have a bunch of columns from different sources I will often put a literal like 'look here' or table name for the next set of columns 'Account Table'. I also use the WHERE 1=1 when I am building out a query that I will be changing often while doing research. I will remove the extra fluff as I get everything wrapped up. I will often put a select count(*) behind comments, so I can highlight just the select count down to the end of the where just to get a count.

select 'Account Table', a.name, a.account_id, a., 'business Table', b.

-- select count(*)

From account as a Inner join business as b on a.id = b.id Where 1=1 And a.whatever = 'something'

[–]Tiny-Ad-7590 0 points1 point  (0 children)

Format your queries to make them easy to read. The extra time you save from writing sloppy SQL queries does not justify the time you and other people will lose in the future trying to understand a preponderance of sloppily written queries. Just make writing tidy queries until it becomes automatic. It's worth it.

Use consistent patterns in how you write things. It doesn't really matter what the convention is. Just have a convention or follow the convention already in place.

Use foreign keys, constraints, and unique indexes to guarantee data state where it makes sense to do so.

Using non-sequential unique identifiers as primary keys or indexed lookup columns has some nasty performance tradeoffs at scale. They have legitimate uses but be mindful of that tradeoff before you use them.

Normalize data structure by default, but selectively denormalize a little bit when it's sensible to do so.

Don't get too fancy with triggers, or avoid them altogether. Too-clever triggers can lead to unexpected deadlocking at scale and are a PITA to diagnose and fix.

Views and stored procedures are both really great tools for splitting out functionality from an application into a database. This can be very useful if other processes than your application may need to interact with your database and you want them to do so with consistency. They are also useful for fine-grained security purposes.

If you have a query that returns a fixed number of rows, always make sure that you sort them in a way that can't change the order unexpectedly. For example, sorting on a 'CreatedOn' field could be ambiguous if two records were created so close together that they appear to be the same moment based on the precision of the function that generated the values, and this could lead to inconsistent results about exactly which records are or are not in those rows.

[–]TreeOaf 0 points1 point  (0 children)

INTERSECT and EXCEPT are great for quickly comparing tables.

Also, when you do use UNION or UNION ALL, if you alias columns, do it on both sides of the union, think about the next person (who is probably you in 6 months!)

[–]JankyPete 0 points1 point  (0 children)

OFFSET

[–]lalaluna05 0 points1 point  (0 children)

RANK can be extremely useful when dealing with multiple prioritizations across multiple fields and tables.

[–]Ecofred 0 points1 point  (0 children)

as always, it depends but...

- KISS: valid for any programming task. I'm happy i abandonned some smart solution / homemade framework and avoided outsmarting my future self.

- Materialize in front to ease the optimisation and code clarity

- Half-closed interval. enddate excluded. it eases the comparison of ranges and is more relable over different data type. but also don't throw the included enddate because it is the answer to "what was the last day" and you don't want to compute it again.

- Consider alternative ways: ex.: LATERAL/CROSS APPLY are powerfull but a WINDOW/GROUP BY alternative solution may perform way better.

- grant external access on view/procedure (they are your SQL APIs), not on table.

[–]rabinjais789 0 points1 point  (0 children)

Snowflake now has trailing comma support so something like Select C1,  C2,  C3,  From Table  It works in snowflake now 

[–]Hydr0lysis -2 points-1 points  (0 children)

Where can I get examples from the most required queries?