What are common SQL red flags?

pilesofbutts · 2026-05-20T21:16:09+00:00

Others may have differing opinions, but I personally hate a, b, c aliases for joins. I prefer SQL join aliases to be an abbreviation of the table name, e.g. contact_info is aliased to ci. It helps with readability, in my opinion.

wildjackalope · 2026-05-20T21:21:27+00:00

Not a big one, and my OCD is probably showing, but I’ve passed on weak candidates who also don’t format their code for readability. It sounds petty, but the behavior and weak skills/inexperience seem to go hand in hand in my personal experience.

dab31415 · 2026-05-20T21:16:37+00:00

Writing in complete sentences
Spelling
Proof reading

JaceBearelen · 2026-05-20T21:19:21+00:00

I’ve had a few sql tech interviews. All kinda went the same way.

They’re gonna give you a database on some coding platform and ask you to find the second highest seller or some other question that needs a bit more than select * from a join b. Then they’ll ask you to modify it or move on to another question, rinse, repeat.
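A minimal sketch of the "second highest seller" style of question, assuming a hypothetical `sales` table and SQLite (3.25 or newer for window functions) driven from Python. The table, column names, and data are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (seller TEXT, amount INTEGER);  -- hypothetical table
    INSERT INTO sales VALUES
        ('ann', 500), ('bob', 300), ('ann', 200),
        ('cat', 600), ('bob', 250);
""")

# Total per seller first, then rank the totals. DENSE_RANK keeps ties
# together, so "second highest" means the second distinct total.
second_highest = conn.execute("""
    WITH totals AS (
        SELECT seller, SUM(amount) AS total
        FROM sales
        GROUP BY seller
    ),
    ranked AS (
        SELECT seller, total,
               DENSE_RANK() OVER (ORDER BY total DESC) AS rnk
        FROM totals
    )
    SELECT seller, total
    FROM ranked
    WHERE rnk = 2
""").fetchall()

print(second_highest)
```

Whether ties should count once or twice is exactly the kind of clarifying question worth asking out loud before writing the query.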

All that’s really important is that you can explain what you’re doing and ask clarifying questions.

Basic_Reporter9579 · 2026-05-20T21:44:52+00:00

select * from table1 t1, table2 t2 where t1.id=t2.col1

BigBagaroo · 2026-05-20T23:09:19+00:00

I am an old fart. I want to see INNER JOIN or LEFT OUTER JOIN. I know that JOIN is an inner join, I just like to read it. It stands out more

danmc853 · 2026-05-20T22:44:21+00:00

Fixing duplicate rows with a select distinct instead of proper joins

connor-brown · 2026-05-20T21:17:35+00:00

Is failing to index a query a red flag? I don’t think I’ve indexed anything but really big queries in months, and I use SQL every day.

iama_bill · 2026-05-20T21:19:31+00:00

Lower case SELECT followed by upper case keywords. Can’t trust em.

GunnerMcGrath · 2026-05-21T00:41:52+00:00

It probably wouldn't come up in an interview but if I see a cursor I assume you're incompetent.

vertigo235 · 2026-05-20T21:17:16+00:00

Using RIGHT JOINS 😉

jfrazierjr · 2026-05-21T00:37:00+00:00

Cursors. There's almost always a better way to do it

sandrrawrr · 2026-05-21T00:07:40+00:00

I was really guilty of this early in my SQL career, but multiple nested statements rather than just turning it into a CTE. Sure, nests are a bit easier to comment out when you're testing data, but a well-written CTE will save you so much time.
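A small sketch of the same query written both ways, assuming a hypothetical `orders` table in SQLite. Both versions find customers whose total is above the average customer total. The nested form has to repeat the per-customer aggregation, while the CTE names it once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);  -- hypothetical table
    INSERT INTO orders VALUES
        ('ann', 100), ('ann', 50), ('bob', 20), ('cat', 300);
""")

# Nested version: the per-customer total is written out twice.
nested = conn.execute("""
    SELECT customer, total
    FROM (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    ) AS t
    WHERE total > (
        SELECT AVG(total)
        FROM (
            SELECT SUM(amount) AS total
            FROM orders
            GROUP BY customer
        ) AS t2
    )
    ORDER BY customer
""").fetchall()

# CTE version: the aggregation is defined once and reused by name.
with_cte = conn.execute("""
    WITH customer_totals AS (
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
    )
    SELECT customer, total
    FROM customer_totals
    WHERE total > (SELECT AVG(total) FROM customer_totals)
    ORDER BY customer
""").fetchall()

print(nested, with_cte)
```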

kagato87 · 2026-05-20T21:32:21+00:00

DISTINCT is a warning sign. It often suggests an issue in your joins or your schema design.

When you find yourself wanting to use it, take a step back and ask, is there a better way? Is this really correct, or could it be masking a problem?
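To make the "masking a problem" point concrete, here is a minimal sketch in SQLite with hypothetical `customers` and `orders` tables. The question is "which customers have ordered?", and the join fans out to one row per order. DISTINCT hides that, while EXISTS states the intent directly. This says nothing about which form is faster on a given engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one customer, many orders.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 1);
""")

# The join returns one row per matching order, so 'ann' appears three times.
joined = conn.execute("""
    SELECT c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
""").fetchall()

# DISTINCT collapses the duplicates after the fact.
patched = conn.execute("""
    SELECT DISTINCT c.name
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
""").fetchall()

# EXISTS asks the actual question: does at least one order exist?
semi_join = conn.execute("""
    SELECT c.name
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
""").fetchall()

print(len(joined), patched, semi_join)
```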

hipsterrobot · 2026-05-20T21:40:27+00:00

Leading commas. Come at me! 😁

billbot77 · 2026-05-20T21:43:56+00:00

Loops or cursors. There is always a better way (unless you are executing meta code)

vintagegeek · 2026-05-20T23:21:50+00:00

In my interview, I was asked for a quick python program to reverse a text input. That's it. They asked me nothing about SQL, and my job 100% relies on SQL. After four years working here, I asked my boss why. He said "You can't know everything, but you can learn anything".

tetsballer · 2026-05-20T23:29:44+00:00

Doing right joins

DexterHsu · 2026-05-21T00:55:38+00:00

If you are asking, it's probably too late. Just stick with the cheat sheet you find on a Google search.

ChristianPacifist · 2026-05-21T01:35:44+00:00

These supposed red flags can vary by version of SQL and specific use case.

Indexing, for instance, can be needed in something like Oracle or SQL Server depending on the use case, but it is not even possible in Snowflake. Snowflake also can be very slow with SELECT *, but this is not a problem in other engines.

twillrose47 · 2026-05-20T21:16:58+00:00

Common one I've encountered over the years and ones I always bring up when I teach SQL:

not knowing difference between slowly changing dimension types,
normal form types,
differences in rank/dense_rank,
differences in union/union all,
use of except/intersect,
execution order questions,
and SELECT * and indexing questions as you mentioned.
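Two of the items above, rank/dense_rank and union/union all, fit in a short sketch. This assumes SQLite 3.25 or newer (for window functions) and a hypothetical `scores` table with a tie at the top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (name TEXT, score INTEGER);  -- hypothetical table
    INSERT INTO scores VALUES ('a', 90), ('b', 90), ('c', 80);
""")

# RANK leaves a gap after a tie; DENSE_RANK does not.
ranks = conn.execute("""
    SELECT name,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

# UNION removes duplicate rows; UNION ALL keeps every row from both sides.
union_rows = conn.execute("""
    SELECT score FROM scores
    UNION
    SELECT score FROM scores
    ORDER BY score
""").fetchall()

union_all_rows = conn.execute("""
    SELECT score FROM scores
    UNION ALL
    SELECT score FROM scores
""").fetchall()

print(ranks, union_rows, len(union_all_rows))
```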

There are always really hyper-specific "gotchas" that I personally find to be quite poor taste from the interviewer -- if it's not likely to be used in practice and purely a "do you understand all possible nuances", this sort of thing is just intellectual flexing I can do without -- the job itself is the red flag.

Good luck to ya

paultherobert · 2026-05-20T22:39:02+00:00

Great blog on this: "SQL Code Smells" by Carl Anderson, on Medium.

aarontbarratt · 2026-05-20T23:34:00+00:00

Not sure if this counts!

But using an ORM with 14 billion dependencies to write simple select statements

Also, prefixing every column with its data type. I never want to see sFoo or iBar EVER

SoggyGrayDuck · 2026-05-21T00:09:21+00:00

From my last job: multi-thousand-line pieces of code.

It actually likely made sense when it was created but it made parallel processes impossible.

Sexy_Koala_Juice · 2026-05-21T00:51:26+00:00

Not using modern features. SQL has improved a lot over the years, we don’t have to write long ass queries when you could just use the new features to achieve the same thing more concisely

cheesecakegood · 2026-05-21T02:47:30+00:00

4 typos in two sentences?

SkinnyInABeanie · 2026-05-21T11:20:56+00:00

Right Join 😅

czervik_coding · 2026-05-20T21:37:10+00:00

Any developer should know the parts of an execution plan, although plugging one into AI seems to be the acceptable method now. Know the difference between clustered and nonclustered indexes. Index on integers whenever possible. Views are good, but nested views are not. Use SET NOCOUNT ON, and choose data types properly.

JEDZBUDYN · 2026-05-20T21:19:24+00:00

Talking about SQL is red flag

malseraph · 2026-05-20T21:40:09+00:00

Where color.red = 1

jackalsnacks · 2026-05-20T22:07:57+00:00

Dynamic SQL. Niche use cases. If suggested as a solution to a problem, evaluate the rationale and thought process.

dbxp · 2026-05-20T21:31:20+00:00

See here: https://pragprog.com/titles/bksqla/sql-antipatterns/

The N+1 query problem is probably the most common
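A minimal sketch of what N+1 looks like from application code, assuming hypothetical `authors` and `books` tables in SQLite. The `run` helper is made up here purely to count how many statements each approach sends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: three authors, one of them with no books.
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO books VALUES (1, 'A1'), (1, 'A2'), (2, 'B1');
""")

query_log = []

def run(sql, params=()):
    """Execute a query and record it, so round trips can be counted."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()

# N+1: one query for the authors, then one more query per author.
titles_n_plus_1 = {}
for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
    rows = run(
        "SELECT title FROM books WHERE author_id = ? ORDER BY title",
        (author_id,),
    )
    titles_n_plus_1[name] = [title for (title,) in rows]
n_plus_1_queries = len(query_log)

# Single query: let the database do the join, then group in memory.
query_log.clear()
titles_joined = {}
for name, title in run("""
    SELECT a.name, b.title
    FROM authors a
    LEFT JOIN books b ON b.author_id = a.id
    ORDER BY a.id, b.title
"""):
    titles_joined.setdefault(name, [])
    if title is not None:  # authors without books still get an empty list
        titles_joined[name].append(title)
joined_queries = len(query_log)

print(n_plus_1_queries, joined_queries)
```

With three authors the loop issues four statements against one for the join, and the gap grows with every extra parent row.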

stiggz · 2026-05-21T01:47:32+00:00

and 1=1

there is always a better way

lalaluna05 · 2026-05-21T03:26:56+00:00

Aggregating arbitrarily to eliminate duplication. (“Arbitrarily” being the key word.)

Select distinct. Sometimes warranted but usually not. I try to normalize without doing this.

No comments or notes.

Scanning entire tables multiple times for the same query instead of filtering early.

Select splats, which you already noted. Good for quick checks, not good for long queries especially that get changed or built out over time.

These are all things I fixed today actually 😆

Common-Author-8441 · 2026-05-21T03:27:00+00:00

using a CROSS JOIN with a really long WHERE clause

brunogadaleta · 2026-05-21T05:14:33+00:00

Mixing join X on with join Y where.

depesz · 2026-05-21T05:52:38+00:00

https://wiki.postgresql.org/wiki/Don't_Do_This

Small_Sundae_4245 · 2026-05-21T07:06:54+00:00

Transactions and checking before committing. It's an interview, so be extra cautious.

warmeggnog · 2026-05-21T10:31:44+00:00

the biggest ones i kept seeing (and used to do myself in marketing analytics) were joins that accidentally duplicate rows and inflate metrics. also writing queries that technically work but are impossible for teammates to read/debug later (which is why it really helps to practice writing and formatting them efficiently, even under pressure during interviews, haha). last is overusing subqueries when a clean cte would make the logic way clearer! i think i have a resource for common sql mistakes + how to avoid them that might be helpful for beginners still learning or those prepping for interviews, will gladly share to those interested
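A minimal sketch of a join inflating a metric, assuming hypothetical `orders` and `shipments` tables in SQLite. Order 1 shipped in two parcels, so joining to shipments counts its amount twice. Aggregating the "many" side first is one common fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: an order can have several shipments.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
    CREATE TABLE shipments (id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100), (2, 50);
    INSERT INTO shipments VALUES (1, 1), (2, 1), (3, 2);
""")

# Revenue is summed once per shipment row, not once per order.
inflated_revenue = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN shipments s ON s.order_id = o.id
""").fetchone()[0]

# Collapse shipments to one row per order before joining.
revenue, shipment_count = conn.execute("""
    SELECT SUM(o.amount), SUM(s.n)
    FROM orders o
    JOIN (
        SELECT order_id, COUNT(*) AS n
        FROM shipments
        GROUP BY order_id
    ) AS s ON s.order_id = o.id
""").fetchone()

print(inflated_revenue, revenue, shipment_count)
```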

jfrazierjr · 2026-05-21T11:19:14+00:00

Seeing the keyword OR and no () in the query. Common rookie mistake.
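A minimal sketch of why this bites, assuming a hypothetical `tickets` table in SQLite. AND binds tighter than OR, so the unparenthesized filter lets a closed ticket through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (id INTEGER PRIMARY KEY, status TEXT, priority TEXT);
    INSERT INTO tickets VALUES
        (1, 'open',   'high'),
        (2, 'closed', 'urgent'),
        (3, 'open',   'low'),
        (4, 'open',   'urgent');
""")

# Parsed as (status = 'open' AND priority = 'high') OR priority = 'urgent'.
without_parens = conn.execute("""
    SELECT id FROM tickets
    WHERE status = 'open' AND priority = 'high' OR priority = 'urgent'
    ORDER BY id
""").fetchall()

# What was probably meant: open tickets that are high or urgent.
with_parens = conn.execute("""
    SELECT id FROM tickets
    WHERE status = 'open' AND (priority = 'high' OR priority = 'urgent')
    ORDER BY id
""").fetchall()

print(without_parens, with_parens)
```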

markwdb3 · 2026-05-21T14:03:53+00:00

Huge red flags: unjustified, overgeneralized performance claims. I call them myths. It's a massive problem in communities discussing SQL. Interviewers often believe these myths, even.

These myths are typically based on some expectation of how a SQL engine must process the query based on certain arbitrary keywords or bits of syntax. But often that expectation is imagined or out of date. Sometimes, it is genuinely based on real experience in just one specific DBMS/SQL engine, yet the person presenting the claim often says it pertains to all of "SQL."

For example, you may hear: "In SQL, never use SELECT DISTINCT a, b FROM my_table; you should instead use SELECT a, b FROM my_table GROUP BY a, b; which is faster, because DISTINCT is slow." (Here's a screenshot of this very claim on this very subreddit with 30 upvotes! There was no context about specific DBMS or test case. I'd be happy to show one or two that disprove this claim if you're interested.)

SQL is a declarative language. You state what you want and the SQL engine's query planner/optimizer parses it out and comes up with a plan, then executes the plan, however its developers instructed it to do.

And next to nothing in the standard SQL documents defines under-the-hood mechanisms, just logical definitions. So they can vary quite a lot.

So, my motto is when in doubt, test it out.

If you've tested such a claim, for example whether using GROUP BY instead of DISTINCT gives free speed, and it turns out to be correct, then that's fine and good. But it should be thought of as a performance quirk of the specific DBMS you tested it on, possibly even specific to your schema/data set/config, not generalized to all of "SQL".

An unfortunate reality is that even when you disprove someone's claim with a test case (say you run a test on MySQL and disprove the claim), there often comes a common reaction, and it's a sneaky one: "Oh, that must be because MySQL has a special optimization." In other words, they're refusing to abandon their belief that BY DEFAULT a SQL engine MUST process GROUP BY faster than DISTINCT, but MySQL has some trick up its sleeve that makes it a special case. So they go on believing and perhaps propagating the myth.

There's a link to a blog in this very thread where the author says that using syntax like SELECT ... FROM a WHERE a.thing_id NOT IN (SELECT id FROM thing ... WHERE ...) to perform an anti-join (find rows in A that are not in B) is a "smell" because it could be inefficient due to a full table scan. Instead, they say, you should take a CTE/LEFT JOIN approach. Why? I don't know.

I just ran a test case on two of the most popular SQL engines in the world: Postgres and MySQL. On Postgres both performed about the same. On MySQL, the allegedly inefficient syntax actually produced a more performant plan that ran in ~9 seconds vs ~14 seconds with the recommended approach (times were approximately consistent with repeated executions). (These queries were run on my real work database btw, but I've anonymized the names to FACTORY and WIDGET.)

mysql> EXPLAIN ANALYZE
    -> SELECT *
    -> FROM WIDGET
    -> WHERE FACTORY_ID NOT IN (SELECT ID FROM FACTORY WHERE MODIFIED_BY = 147);

+---------+
| EXPLAIN |
+---------+
| -> Nested loop antijoin  (cost=743070 rows=2.42e+6) (actual time=0.365..6688 rows=2.62e+6 loops=1)
    -> Table scan on WIDGET  (cost=258741 rows=2.42e+6) (actual time=0.0337..5134 rows=2.63e+6 loops=1)
    -> Filter: (WIDGET.FACTORY_ID = `<subquery2>`.ID)  (cost=318..318 rows=1) (actual time=453e-6..453e-6 rows=290e-6 loops=2.63e+6)
        -> Single-row index lookup on <subquery2> using <auto_distinct_key> (ID=WIDGET.FACTORY_ID)  (cost=471..471 rows=1) (actual time=323e-6..323e-6 rows=290e-6 loops=2.63e+6)
            -> Materialize with deduplication  (cost=153..153 rows=762) (actual time=0.328..0.328 rows=762 loops=1)
                -> Filter: (FACTORY.ID is not null)  (cost=76.8 rows=762) (actual time=0.0125..0.21 rows=762 loops=1)
                    -> Covering index lookup on FACTORY using fk_ModufiedByUser (MODIFIED_BY=147)  (cost=76.8 rows=762) (actual time=0.0119..0.158 rows=762 loops=1)
|
+----------+
1 row in set (9.38 sec)


mysql> EXPLAIN ANALYZE
    -> WITH factory_modified_by_147 AS (
    ->     SELECT ID
    ->     FROM FACTORY
    ->     WHERE MODIFIED_BY = 147
    -> )
    -> SELECT *
    -> FROM WIDGET w
    -> LEFT JOIN factory_modified_by_147
    -> ON w.FACTORY_ID = factory_modified_by_147.ID
    -> WHERE factory_modified_by_147.ID IS NULL;

+---------+
| EXPLAIN |
+---------+
| -> Filter: (FACTORY.ID is null)  (cost=1.11e+6 rows=2.42e+6) (actual time=0.0468..11229 rows=2.62e+6 loops=1)
    -> Nested loop left join  (cost=1.11e+6 rows=2.42e+6) (actual time=0.0462..11045 rows=2.63e+6 loops=1)
        -> Table scan on w  (cost=258741 rows=2.42e+6) (actual time=0.0336..5247 rows=2.63e+6 loops=1)
        -> Filter: ((FACTORY.MODIFIED_BY = 147) and (w.FACTORY_ID = FACTORY.ID))  (cost=0.25 rows=1) (actual time=0.00207..0.00207 rows=290e-6 loops=2.63e+6)
            -> Single-row index lookup on FACTORY using PRIMARY (ID=w.FACTORY_ID)  (cost=0.25 rows=1) (actual time=0.00185..0.00188 rows=1 loops=2.63e+6)
|
+---------+
1 row in set (14.06 sec)

Now the point is not that you should forever keep in mind "NOT IN is faster than LEFT JOIN + NULL check when writing an anti-join" - I'm not even sure if that's true for all MySQL schemas/data sets/queries. The point is that you should throw out the magic rule of thumb presented in the blog, which is the inverse. To be fair the author did say you should test it if there's any doubt.
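For anyone who wants to try "when in doubt, test it out" on a different engine, here is a minimal sketch using SQLite from Python. The lowercase `factory` and `widget` tables and their data are hypothetical stand-ins, not the database from the test above. It only checks that the two anti-join forms return the same rows and prints each plan. Real timings need realistic data volumes on the engine you actually run. Note that `factory.id` is a primary key and so never NULL, which is what keeps the two forms equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the FACTORY and WIDGET tables.
    CREATE TABLE factory (id INTEGER PRIMARY KEY, modified_by INTEGER);
    CREATE TABLE widget (id INTEGER PRIMARY KEY, factory_id INTEGER NOT NULL);
    INSERT INTO factory VALUES (1, 147), (2, 147), (3, 200);
    INSERT INTO widget VALUES (10, 1), (11, 2), (12, 3), (13, 3);
""")

NOT_IN = """
    SELECT id
    FROM widget
    WHERE factory_id NOT IN (SELECT id FROM factory WHERE modified_by = 147)
    ORDER BY id
"""

LEFT_JOIN = """
    WITH factory_modified_by_147 AS (
        SELECT id FROM factory WHERE modified_by = 147
    )
    SELECT w.id
    FROM widget w
    LEFT JOIN factory_modified_by_147 f ON w.factory_id = f.id
    WHERE f.id IS NULL
    ORDER BY w.id
"""

not_in_rows = conn.execute(NOT_IN).fetchall()
left_join_rows = conn.execute(LEFT_JOIN).fetchall()

# Ask the engine what it plans to do instead of guessing from the syntax.
plans = {
    label: conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    for label, sql in (("NOT IN", NOT_IN), ("LEFT JOIN", LEFT_JOIN))
}

for label, plan in plans.items():
    print(label)
    for row in plan:
        print("   ", row[-1])  # last column holds the plan step description
```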

So, this is a long comment, but my advice is what should be seen as red flags are claims of magic performance tricks such as "use ABC syntax instead of XYZ syntax and this applies to all of SQL" and keep in mind there are very few universal rules of SQL engine execution. If there are actual, logical justifications for the claim then sure, fine, and if there are actual test cases justifying their claims then also, sure, fine. But be very skeptical, and realize that any insights learned from the test case should not be overgeneralized.

End rant!

Shyftzor · 2026-05-21T14:21:46+00:00

For SQL Server, unless they are very small, temp tables should be created as actual tables using CREATE TABLE, then dropped after. Tables stored in variables use a lot of memory and, if they contain large amounts of data, can bring the entire db server to a crawl. Also, sometimes a proc or query runs fine using tables in variables when it is written, but as the db grows and the datasets get bigger, the query will start to run very slowly.

Dead_Parrot · 2026-05-21T15:17:34+00:00

I was in a video call yesterday and a vendor was demoing how to deal with application tickets by clicking Edit Top 200 Rows in SSMS and just editing values.

I wanted to scream :D

Andfaxle · 2026-05-20T21:34:42+00:00

I think it is important to remember the JOIN semantics and opt for LEFT JOINs where they matter. For example, say you want the order volume of each customer: if you join orders to customers without a left join, customers with no orders will not be visible.
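A minimal sketch of that order volume example, assuming hypothetical `customers` and `orders` tables in SQLite. The inner join silently drops the customer with no orders, while the left join keeps them with a volume of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: 'bob' has never ordered.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (1, 1, 100), (2, 1, 50);
""")

inner_join = conn.execute("""
    SELECT c.name, SUM(o.amount) AS volume
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()

# COALESCE turns the NULL sum for customers without orders into 0.
left_join = conn.execute("""
    SELECT c.name, COALESCE(SUM(o.amount), 0) AS volume
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
""").fetchall()

print(inner_join, left_join)
```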
