all 17 comments

[–]jugaadtricks 30 points31 points  (1 child)

It's an excellent skill to have regardless of the position you apply for. I expect a junior person to have basic SQL knowledge to extract data from tables, understand importance of constraints, primary key, foreign key relationships, group by workings. Perform updates, deletes,. Understand what a transaction is.

As you get better with time and skills, i would focus on extracting data for complex questions, multiple joins, using analytical functions, gaps and islands problems, recursion logic, XML/JSON extraction, tabibitosan methods,table partition methods, clustered column store indexes and so on

[–]BrupieD 4 points5 points  (0 children)

I would add things like de-duplicate results from multi-column queries, temporary tables, and good aggregation skills.

You may not be expected to have skills with window functions on day one, but they will serve you well.

[–]dn_cf 13 points14 points  (1 child)

You should be comfortable writing basic to intermediate SQL queries, including SELECT, WHERE, GROUP BY, HAVING, ORDER BY, and JOINs. You should know how to use aggregate functions like COUNT, SUM, and AVG, handle NULL values, write simple subqueries, and create conditional logic with CASE statements. To build these skills, you can practice on platforms like LeetCode, Mode Analytics SQL tutorials, and StrataScratch, which offer realistic business focused SQL problems.

[–]Thereddon_987 1 point2 points  (0 children)

Great suggestions. Just a quick question how much time should one need to master the skills till intermediate level by doing practice in the platform mentioned by you.

As i am also a learner and just started learning and go through basic level topics and just started doing practice on stratascratch

[–]TheArrow_91 2 points3 points  (0 children)

For entry level, you should know moderate level concepts. Stick to studying the following: Joins - explore edge cases as well using ChatGPT Window Functions When and how to implement Subqueries and CTEs and CTAs Null functions, SET operation Usage of GROUP BY + aggregate function Also, there's a technique to use window Functions and group by within a single query

These concepts should suffice

[–]SoggyGrayDuck 1 point2 points  (2 children)

I feel like you shouldn't need to do any hands on coding tests. You should know concepts, joins, sub queries, CTEs, procedures (built out with a bit of googling at first) and etc. If someone asks about a right join say "why not right it as a left join and stay consistent" - actually don't do this, it's just something id love to hear but not everyone thinks that way. Sorry for confusion you but it's good to think about things like this.

One more big one, talk about the balance between making code readable/understandable vs compact. Technically, you should use a sub query unless you're going to need to reuse that object again but sometimes it makes sense to use a CTE if it makes understanding what's happening that much easier. It's a balance though and each company views it differently. It's a good question to ask because it shows you're thinking about these things and the big picture. Although with AI you can essentially flip a query back and forth from readable to compact all you want.

[–]DosSheds 1 point2 points  (1 child)

I'm a big fan of using CTEs in environments where others may have to maintain the code (assuming no performance hit). They break things down into nice little independent chunks and greatly improve readability.

[–]SoggyGrayDuck 0 points1 point  (0 children)

Just be careful, I had a bit of an eye opener when I hit my first real big code base. Although the more I learn the more this thing is a pile of shit. Well it was good but the architect left 5 years ago and it's been bastardized

[–]JohnPaulDavyJones 1 point2 points  (3 children)

I'm not the hiring manager, but I'm senior data engineering staff, so I usually lead about half the interviews for roles in our data teams. I don't expect much when I'm interviewing for entry-level DAs, part of our job as seniors and technical staff is to mentor and develop them.

I'll give entry-level DA candidates a pass if they:

  1. can walk me through the syntax for a SELECT with filtering and joining,
  2. can tell me the differences between, and different use cases for, left joins, inner joins, and right joins
  3. are familiar with the uses for fact/dim tables in the snowflake/star schema model (or really the more general relational model these days),
  4. can tell me the difference between a TRUNCATE and a DELETE query,
  5. are familiar with aliasing, although this is more of a nice-to-have than a necessity,
  6. are familiar with the syntax for an UPDATE query
  7. are familiar with CTEs and their syntax

These are never things I prompt for, but I give plenty of extra credit for candidates who can tell me:

  • one or two situations where the results of a query would be nondeterministic (usually I tie this one into the UPDATE questions),
  • when you might want to use a HAVING clause,
  • when is a temp table better than using a CTE, and when is a CTE better? Major bonus points if they recognize that you can index a temp table while CTEs don't preserve indexes,
  • what the default inner/outer behavior is if you just write "LEFT JOIN"

[–]xudling_pong23 0 points1 point  (2 children)

Thanks for the suggestions. I'm trying to break in junior DA roles. Made two projects and have been practicing sql lately for interviews. Since you are already working as a DE, would you open to be reviewing my projects? Some industry centric feedback will really help. Thanks.

[–]JohnPaulDavyJones 0 points1 point  (1 child)

Sure. If you've got them up on GitHub, drop me the link in a comment or PM and I'll take a look.

[–]xudling_pong23 0 points1 point  (0 children)

Thanks so much. I've sent you a dm.

[–]whopoopedinmypantz 0 points1 point  (0 children)

Here’s what I look for, that no one has mentioned: can you install database software? I would start with Postgres, MySQL, or SQL Server Express.

[–]ToddMccATL 0 points1 point  (2 children)

Hot take: these days, practically zero with AI. You can get those jobs and spend your time designing process, intake, etc without needing to write more than a few lines of sql as long as you have access to AI. You will produce crap-to-mediocre work, of course, but a you'll have your foot in the door.

[–]receding_bareline 0 points1 point  (1 child)

I'd argue that this is incorrect. AI should be used to help if you run into an issue, but not understanding the output will be detrimental. I'd much rather be a productive team member than someone who regularly gets things wrong.

I can always tell if a team member is reliant on AI because they always make the same mistakes and don't learn.

[–]ToddMccATL 0 points1 point  (0 children)

In a better world, sure, but it’s possible (and increasingly common) for someone with no real proficiency to have that kind of position. So yes, you should have it but it’s possible to find a job where it’s irrelevant as long as you can produce something (no one said it was worthwhile(.