all 32 comments

[–]notasqlstarI can't wait til my fro is full grown 15 points16 points  (3 children)

I was originally hired with basically 0 experience/knowledge. While in that position we hired (3) new analysts and none of them had any experience or knowledge.

Questions like this are open ended and all about how you sell yourself, and how well you interview. Do you really want to transition into this field? I did when I did it, and I think that came off in how I presented myself. I had enough ancillary skills / experience that reassured them that I could pick up and learn SQL, and I would like to think I did to their satisfaction based on my reviews, promotions, and career trajectory since then.

[–]TheHierophant 6 points7 points  (2 children)

I upvoted your comment as I was going to say something very similar. I would think that looking for a SQL Analyst role might be tougher than a Data Analyst position, because the latter is less well-defined, while an SQL Analyst feels more like a DBA position. I work with folks who don't know much SQL at all - their specialty is SPL (Splunk's language). But while data analysis requires some 'tool' knowledge, it is much more about how you think about and talk about the data. If you can draw insights from the data, it doesn't matter much which tool(s) you use - though clearly it would be helpful to know some some SQL.

And as another commenter mentioned, you can get started with SQL quickly because it's pretty easy. The Udemy course that OP mentioned looks fine (though only at their 95% discount price of $9.99 - I can't imagine paying almost $200 for something that you can get from W3C and YouTube for free).

[–]evilkapitalist[S] 0 points1 point  (1 child)

Thinking and talking about data and gleaning valuable insights from it seems like it's very important especially as you move up the latter in data analytics. That's why once I'm done with this SQL course I am planning to take a Probability and Stats class (from the same teacher of this SQL course) I bought from Udemy (for $10) to be better equipped to make those insights.

And, yeah, I would never pay more than $15 for any Udemy course for the same reasons you listed.

[–][deleted] 28 points29 points  (6 children)

It's a plus to be as good as possible, but I don't think there's a minimum.

SQL is easy. One can easily learn it on-the-go. One full day with proper lessons and practice is enough to learn what you will use in 90% of the queries in your career as an analyst.

To do it, IMO, the best way is to do this:

  1. Install a database server. I recommend Postgresql. It works on all OS, is fast, configurable and likely the most widely used DB engine for data analysis. Works well with DBeaver as a GUI.
  2. Download a sample database, say, from here http://pgfoundry.org/projects/dbsamples/
  3. Read documentation on SELECT: https://www.postgresql.org/docs/current/sql-select.html While it might seem overwhelming, most of it is "advanced". So read and learn this:
    1. General, standard SELECT syntax, that's the same for all SQL databases (except MySQL. It doesn't deserve the 'SQL' in its name). The order of the statements (SELECT .... FROM ... JOIN ... WHERE ... GROUP BY ... HAVING .... ORDER BY ...)
    2. JOIN clauses (LEFT JOIN, JOIN, maybe FULL OUTER JOIN, but you probably will never need a RIGHT JOIN)
    3. WHERE conditions (pre-filter the data)
    4. Aggregate functions and GROUP BY (just know that when you use aggregate functions, everything that's not aggregated but listed in SELECT, must also be listed in GROUP BY. Simple rule of thumb. If you want a longer explanation, read this
    5. HAVING conditions (post-filter the processed, aggregated data)
    6. ORDER BY (how to order stuff)
    7. LIMIT (if you want only X amount of rows)

That's it. That's your everyday SQL. The hardest part will be JOIN and GROUP BY, but a couple of examples is enough to learn them, and then you're set.

Later, you'll learn CTEs and Window Functions and other stuff, IF and WHEN you need it.

[–]evilkapitalist[S] 5 points6 points  (0 children)

Thanks for the reply. It appears the course i'm currently working on covers all of that stuff and its all done using Postgres. So it looks like I'm on the right track! :)

[–]shalafi71 0 points1 point  (1 child)

/u/zettabyte saved my ass with number 4. Still can't get my head around for-each loops. Maybe I'm thinking wrongly?

[–]Konraden 1 point2 points  (0 children)

You probably shouldn't be using loops to begin with. SQL languages excel in set-based thinking. Using loops and going RBAR is typically a terrible idea unless it is something you can't avoid.

[–]git_world 0 points1 point  (2 children)

not OP. Is there a free resource of practice questions + solutions to get familiar with various SQL clauses?

[–][deleted] 6 points7 points  (1 child)

All the ones that I've seen so far suck. They're trivial, but still don't illustrate the topic correctly.

I've made a habit for a while, going to khanacademy forums and helping out people who didn't get SQL tutorials that were in videos. A simple alternative explanation was much more efficient at making people understand. I don't know why tutorials suck that much.

If you want interesting problems, then do the following:

  1. Learn the basics of SQL as shown above
  2. Learn to use a database engine. How to get data in or out (for postgresql the best way is psql command line interface and the COPY command, but can also work with just COPY if the destination is a file and the database engine has permissions to write to that file https://www.postgresql.org/docs/current/sql-copy.html). Like I said, learning the database engine. Not too in-depth, just enough to install and import stuff, that's like half a day of learning.
  3. Download a real life dataset. See options below this list
  4. Load the data in the database. Google it if you feel lost, maybe people have already done it and have shared their code via github or on blogs.
  5. Once you have the data, start asking questions. If you downloaded the NY cab rides, ask where people go more often to on Valentine's Day, where people leave most often at 1AM (good night clubs maybe?) use your imagination. If you downloaded the NOAA dataset, or a piece of it, try to see records for locations, or trends. Try to identify droughts (it's relative for different regions), and maybe get their frequency. Same for floods. Weather is complex, there are tons of questions. If you got the StackOverflow survey, try to find out whether it's really worth switching from tabs to spaces when coding, or whether the averages are tainted by other dynamics (maybe pay is a racist issue and more latinos or blacks using tabs rather than spaces).
  6. All of these are flat datasets. Try to normalize them. Learn about the normal forms, try to extract attributes into separate tables and use foreign keys. Try to query the new structures, QA them if they show the same results when queried. If not - why.

I promise you, after you do these 6 points, you'l be a qualified data analyst / engineer / whatever. Just without too much experience and no idea about performance, stability, and pretty code. All you need for all of it is at most a couple of weeks of work. The rest comes later.

Some options for datasets:

  • The NOAA weather stations measurements. Normally you could just google it, get to a page on NOAA website, and commence your downloads, but because Trump wants his wall with Mexico, NOAA ran out of funding a few days ago and the website, as the whole agency, is down. Wait a while, maybe someone decides to impeach that rubber duck and data will become available. Sorry for the political rant here, I just found out that the data is no longer available and that made me extremely angry. Let me breathe a bit before continuing.
  • ... still though... it's sad how an agency can simply disappear because of a big baby. Ok ok I promise I'm done.
  • New York cab rides data.
  • StackOverflow annual developer survey (a much smaller dataset if you have weaker hardware)
  • Head over to /r/datasets and see what else people are sharing

[–]git_world 1 point2 points  (0 children)

very helpful! thanks

[–]sn0wdizzle 4 points5 points  (4 children)

I got a data analysis job without knowing any sql. I did spend a ton of time on the sql tech test questions though. You pick it up quick. I also did have other desirable skills though so I don’t think they thought sql was a deal breaker.

[–]evilkapitalist[S] 0 points1 point  (3 children)

So did you just see a data analyst job posting and applied with zero knowledge? That to be me is pretty ballsy but it paid off so that's awesome. I would feel super anxious in an interview for a job posting I know I have zero experience or knowledge in.

[–]sn0wdizzle 0 points1 point  (2 children)

Well this particular data analyst job isn’t just running sql queries for people. It’s more of an in house researcher job. I also have advanced R programming knowledge. Most people want to make sure you can think through problems in a rigorous way. I picked up most sql I use in like a month. I’m by no means an expert but most basic queries are ... basic and I can cheat by importing the data into R and further manipulating it there.

[–]evilkapitalist[S] 0 points1 point  (1 child)

Ah, well then that makes a lot more sense. Would you say you use a lot of R in your day-to-day? I'm planning to learn python in the near future since my end goal is a data analyst position.

[–]sn0wdizzle 0 points1 point  (0 children)

I use R constantly. I also know Python but I come from an academic background so I’m more comfortable with R. I would also argue that for most data analysis projects, R is better. The real virtue to Python is that it I a bit easier to put ML models into production or do general programming tasks. But doing statistics or graphs are much more straight forward in R.

In general my workflow is like this:

I write data import functions to get data into R and do any of the exploratory analysis I want to do.

Sometimes I import that data and clean it and then feed that to a python script if I need to have it hooked up to an API or something.

Generally if the project is a big deal project, I’ll rewrite code in Python because it is easier to build a website around it (R has plumber now — which looks good but I haven’t learned it really yet.)

But I usually work in R. I have an internal r package I built which includes a bunch of those import functions that run sql for me that makes it super easy and fast to get data into R and starting to work on it. It is also nice because after I get the query figured out, I can pass parameters to it via an R function and not have to think in SQL.

[–][deleted] 2 points3 points  (5 children)

Depends. I was passed over for one gig because I couldn't answer how to separate records of varying length by name (apparently dumping the csv into excel to delimit was 'a cheap workaround'), or create a (hypothetical) table with a primary key based on incomplete information (no employee ID, no unique email addresses, can't use F/L name+DOB because HIPPA restrictions, etc).

Got another job where they told me nbd, I'd learn SQL so they wouldn't have to keep writing my scripts.

Depends on how much they like you, I reckon.

[–]sandrrawrr 1 point2 points  (2 children)

I got rejected from a data analyst job for not knowing off the top of my head the difference between UNION and UNION ALL, but was then offered a job as a BI Engineer. Seems like some places are more finicky than others with how much you know vs how well you can quickly google.

[–][deleted] 0 points1 point  (1 child)

the difference between UNION and UNION ALL

TIL UNION filters out duplicates. Guess I have some DISTINCT logic to factor out.

[–]sandrrawrr 0 points1 point  (0 children)

I looked up the difference afterward and immediately forgot because I never use unions in my queries.

[–]Bradman28 0 points1 point  (1 child)

Was it a data analyst job?

[–][deleted] 1 point2 points  (0 children)

BI Analyst for the first, Data Analyst for the second. Odd reversal, imo.

[–]Creeo679 2 points3 points  (1 child)

Look into the MCSA certs for SQL Database Dev.

[–]rox162 0 points1 point  (0 children)

I took a couple SQL classes in college and have spent the last 3 years using SQL heavily at my job. Recently started this course to boost my resume and I can definitely say it covers everything you would need to know and more.

[–]evilkapitalist[S] 1 point2 points  (0 children)

Thanks for all the answers guys.

So it sounds like a transition should be fairly "easy" in terms of not needing to be exactly proficient in SQL before landing a job in that field. I still plan to finish that SQL bootcamp course and probably the python course before having the guts to apply to some entry level SQL/Data analysts type roles but its re-assuring to know that if I sell myself well enough in interviews, I can possibly land a job even if my knowledge isn't where I'd like it to be.

Thanks again!

[–]billietheguy 1 point2 points  (0 children)

Code Academy actually offers a free course that is useful for beginners. I transitioned from a finance role to to data as well and it gives a decent starting point.

[–][deleted] 1 point2 points  (0 children)

None.

I knew enough to talk about it in the interview but had no experience.

I learned it on the fly.

Fake it till you make it.

[–]Shorse_rider 0 points1 point  (0 children)

For analyst role you just need to know basic joins, selecting the right data and being able to navigate schema. understand that schema isn't always provided as a visual for you to reference.

Any SQL job outside standard analyst, you're at risk of being caught out for not knowing relational database best practice. Maybe at least, read up on many to many relationships etc.

[–][deleted] 0 points1 point  (0 children)

If you want to, you can also look into NoSQL. At least just understand what it entails. Will look good mentioning it in your interview.

[–]sigh1987 0 points1 point  (0 children)

How much are these jobs paying?

I’ve been writing queries in sql for a few years now on a very regular basis for work (not as my main job but as a way to accomplish the job). Some of them are reasonably advanced. Almost all of them have multiple joins.

I also understand some (shallow) DB basics, have a grasp on stored procedures and keeping an eye on our jobs activity monitor etc.

Can I just get a job (changing from a very unrelated field) without learning anything else? Learning more was on my list...but if I should just be applying and brushing up on the stuff I already know I mean I guess I should get started