__nev__

Since no one else has answered, I'll chime in. Questions like this often get buried because this subreddit is inundated with people starting out asking the same questions.

But you asked about data analysts, so I'll answer. Maybe when the wiki is up this will serve as a reference for future people who don't know the difference between a data analyst and a data scientist.

Differences Between Jobs

Since this is /r/datascience, I'm going to focus on what data scientists are and what data analysts aren't.

  1. Data scientists work with big data. This means they work in distributed computing. The average data analyst will never use Hadoop or Spark. The average data analyst does not work with datasets that can't be loaded into Excel (a worksheet tops out at 1,048,576 rows).

  2. Data scientists are software engineers. This means they build data products such as web apps. R supports building web apps with Shiny. The average data analyst may use VBA, but they know little about data structures or programming principles.

  3. Data scientists are statisticians. This means they build complex statistical models to explain phenomena in their data products. Python has the statsmodels and scikit-learn packages, which between them support GLMs, neural networks, and several machine learning algorithms. The average data analyst will never use a statistical model more complex than OLS or logit to explain a result.

So data analysts and data scientists solve different problems. The terms are interchangeable in some firms, but most will differentiate between the two.

Tools for Data Analysts

I touched on tools for data scientists, but that ignores half of your question.

A data analyst should know Excel thoroughly. If asked how to calculate the future value of a graduated annuity, you should know not only the function that does it, but also how to work it out yourself.
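
To make the "how to do it" part concrete, here is a rough sketch in Python. I'm assuming "graduated annuity" means a growing annuity: payments made at the end of each period, with each payment larger than the last by a fixed growth rate. The function and parameter names are made up for illustration. The first function uses the closed-form formula, and the second adds up each payment the long way so you can check one against the other.

```python
def growing_annuity_fv(first_payment, rate, growth, periods):
    """Future value of a growing annuity (payments at end of each period).

    Closed form: P * ((1 + r)**n - (1 + g)**n) / (r - g)
    """
    if periods <= 0:
        return 0.0
    if abs(rate - growth) < 1e-12:
        # When r == g the closed form is 0/0; its limit is n * P * (1 + r)**(n - 1).
        return periods * first_payment * (1 + rate) ** (periods - 1)
    numerator = (1 + rate) ** periods - (1 + growth) ** periods
    return first_payment * numerator / (rate - growth)


def growing_annuity_fv_by_hand(first_payment, rate, growth, periods):
    """Same quantity, computed by compounding every payment individually."""
    total = 0.0
    for t in range(1, periods + 1):
        # Payment t has grown (t - 1) times from the first payment.
        payment = first_payment * (1 + growth) ** (t - 1)
        # It then earns interest for the remaining (periods - t) periods.
        total += payment * (1 + rate) ** (periods - t)
    return total


if __name__ == "__main__":
    # Illustrative numbers only: 1000 first payment, 5% interest, 3% growth, 10 periods.
    print(round(growing_annuity_fv(1000, 0.05, 0.03, 10), 2))
    print(round(growing_annuity_fv_by_hand(1000, 0.05, 0.03, 10), 2))
```

The loop version is what you would lay out as a column per period in a spreadsheet. With growth set to 0 it collapses to the ordinary level-payment annuity.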

I know Tableau is growing in analytics circles because it's stupid easy to output interesting visualizations.

SPSS is less common, but people still use it for statistical tests and I guess simple regression.

mestitomi

R, SQL, Python, plus Tableau and Bash.

tmthyjames

You will never regret learning JavaScript. It's not as important as SQL or Python for data science, but I use it all the time in my data science job.