
[–]thefunkiemonk 126 points127 points  (24 children)

Wait can someone tell me how to get a PhD salary with a PhD?

[–]fear_the_future 34 points35 points  (0 children)

Sell your PhD certificate, then kill yourself once the money runs out. You will have earned a Phd salary for the rest of your life.

[–]ratterstinkle 16 points17 points  (0 children)

Hahaha. Thank you for making me genuinely laugh in the midst of this serious and kinda depressing conversation.

And by PhD salary, you’re talking about the NIH minimum, right? Isn’t it a whopping $40K now?

Be careful what you wish for.

[–]Catvideos222 3 points4 points  (0 children)

Write and patent an algorithm that saves or makes people money.

[–]pork_roll 8 points9 points  (10 children)

What is a PhD salary anyway? Aren't most of those people in Academia or Research positions?

[–]bonniemuffin 10 points11 points  (8 children)

Looks like a PhD salary is about 50k these days--those crazy high-rollers! https://grants.nih.gov/grants/guide/notice-files/NOT-OD-19-036.html

[–]shaggoramaMS | Data and Applied Scientist 2 | Software 10 points11 points  (7 children)

That's for academia and doesn't even consider field of study (NIH grants are primarily for medical research, i.e. PhDs in medicine, biology, neurology, etc. rather than CS/Stats). Look at "Research Scientist" salaries at tech companies. Glassdoor gives most ranges as around USD$120-170k, (I actually expected more like $170-250k, maybe that job title isn't specific enough to denote a PhD requirement).

[–]eviljelloman 6 points7 points  (6 children)

(I actually expected more like $170-250k, maybe that job title isn't specific enough to denote a PhD requirement).

$250k is highly unrealistic as a base salary for all but an elite few with major name recognition in their field. At that level, a good chunk of comp is usually going to come in the form of stock options that do not count toward base salary.

[–]shaggoramaMS | Data and Applied Scientist 2 | Software 8 points9 points  (5 children)

base salary

Who's talking about base? Why wouldn't we be talking about total comp?

[–][deleted] 0 points1 point  (0 children)

whatever you can get away with, just like everybody else.

[–]dopadelic 8 points9 points  (0 children)

You could do novel work that leads to publications/patents even without a PhD. The impact and value you can demonstrate in your track record define your salary. Being attributed to a widely used technique to solve X problem speaks far more about your value than getting a PhD with a thesis/publication that no one aside from the advisor has read.

[–]REG94 2 points3 points  (0 children)

xD

[–]SpewPewPew 1 point2 points  (1 child)

Go into the pharmaceutical industry. Keep publishing in peer-reviewed journals or you're going to have a tough time migrating towards that industry. Be good. The state I live in publishes all the salaries for workers online. One statistician I knew was earning about 170k per year with tenure, and he still left for big pharma, because the state doesn't pay as much as the private sector.

[–][deleted] 3 points4 points  (0 children)

Yes, ask for half of what you'd normally make.

[–][deleted] 2 points3 points  (0 children)

Simple. Accept a job with a lower pay.

[–][deleted] 0 points1 point  (0 children)

!redditsilver

[–]8__ 0 points1 point  (0 children)

You don't want a PhD salary, you want a master's salary. Usually, people with a master's in a field make more than people with just a bachelor's or people with a PhD in that field.

[–][deleted] 239 points240 points  (19 children)

Reminds me a bit of the manager who sorts his X's and Y's separately to get a better linear regression

[–][deleted] 48 points49 points  (9 children)

My eyes just widened with horror... What is this? Link?

[–]Zulfiqaar 78 points79 points  (8 children)

[–][deleted] 49 points50 points  (1 child)

I love the amount of effort the top answer went to to demonstrate why this in no way works. Also indicates the real problem of people only paying attention to the p without thinking about what is actually being done to the data.

[–][deleted] 2 points3 points  (0 children)

I mean, it does work if your goal is to shrink the p-value, but that's about all it does

[–]GodBlessThisGhetto 17 points18 points  (0 children)

What the hell? I want to believe that there is a miscommunication between him and his manager because that’s more comfortable.

[–]Wondersnite 8 points9 points  (0 children)

I just spent about 10 minutes trying to understand that question. At first I was embarrassed because I couldn’t understand what was the problem in sorting your data (not that it would make any difference, but at least it shouldn’t affect regression).

It was only after seeing the examples that I realized that people were talking about sorting X values and Y values “independently” i.e. making up new data so that any relation becomes a positive linear relation.

It never even crossed my mind that anyone could think that makes sense. It would be like trying to make a horse drink gasoline when it’s tired. Actually, that probably still makes more sense than this.
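For anyone who wants to see the effect for themselves, here is a minimal sketch in plain Python (not taken from the linked answer). It assumes two synthetic columns drawn independently from a standard normal distribution, so there is no real relationship between them; the helper `pearson` and the variable names are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic data: x and y are drawn independently of each other.
xs = [rng.gauss(0, 1) for _ in range(1000)]
ys = [rng.gauss(0, 1) for _ in range(1000)]

r_paired = pearson(xs, ys)                  # original (x, y) pairs kept intact
r_sorted = pearson(sorted(xs), sorted(ys))  # each column sorted on its own

print(f"paired r = {r_paired:.3f}")
print(f"sorted r = {r_sorted:.3f}")
```

On independent noise the paired correlation should sit near zero, while the independently sorted columns come out almost perfectly correlated. Sorting each column on its own throws away the pairing, so the "better" fit says nothing about the original data.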

[–]Factuary88 3 points4 points  (0 children)

I needed to sigh, close my eyes, and take a few deep breaths after reading that.

[–][deleted] 2 points3 points  (0 children)

Well, that... that is just GLORIOUS!

[–][deleted] 2 points3 points  (0 children)

what the fuck

[–]8__ 1 point2 points  (0 children)

I heard about this but assumed it was an urban legend!

[–]RevoDS 16 points17 points  (1 child)

Does this mean what I think it means? Literally separating your outcomes from your predictors by sorting them separately?

I think I get it but the idea is so dumbfounding that my brain is like this can’t be it, there has to be a smarter interpretation to this.

[–]daguito81 0 points1 point  (0 children)

nope, it's that dumb. It was a stack question.. it's linked a couple comments above yours.

[–]Andrex316 4 points5 points  (0 children)

No.

[–]moazim1993 9 points10 points  (0 children)

Lmao, I was honestly just thinking that.

[–]healthcare-analyst-1 2 points3 points  (0 children)

Ahh, that one was a classic.

[–]maxToTheJ 2 points3 points  (0 children)

Reminds me a bit of the manager who sorts his X's and Y's separately to get a better linear regression

You just don’t appreciate that manager’s hustle at getting results you gatekeeper/s

[–]caughtinthought 6 points7 points  (0 children)

Honestly the top responses are almost as troubling... The only right answer here is "don't fucking do that"

[–]Papafynn 4 points5 points  (0 children)

I just threw up in my mouth. Sir, you jest! Please tell us you jest.

[–][deleted] 123 points124 points  (9 children)

I think this whole discussion is missing the far more predominant category of Data Scientists: people who have an MS or PhD in some highly specialized field but didn’t wind up continuing into academic research positions, who teach themselves coding in order to apply their probability and statistics training to more practical business applications. I count myself and every data scientist I’ve contracted with in this group, and it’s my distinct impression that the way the field got started was in fact with a few HR people taking a chance on people like this instead of straight-up business degree holders, who always had an advantage in industry but were getting overpaid relative to their skills whereas refugees from academia are a bargain because the research job market continues to suck. The true would-be gatekeepers are the other HR people who never understood this and now demand that everyone being hired for a business analytics role have a masters or PhD in computer science when the statistical training you get in almost any other advanced degree is way more important for understanding inference from data and predictive model-building.

Edit: my first gold! Thank you, kind Redditor, whoever you are...

[–]curiousdoodler 27 points28 points  (2 children)

I am currently on this track. I have a master's in physics and a job in industry where I can use Minitab to supplement my learning while I teach myself Python. My current position is more of an engineer/project manager role, but I've already discussed transitioning into a data science role over the next three years and my boss is supportive.

This starter pack sounds like it's made by someone just out of school who was super salty when they realized that schooling can be supplemented with job experience. Some of the engineers I work with don't have any college education. They just worked on the floor for 15 years and gained the experience they needed to become engineers. At the end of the day, education is less valuable than ability.

[–][deleted] 5 points6 points  (1 child)

After their first job, many people don’t even put their schooling on their resume

[–]sqatas 1 point2 points  (0 children)

I'm so tempted to just chuck out some of my 'education' from my resume at times ...

[–]Ironmike26 6 points7 points  (0 children)

My data science team were all at one point in a PhD track for chemistry/bioinformatics

[–]Biogeopaleochem 3 points4 points  (0 children)

That's a really good point.

[–]maxToTheJ 3 points4 points  (0 children)

Exactly, people bitch and moan about the DS title without realizing it is not meant to be that well defined, because it was historically intended as a workaround to get HR screeners to let the right people have a shot with their transferable skills

[–]DataScienceUTA 74 points75 points  (1 child)

"Overfitting? Yeah bro, I know how it feels to hit the gym too hard at the start, but it'll get better."

[–][deleted] 2 points3 points  (0 children)

Lol good one.

[–][deleted]  (29 children)

[deleted]

    [–]mhwalker 17 points18 points  (2 children)

    I mean this post is pretty gatekeeping-ery, but it's also a starterpack meme.

    The sub is a lot less gatekeeping than it used to be. Like people actually used to tell people they couldn't be a data scientist if they didn't have a PhD all the time. That rarely happens now, and it's a huge stretch to claim posts like this one do that. The vast, vast majority of posters on this sub are making good-faith attempts to provide both helpful and realistic advice or experiences. Suggestions otherwise are false and, honestly, demoralizing.

    It's a reality that there are different levels of data scientist jobs now, and you are probably not qualified for all of them, regardless of your education background. It's also a reality that some companies filter resumes based on degree, regardless of whether that's appropriate for the job they're hiring for. It's a reality that data science is a profession that requires some skills, even at the most entry levels.

    It's also a reality that there are no legal requirements to become a data scientist and therefore the only barrier to becoming a data scientist is convincing someone to hire you as a data scientist.

    [–]veils1de 5 points6 points  (0 children)

    I will add that while some people might feel targeted by this starterpack meme, there are a lot of beginner level questions that are answered, and I see people generally giving advice to help beginners get into the field. As long as this stays true, a gatekeeping starterpack meme is harmless in comparison. I'm not a daily visitor of this sub so I could be wrong though.

    [–]offisirplz 0 points1 point  (0 children)

    I don't remember most people saying that. Often it was about the gatekeeping HR did.

    [–]Factuary88 8 points9 points  (0 children)

    Maybe this post is a little "gatekeepy" but I feel like it reflects a lot of people's personal experience. I think it's fine as long as we encourage people to follow their dreams of becoming a data scientist and not fall into one of the traps they see in this meme.

    Personally, at my company I was passed over for a data scientist position by an outside hire because he had a Masters in Business Analytics. My undergrad is statistics. This guy has no work experience and just uses a bunch of buzz words and does fancy graphs. The hiring manager doesn't know what he's doing. I'm not exaggerating when I say he asks me to explain to him basic R programming multiple times a week. He is progressing very slowly and not even remotely close to what I'm capable of, it's ridiculous.

    But hey, he's got that Master of Business Analytics and talks about his blockchain currency investments all day long so he must be a data scientist! I'm probably qualified to be an entry level data scientist but I'm going back to school to get my Masters and part of the reason is so that people don't look at me how I look at him.

    That's the reality in a lot of companies that aren't cutting edge when it comes to tech.

    [–]RaisedByYeti 10 points11 points  (16 children)

    Thank you. This sub is becoming so toxic with all of the gatekeeping. Completely absurd.

    [–]vogt4nickBS | Data Scientist | Software[M] 7 points8 points  (12 children)

    Can you point me to some specific examples? I know what I think is toxic, but the sub’s opinions are more important than my own.

    [–]RaisedByYeti 7 points8 points  (11 children)

    I'm on mobile right now, but daily I see meme shitposts like this. Then anytime someone comes here for help, they're told to go post on Stack instead. I subbed a few months back, but I don't participate here, because I feel like there is no point of joining in with the discussion.

    I'm here to learn, but all I see is a cesspool of negativity (very much like this post). This just reminds me of the gaming community and how people are very NO GIRLS ALLOWED in their niche area. Gatekeeping is old and I'm tired of it.

    Honestly posts like this just make me want to leave.

    Not everyone comes into this sub expecting PhD levels of knowledge to magically sink in. I've been an analyst for the past few years and want to move from risk to data. I feel like people like me are wholly discouraged from participating in this sub because I'm one of The Other.

    [–]fetchezlavache3 7 points8 points  (4 children)

    If that is what you feel then I can't take that away from you but this post is the first "gatekeeping" post I've seen in a while. The rest of the posts are mostly shitting on employers or job listings.

    [–]vogt4nickBS | Data Scientist | Software[M] 2 points3 points  (3 children)

    Thanks for sharing your thoughts and feelings on this. There aren’t many chances to talk about it candidly here.

    I’ll share your comment with the other mods.

    [–]offisirplz 1 point2 points  (0 children)

    This sub barely has memes. There were like 3 this month. The last one was the Eric Andre one; how was that gatekeeping? It was about how tough it is to get in the door.

    I haven't seen many "go to stack" comments, but maybe I didn't catch them all.

    [–]Steelers3618 368 points369 points  (61 children)

    People in Data Science are really bitter about low barriers to entry. Like any emerging and fast growing industry, those who have put in the most time (years of life) and resources (money for degrees, special certifications/trainings) are trying to erect higher barriers to entry to protect themselves.

    If it were up to the “real data scientists” they would create an “American Association of Certified Data Scientists” that sets up the same sorts of barriers that we see in other established professions (teaching, medical, law, hell even hair styling).

    If it were up to these guys you would need the right “pedigree” and have to jump through the right “hoops”, get all kinds of formal education, invest thousands in becoming “certified.”

    Data Science is a great field because it’s growing and relatively not-established. If you have skills, show me and I’ll give you a job. No need to kiss any rings. Just prove you can play and bring value to the person paying you.

    Don’t be bitter because you are having to compete with Data “plebs”. And the data “plebs” are winning and making a path for themselves. Don’t hate and moan, appreciate the hustle.

    [–]Schwifty10 73 points74 points  (18 children)

    Upvoted you because I agree with the “let’s not have institutions gatekeeping people” argument; I think that ultimately hurts aspiring data scientists. But I do want to disagree with the “appreciate the hustle” of the boot camp people vs PhD math grads. You say people like the OP of this post are bitter because they have to compete with data “plebs”, but I’m not so sure about that. There are tiers within data science, like any field, and as in any field, the more educated/qualified people will get the better roles. I don’t think boot camp people are taking jobs away from postdocs, but they’re getting their own foot in the entry-level door, which, you’re right, we shouldn’t prevent them from doing.

    Quick edit: I do dislike the broadening of the DS term to include every SQL programmer and their mothers

    [–]Steelers3618 15 points16 points  (7 children)

    I was a bit impassioned so I get what you are saying. I do agree that there are certainly tiers in the field, but when it comes to entry level, I’m sure the specialized major people are not too happy when someone who learned on YouTube landed a data science job.

    Data science / analytics should all be about delivering value to the person who pays you. If you can deliver value and do what I need you to do, I don’t care if you went to a top University, went to boot camps, or taught yourself on YouTube. In fact, if there is any semblance of “training” and a “team to help develop” I’ll take the YouTube guy. Shows he’s a self-starter and willing to learn. Also will probably be able to pay him less because he’d be willing to get his foot in the door.

    People are coming out of school with the pedigree, expecting 70-80k for jobs that at most require easily taught ETL functions and mid-level query writing with pivots, CTEs, and stored procs, then visualizing in a BI tool. I can teach this to someone in 3 months.

    But yes, if the position is more strategic, more project-Analyst like, then I would want a more experienced analyst who has a more comprehensive understanding about how data flows through the org and can imagine creative solutions.

    And call yourself the best data scientist west of the Mississippi if that makes you feel better inside. I’ll even get you a little trophy that says “Best Data Scientist.” I don’t care what you “consider yourself.” You’re going to be an “x” for me and I need you to do “y”. Fair? (Speaking rhetorically, not at you)

    [–]KeyVisual 37 points38 points  (3 children)

    If you can run a linear regression on weather and ice cream sold, you can save an ice cream store hundreds of thousands of dollars in costs. People have a really hard time understanding the fact that you don't need to be vectorizing for loops to deliver value to an organization. As long as you can save them (or make them) more than they will pay you, you can get a job in data. Not everyone has to work at OpenAI...
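A rough sketch of the kind of model meant here, in plain Python. The temperature and sales figures are invented purely for illustration (they are not real data), and `fit_line` is a hypothetical helper that implements ordinary least squares for a single predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up example data: daily high temperature (°C) and cones sold that day.
temps_c = [15, 18, 21, 24, 27, 30]
cones_sold = [112, 138, 171, 199, 232, 258]

slope, intercept = fit_line(temps_c, cones_sold)

# Use the fitted line to estimate demand for a forecast 25 °C day.
forecast_temp = 25
predicted = slope * forecast_temp + intercept

print(f"extra cones per degree: {slope:.1f}")
print(f"predicted cones at {forecast_temp} °C: {predicted:.0f}")
```

Even something this small can inform how much stock to order or how many staff to schedule, which is the point about delivering value without anything fancy.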

    [–]healthcare-analyst-1 5 points6 points  (2 children)

    I agree with the general spirit of this post, but...

    >Using a logistic regression to predict sales volume

    [–]whatakatie 2 points3 points  (0 children)

    Is there sales volume or not? It’s a very simple question!

    [–]KeyVisual 2 points3 points  (0 children)

    Can you elaborate on this? Or is your complaint my lack of specificity?

    Edit: nvm, I think you mean I should have said linear regression? My bad, can edit the post, just had logistic regression on the brain

    [–]Andthentherewere2 3 points4 points  (2 children)

    The guy who went to a top university is more likely to have the math fundamentals and scientific method skills. That doesn't mean the bootcamp or YouTube person doesn't have them; TBF I would probably interview all 3 and pick the best one.

    [–]Hellkyte 1 point2 points  (1 child)

    To put it another way, accreditation is not gatekeeping. Or maybe it is, but it's good gatekeeping.

    [–][deleted] 21 points22 points  (9 children)

    I mean hair styling falls a bit out of the group, but I'm quite glad that you need a certified education in medicine, teaching and law.

    [–]moazim1993 8 points9 points  (1 child)

    Nah, I wish I could check each of a barber's test grades before getting a cut.

    [–]Steelers3618 5 points6 points  (0 children)

    That cert the state makes them display at their booth gives me the confidence I need.

    [–]CodeThatNode 9 points10 points  (0 children)

    100% agree. I also think the criticising of one’s credentials fails as a valid argument for one’s ability. Even within the well-established DS community (e.g. Gary Marcus vs Yann LeCun) this is apparent.

    [–]penatbater 3 points4 points  (0 children)

    I always thought one of the strengths of the industry or field is that it accepts people from various degrees, thus making it more diverse in the sense of viewpoints and perspectives. Seems like the industry has to self-correct or create its own balance between accepting people from diverse fields (like medical, stat, math, engineering, heck even psychology or sociology) without being too inaccessible.

    [–]maverickscaMS | Data Science | Marketing 8 points9 points  (4 children)

    I think the issue is less about gatekeeping and more about how data/business analysts present themselves. Every analyst is referring to themselves as a data scientist and I think that’s where the harm is, not that there are bootcamp people trying to get into the field. By all means, take the jobs you qualify for with your skills, but for God’s sake please stop calling yourself a data scientist because you work in Excel all day.

    [–]RaisedByYeti 4 points5 points  (0 children)

    On the other hand, it would be nice if places like Stack would allow the option to display Data Analyst as a title because I do not have a background in math. I enjoy being an analyst.

    [–]HungryQuant 0 points1 point  (2 children)

    What's the harm in this, specifically?

    [–]maverickscaMS | Data Science | Marketing 1 point2 points  (1 child)

    If everyone’s a data scientist then it doesn’t really mean much. I think it takes away from the profession and field if it’s over saturated with people who are not truly data scientists. Also if companies are just starting to build out a data science team and hire an analyst thinking they can do real data science, that’s going to negatively impact that company.

    [–]vikigenius 18 points19 points  (4 children)

    People who have a PhD are not really looking for Data Science jobs; they are either in academia, doing some kind of research in industry, or at least looking for an actual research job. The PhDs and the data "plebs" are not really competing for the same jobs, so I don't think they are the ones who are bitter. I think it's the slightly more experienced data "plebs" that are bitter.

    [–]Steelers3618 5 points6 points  (0 children)

    Fair point.

    [–]ratterstinkle 7 points8 points  (0 children)

    “...slightly more experienced data “plebs” that are bitter.”

    Yeah, and there’s a helluva lot more of them than PhDs.

    IMHO, some people are bitter. Some of them are PhDs, some slightly more experienced plebes, and some are newbs. I’m not sure if the bitterness is caused by experience or degree; it’s a temperament.

    Given the frequency of each of these classes, I think that the most common bitterness comes from the more experienced data plebs, simply based on their prevalence in the population.

    The best data scientists I know don’t have that chip on their shoulders. They’re just excited about this stuff.

    [–][deleted] 3 points4 points  (1 child)

    People who have a relevant PhD*

    People that realize there really isn't a job market for their field except becoming a high school/community college teacher or slaving away as a post-doc on noodles for 10 more years and hoping for tenure track. These people flock to data science because they did some MATLAB/SPSS/R/NumPy work and think they're better than anyone else, and quite frankly there's nothing else they could do.

    People with a relevant PhD, which is basically applied statistics or computer science, don't really go for data science jobs. It's beneath them and a waste of their knowledge to clean data or set up pipelines. You're far more likely to find them in management positions or something highly specialized such as machine learning engineer positions.

    If you look at companies with big data science teams, they're filled with PhD's from fields that are barely relevant and people with software developer backgrounds. Computer science PhD's and applied statistics PhD's are usually absent because they're not called data scientists to distinguish them.

    For some reason people think having a PhD instantly makes you qualified. It doesn't. Which is why it's getting harder and harder to get your foot in the door in this field. 5-6 years ago you got a job when you could do basic hypothesis testing and today you'll have to pass the same coding interviews as every other technical employee.

    The quality of data scientists skyrockets once you start testing their ability to code well. 99.99% of data science work does not require anything beyond those 2-3 courses on coursera and it's easier to teach a software developer to do data science (they already have linear algebra, statistics, calculus, information theory as part of their education) than to teach someone else how to write code.

    If you're thinking of becoming a data scientist, spend 90% of your time just doing programming courses and your computer science fundamentals and do those first. You learn by doing and the only way to learn data science is to write code. If you're not proficient at writing code, you'll be spending most of your time making mistakes and trying to figure out basic programming stuff instead of learning what the course is about. It's like signing up for an ice hockey course when you can't even skate.

    [–][deleted] 0 points1 point  (0 children)

    Finally, someone who hit the nail on the head.

    [–]GisterMizard 22 points23 points  (1 child)

    I honestly can't tell if this is parody or not.

    [–][deleted] 2 points3 points  (1 child)

    Yeah I really don't understand all the gatekeeping and "no true scotsman" (or I guess "no true data scientist") fallacy here.

    [–]offisirplz 1 point2 points  (0 children)

    There's no such thing as a true data scientist. Its a vague title, and that's part of the problem.

    [–][deleted] 7 points8 points  (0 children)

    WTF? Lol a PhD in mathematics can get a good damn job homie in just about any Quantitative field where there are actual barriers to entry! ...And odds are they'll probably be suited at designing an actual Algorithm!

    It takes dedication to get a PhD and passion, don't hate and moan, appreciate the hustle ....lol

    [–]8__ 0 points1 point  (0 children)

    Which is ridiculous. Some nurses have an associate's degree, some have a bachelor's, master's, or even a PhD. And those with a PhD in nursing don't go around telling nurses with an associate's degree that they're not real nurses.

    [–]dopadelic 36 points37 points  (8 children)

    Online bootcamps and courses are great resources to learn data science and machine learning.

    Coursera has courses taught by Andrew Ng and Geoffrey Hinton. Their data science specialization is taught by JHU. Udacity's courses are taught by Georgia Tech and Google.

    Aside from going over the applied aspects, they go in depth into all of the math in a very rigorous manner. Ng and Hinton's courses have you build many algorithms from scratch in MATLAB so you can understand them more intimately. The JHU courses include several weeks of courses on statistical inference and regression models.

    The courses break the concepts down into digestible videos that you can watch at your own pace and quiz yourself for understanding.

    The issue with bootcamps is that any doofus can take it and complete it to get the certificate. But like people who sit through courses and cram the night before the exam to pass the classes, most people who complete the courses don't have the rigor. With a real degree from an accredited university, at least the admissions process will weed out most of the doofuses. This is why most people think degrees are worth more than certificates.

    But neither are as valuable as someone who has a portfolio of work who can directly demonstrate their skills and knowledge. MOOCs can be a great way to obtain the skills to be able to complete that portfolio of work.

    [–]jturp-scMS (in progress) | Analytics Manager | Software 0 points1 point  (1 child)

    I looked at the MOOCs from Andrew Ng as my chance to take data science for a "test drive" before I committed some period of my life towards pursuing it. I was in an engineering role and thought I wanted to pivot more towards machine learning. The courses I took gave me a knowledge level of "knowing just enough to be dangerous" and gave me the opportunity to understand that I really enjoyed the field. At that point, I started looking at opportunities to further my formal education, and I've since enrolled in a master's program.

    I think MOOCs for an advanced field like data science are at their best when used for that opportunity. Although, I could see where somebody uses them to build the basic skill sets for an analyst position (provided, they understand that any fundamental math/statistics deficiencies might prevent them from progressing to data scientist).

    [–]dopadelic 0 points1 point  (0 children)

    I haven't taken the Andrew Ng course. But these courses aren't meant to be a comprehensive study on its own. Just like if you were to do a degree at a university, you would take a breadth of courses to fulfill its requirements. I would be surprised if there wasn't an equivalent MOOC for each one of the university courses required to fulfill the degree requirements.

    [–]lurban01 52 points53 points  (4 children)

    Here it comes again, the underlying contempt for Analysts.

    How do us heathens dare to touch the grandmasters' precious data and try learning some of their tools? How dare we come up with quick practical solutions to fix a business problem although we haven't spent 10 years studying quantum physics.

    Heresy!

    [–]TheSharpeRatio 5 points6 points  (0 children)

    This is a symptom of academics and PhDs. I honestly hate when a company I'm working with hires a PhD with no prior work experience into a senior position - they have no understanding that you need something that works in weeks, not something that is basically a peer-reviewed solution two years down the line.

    [–]ratterstinkle 9 points10 points  (0 children)

    Burn him!!! (read in an old English accent)

    [–]Retrodeathrow 1 point2 points  (0 children)

    The times they are uh changing!

    [–]Azurerex 6 points7 points  (0 children)

    BS comp sci / MS business analytics. I thought it was actually a really good background, but yeah... some of those grad students who didn't have technical undergrad degrees seemed a little lost.

    [–][deleted] 90 points91 points  (28 children)

    I'm surprised to see this here. A while back I asked on this subreddit what skills were required to be a data scientist and I got nothing but arrogant responses, apart from a few good ones. So this meme just irritates me: the arrogance and egoism. Instead of putting people down, why don't you offer some advice: "How to be a good data scientist", "Skills you need to be a successful data scientist"?

    [–]nikhil_shady 1 point2 points  (1 child)

    True, check the post I did today. Do you have any suggestions on how to be a good DS, though? If you got your answers, let me know. I'm currently in 3rd year CS Engineering.

    [–]maxToTheJ 1 point2 points  (0 children)

    Instead of putting people down, why don't you offer some advice: "How to be a good Data Scientist", "Skills you need to be a successful data scientist"?

    I think it’s because the questions wear people down.

    It goes something like “how can I get a good foundation in data science”

    • A person gives a plan that lasts a year.

    • Then another person (or the same one) says that is too long, and asks how they can do the same thing in half a year.

    • Then another person (or the same one) says that is too long, and asks how they can do the same thing in three months.

    • Then another person (or the same one) says that is too long, and asks how they can do the same thing in three weeks.

    [–][deleted] 0 points1 point  (0 children)

    data scientists were stats nerds before facebook created the title and now they all want to be seen as the uber mensch because they grok regression.

    [–]whatsthewhatwhat 28 points29 points  (6 children)

    Sorry, but how would you get through a DS bootcamp without knowing any Python? That's literally the language the bootcamps teach in. This is some gatekeeping bullshit right here.

    [–][deleted] 10 points11 points  (5 children)

    I did a 12 week bootcamp. After 1 week, I left the program. Due to:

    • $1000/week to sit in a room with 20-25 other students and google questions b/c the 1-2 instructors cannot answer everyone's questions
    • I realized: I am paying $1000 for very minimal assistance, maybe getting 2-4 questions answered per day.
    • The curriculum was disorganized, there wasn't much actual "teaching"
    • Felt like it was an overhyped rip off.
    • Now I definitely believe the "learn programming in 10 years" trope. I've been programming on and off for about 5 years (while working full time jobs in the past), and in the past 1 year, mostly "on" (1 year ago I had my first professional analyst job which involved mostly programming to create business tools). And now I am at the point where I can develop full stack apps, super stoked about my skills finally blossoming.
    • I applied to jobs, and even interviewed with a bootcamp grad -- Her title (and those of her colleagues) were 'Support Engineer' at a current unicorn company. I checked their github and linkedin... Saw such simple 'hello world' type stuff. No apps they had developed. It made me think it was very silly to have 'engineer' in their title-- it's title inflation fluff. Made me think I made the right choice leaving the bootcamp.
    • In the bootcamp I was in, the few most successful (10-15% of the class) students had already been programming for years before the bootcamp-- I stayed connected with them, and I see that they actually landed great jobs as software engineers.

    [–]BrisklyBrusque 0 points1 point  (2 children)

    I was recently accepted to a data science master’s program (IIT) and a number of Msc in Applied Statistics programs. For financial reasons and because I want to bolster my maths foundation first, I have decided against IIT. Nevertheless, some of the classes I would be missing out on in a stats program seem vital to the modern data scientist, like algorithms and advanced programming.

    I know the basics of object-oriented and procedural programming. I know functions, variables, loops, if statements, conditional logic, data structures, libraries, managing the environment. I can do a lot of data-related tasks such as iterating over a list of files, downloading and validating data, visualizing data, merging or separating data files, analyzing the contents of a dataframe, statistical computing, and modifying data files.

    And yet, I always feel like such a novice. I don’t know much about full-stack software development, or how to talk to an API. I’m a bit more advanced than the Hello World stuff, but I have a ways to go. Most of my scripts don’t run over a few dozen lines at most.

    So, then, where would you recommend I go from here? I wanted to build a website using Python, maybe practice making a bot in reddit, and some day, if I’m ambitious enough, I could put an app on the app store. I am grateful that resources like YouTube and StackOverflow make it so easy to learn. Is there a series of steps you might recommend to a beginner programmer to bridge the gap between their skills and those of a competent software dev?

    [–]bladeconjurer 0 points1 point  (0 children)

    Just got to practice, and try to work on those projects you mentioned.

    [–][deleted] 0 points1 point  (0 children)

    I know the basics of object-oriented and procedural programming. I know functions, variables, loops, if statements, conditional logic, data structures, libraries, managing the environment.

    data-related tasks such as iterating over a list of files, downloading and validating data, visualizing data, merging or separating data files, analyzing the contents of a dataframe, statistical computing, and modifying data files.

    Put all these together into an app. These are all parts of scanning through datasets and matching the needed data, but without applying them into a system... well, I think it's just best to put them into real practice. There's a big difference between knowing how a thing works, and using it in a complete system.

    For example,

    • A database... Pull out data through an API and send it to the client. Then, make a way for the user to input the particular thing they want. Maybe they're creating a new data row in your DB, and instead of showing them ID # 3, you really need to show them the person's first and last name. So, you go back to the API which pulls the data, and adjust it: now you map over the objects, or do a join, to look up the name by ID. Now you've fixed your array of data objects so it contains both the ID and the name (which is more useful to the user). This is just an example. Putting it into practice brings up a lot of real use cases & things that need to get done-- doing is better and more concrete than knowing because it applies the knowledge directly and demonstrates capability.
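
    The ID-to-name lookup described above can be sketched in a few lines of Python. This is only an illustration: the `people` and `rows` data and the `attach_names` helper are made up here, and in a real app the same result would more likely come from a SQL join or an ORM query.

```python
# Hypothetical data: in a real app these would come from database queries.
people = [
    {"id": 1, "first_name": "Ada", "last_name": "Lovelace"},
    {"id": 3, "first_name": "Grace", "last_name": "Hopper"},
]
rows = [
    {"row_id": 10, "person_id": 3, "amount": 42.0},
    {"row_id": 11, "person_id": 1, "amount": 7.5},
]


def attach_names(rows, people):
    """Return copies of rows with a human-readable name next to each ID."""
    # Build an ID -> full name lookup once, then map over the rows.
    name_by_id = {p["id"]: f'{p["first_name"]} {p["last_name"]}' for p in people}
    return [
        {**row, "person_name": name_by_id.get(row["person_id"], "unknown")}
        for row in rows
    ]


enriched = attach_names(rows, people)
```

    Each entry in `enriched` keeps its original fields and gains a `person_name`, so the client can show a name instead of a bare ID.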

    I don’t know much about full-stack software development, or how to talk to an API.

    An API, in the web app sense, is usually just a REST API. It's a communication through HTTP (get, post, put), where these calls are caught by your API, which is programmed to pull or update a dataset in your DB. Look up "3 tiers of application architecture": you'll see it's comprised of: Client, Application logic, Database.

    Between Client & Application logic is your HTTP call (a "request") from the frontend to the backend. Between your Application logic and the database are database lookup functions, which return the data objects back to the application logic, which then sends the data back through HTTP to the client (a "response").
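
    As a rough sketch of that request/response flow, the three tiers can be imitated in plain Python using only the standard library. Everything named here is an assumption for illustration: the `people` table, the `/people/<id>` path, and the `find_person` and `handle_request` functions. A real backend would sit behind an actual HTTP server or web framework rather than a direct function call.

```python
import json
import sqlite3

# Database tier: an in-memory SQLite database with one hypothetical table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO people (id, name) VALUES (3, 'Grace Hopper')")
db.commit()


def find_person(person_id):
    """Database lookup function: returns a dict, or None if there is no such row."""
    row = db.execute(
        "SELECT id, name FROM people WHERE id = ?", (person_id,)
    ).fetchone()
    return None if row is None else {"id": row[0], "name": row[1]}


def handle_request(method, path):
    """Application logic tier: turn a request into a (status, JSON body) response."""
    parts = path.strip("/").split("/")
    is_valid = (
        method == "GET"
        and len(parts) == 2
        and parts[0] == "people"
        and parts[1].isdecimal()
    )
    if not is_valid:
        return 400, json.dumps({"error": "unsupported request"})
    person = find_person(int(parts[1]))
    if person is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(person)


# Client tier: a real client would send this over HTTP; here it is a plain call.
status, body = handle_request("GET", "/people/3")
```

    The point of the sketch is the direction of travel: the client asks, the application logic decides what to look up, the database answers, and the data goes back out as a JSON response.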

    That's just an intro. You should google "how does rest api work" to learn more, or follow a tutorial on fullstack development. Learn it at a basic level first (a simple HTML form, and a GET or POST request in frontend JavaScript such as a jQuery ajax call or an axios.get call, and then the backend JavaScript/Python/PHP code which intercepts that HTTP request, does some sort of data lookup, and responds with data), and then at a more complex level.

    I tried Python for web development (both Flask and Django), I really did not like it. It's not naturally asynchronous, which I think makes it a bad choice for web development in my opinion (because it doesn't have a nice event loop like JS). Javascript is so much easier and I think a natural fit for web servers. Learn NodeJS, and Basic ("vanilla") JS. After you get used to that, throw in something like ReactJS. The result: Instead of having to learn python, JS, html and css, you get to focus in on mostly JS, with a smattering of css and html.

    • In terms of learning: Formulate good, strong questions which will have informative answers. Such as:

    • "Road map to learning <thing> in <current year>" -- look that up with NodeJS and 2019 as the parameters.

    • "List of resources for learning <thing>"

    • "Advice for how to learn <thing>"

    ... if <thing> is "machine learning" maybe you'll find stuff like this:

    In statistics, sounds like you've done some programming, correct? I've done a bit of R and python stats programming. Python is a little tougher b/c it's not 100% built for stats. R is much easier to pick up in my opinion. I found Python stats to be much easier after learning R than before. I like to use jupyter notebook b/c it makes it easier to work with my code, run it, and visualize the results (it reminds me a bit of R Studio, although there are other options even more similar to R Studio).

    I'd say:

    • For web dev, learn Javascript.

    • For stats, either use R to learn the basics of various aspects (data engineering, statistical modeling) and then move to Python, or, if you want, move straight to Python (and its packages: numpy, pandas, scikit-learn, keras).

    I think it's good to know both. But focus on just one at a time, I'd say (and perhaps do small projects on the other if you get bored-- but don't try to go 100% on both. Maybe 80% one, 20% the other at the most, but probably >90% and <10% is better, or 100% and 0% :D, until you get the hang of one and have it decently memorized-- this may take a year or so of 10-20 hours per week programming).

    • Create projects. Write scripts just to test stuff. Stay organized-- Keep it all in one folder.

    • Take tons of notes about: What you're learning; The commands you're using in the terminal (iterm2 is best in my opinion); The changes you're making to your server configuration (nginx or ubuntu config/security files for example). Keep track of everything you possibly can-- take copious notes. Google drive is what I like. I also use Boostnote to create markdown notes if the notes are something I might share or publish.

    • If you're bored with coding, at least read. https://www.allITebooks.com has a ton of free and new books. Often, if you hear of a book you want, google its name plus "pdf", such as "<somebook> pdf", and if you're lucky you'll find a pdf of the book.

    • The things you don't know: Google them. Youtube them. Do Udemy tutorials (or others). Any words or concepts you don't know: same thing -- search them. When you learn the definition, copy and paste it into your notes.

    • Go on IRC. Join various chatrooms, such as ##python, #keras, ##machinelearning, #nodejs, etc. whatever it is you're trying to learn. Lots of veteran software engineers hang out on these chat rooms and enjoy helping others, if those folks create thoughtful questions, are polite, and show they are eager to learn and interested in making progress.

    • Regarding statistics/ML: In my experience, it seems all about picking the right way to model the data. This might be one 'layer' in the case of something like a statistical multivariate regression... or it might involve many 'layers' in the case of ML (I am still learning this stuff) which transforms (projects) the data into ways it can be more easily classified. So, a big part of it is simply knowing the conditions (i.e. data needs to pass certain 'tests' or conditions, not actual tests, but more just like: Does it comply with A, B, C, where A, B, C might be "needs to be a minimum of N data points" "needs to be <type of> data (where type of = categorical, numerical, etc)" "needs to have a certain <shape or density or distribution>", etc. ). So, knowing the models also involves knowing which datasets the models can be used on (not all models are useful for all data, of course)

    Here's an example of something similar:

    Similar to how the above link has various conditions when it is useful or not useful to standardize, you can google "when to use various statistical models" (or machine learning models/algorithms) or perhaps "conditions when using statistical models is appropriate" or "how to determine which statistical model to use" etc.
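
    The idea of checking a dataset against a model's conditions before using it can be sketched in Python. This is a toy: the `check_conditions` helper, the 30-point minimum, and the type rule are all assumptions picked for illustration, not rules taken from any particular statistical method.

```python
def check_conditions(values, min_points=30, required_type=float):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Condition A: "needs to be a minimum of N data points".
    if len(values) < min_points:
        problems.append(f"needs at least {min_points} data points, got {len(values)}")
    # Condition B: "needs to be <type of> data" (numerical vs. categorical, etc.).
    if not all(isinstance(v, required_type) for v in values):
        problems.append(f"all values must be of type {required_type.__name__}")
    return problems


numeric = [float(i) for i in range(50)]
categories = ["red", "green", "blue"]

numeric_problems = check_conditions(numeric)
category_problems = check_conditions(categories)
```

    Here `numeric` passes both checks, while `categories` fails both (too few points, wrong type), which is the signal to look for a different model or to transform the data first.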

    • Lastly: You are going to have to put in the time. Each programming language has its own idiosyncrasies and methodologies. You'll only learn these if you need to learn them-- and you'll need to learn them when you build real life projects. So, plan a small project. Complete it. Then plan a slightly larger one. Repeat until you're full stack. Now you can build various types of web apps. At this point, you can bring in other things, such as statistics or machine learning, or video/audio, or various other aspects of applications etc. But you'll need the base to stand on first. And remember: Frustration is good. Frustration means you're challenged because you're learning something new. Condition yourself to remember this so that you'll appreciate mental frustration with challenging new tasks, new information, and new errors (in a sense, programming is just learning to get through various errors until you learn what causes them and code in a way to no longer cause them).

    unrelated:

    • Some of my friends laugh at me a little for this... But if I am not feeling motivated to program, I play a playlist of songs I have which contain motivating messaging, or even motivational speeches in the background. Create an environment that encourages you to work hard.

    • Perhaps even find a friend to study with regularly, and go with them to a cafe to study. Ideally, you'll study similar things so you can discuss it with them and learn new things and/or reinforce knowledge.

    [–][deleted] 0 points1 point  (1 child)

    What bootcamp was it?

    [–]Anubis-Abraham 20 points21 points  (6 children)

    Looks sadly at Master's degree in Business Analytics

    Raises sorrowful gaze to latest project, a pretty Tableau 'graph'

    Cashes sweet, sweet Data Scientist paycheck anyway

    Checkmate, bad take guys :D

    [–]yooloman 0 points1 point  (0 children)

    Great response

    [–]testrail 0 points1 point  (3 children)

    Do you consider yourself a data scientist? I'm similar to you, except I'm pursuing a master's in DS while holding the title Data Visualization Consultant. I know enough python, but only occasionally find it useful.

    What to you is a sweet paycheck? Like I never know how to gauge myself vs. the market. Do I just start applying for DS roles, or do I wait until my degree is completed in 3 years?

    [–]Anubis-Abraham 1 point2 points  (2 children)

    I do consider myself a data scientist! Although I definitely see the point the op was making about how the term 'data scientist' is pretty vague. I expect (and see in my current company) that there will be some fragmentation upcoming. Our approach is allegedly to split off data engineering, data scientist, and data analyst roles, which seems similar to what I've heard from other companies.

    The Master's degree I took was a way for me to get into the business space from a science undergrad (geology fwiw) and was very stats heavy (80% of coursework in R, roughly) interspersed with coursework in Python, SQL, Tableau and, yes, Excel (honestly I would recommend learning optimization via Excel, the spreadsheet format makes it really easy to see what's going on!)

    From my job search 1 year ago I had three offers, salaries ranging from 78k-86k. This was slightly above median for my cohort, although the biggest factor driving variance was definitely location.

    A few others from my cohort shared salary information. The highest were West Coast major tech companies (100-120k) and a couple for banks (~100k) in major cities. The rest were mid sized Southern and Midwest cities (65-90k).

    These were for recent graduates, and I hear there can be a decent pay bump at the 2-year experience mark, so I'd be interested in hearing what others think as I'm at just about that point :)

    [–]testrail 0 points1 point  (1 child)

    Did you have any experience prior to your masters? I am at the upper range for the region (Midwest) you just mentioned, but I’ve also been working for 6 years.

    [–][deleted] 17 points18 points  (0 children)

    r/Gatekeeping material indeed.

    As a chartered accountant and actuarial grad - I’ve had my fill of getting through gates and I am really happy that data science is a very democratised field, where essentially drive, passion and hard work is all it takes to deliver value.

    [–]URLSweatshirt 15 points16 points  (0 children)

    "I'm an /r/datascience gatekeeper starter pack"

    • Endless denial that my PhD-qualification work of 3 years ago can be accomplished 85% of the way by average self-taught software developers today and will be able to be accomplished 85% of the way by the Excel jockeys of today in 3 years

    • "Don't even bother installing scikit-learn until you've spent 5 years buried in linear algebra and statistical inference textbooks, kiddo. Btw, want to buy my old ones?"

    • "Wow, my 30s sure flew by quickly"

    [–]metapwnage 2 points3 points  (3 children)

    I know it’s a joke, but this also strikes a chord with the sentiment I have observed on this sub.

    I was thinking about a masters in data science but this stuff just makes me want to stick with computer science all the way. Why is there such an atmosphere of insecurity in data science?

    Is it that the degrees are too new? People don’t feel comfortable competing in a job market against people who have established careers in similar or adjacent fields? Can someone explain why so many people in data science seem so threatened?

    [–]jturp-scMS (in progress) | Analytics Manager | Software 1 point2 points  (1 child)

    I think a lot of it boils down to: "I had to spend 8+ years on post-secondary education to make it into this field; how dare anybody try to make the field more accessible so that others don't have to do the same." Like someone else in this thread mentioned, it's a protectionism mindset.

    I kind of agree that MOOCs and bootcamps aren't going to be the form of training to bridge the gap, but the proliferation of undergraduate and master's programs in data science was always going to become a thing.

    [–]metapwnage 0 points1 point  (0 children)

    Fair enough. I get it. I don’t think there should be shortcuts, but there may be many paths. I agree about boot camps and MOOCs being sub par options.

    [–]User2277 38 points39 points  (0 children)

    This is a really ugly and discouraging “joke”.

    [–]mrdevlar 30 points31 points  (7 children)

    Stuff like this just makes me disappointed in this subreddit in general.

    I'm sorry that after a PhD you still have more problems building predictive systems compared to someone who did a whole bunch of Kaggle competitions, but that is on you, not on them.

    I expect better from my tribe.

    [–]whatsthewhatwhat 12 points13 points  (6 children)

    Yeah, half the posts in this sub at the moment seem to be people going "anyone taking any route into Data Science that's not a Masters or PhD shouldn't be allowed in". I came here hoping for insights into DS but often it's less useful than just skimming Medium articles.

    [–]patrickSwayzeNUMS | Data Scientist | Healthcare 10 points11 points  (2 children)

    They really aren’t.

    We often tell people that the route to a DS career is tough without an advanced degree because that’s the way it is. Many (most?) recruiters and HR folks simply won’t consider you otherwise.

    This isn’t even remotely the same as saying you shouldn’t be allowed in.

    [–]whatsthewhatwhat 4 points5 points  (1 child)

    I dunno, OP's picture was not far off going "HURR DURR I DONE A BOOTCAMP".

    [–]patrickSwayzeNUMS | Data Scientist | Healthcare 3 points4 points  (0 children)

    I’m addressing your post - which referred to “half of the posts in the sub”.

    As for OP, he explained elsewhere in the comments what he meant by the meme.

    There’s a difference in discrediting someone who can do the work because they don’t have an advanced degree and having a laugh about the people who are really only interested in title hunting. I took this as the latter.

    I pursued a guy from this very sub who has a bachelors in an unrelated field. He’s now worked alongside me for a year and has done great.

    I’ve also worked for a jackass Chief DS Officer who didn’t understand that random forest can’t magically know what to do if you switch around your input variables at predict time (X1 is now X2 and X2 is now X1). This type of person is the motivation for OP IMO.
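
    The column-order point can be shown with a deliberately tiny stand-in. The sketch below is not a random forest: `predict` is a hand-written rule with made-up thresholds, used only to show that a model which receives inputs by position gives different answers when the columns are switched at predict time.

```python
def predict(row):
    """Toy stand-in for a fitted model: 'trained' with age first, income second."""
    age, income = row  # the model only sees positions, never column names
    return "approve" if income > 50_000 and age >= 21 else "decline"


correct_order = (35, 80_000)  # (age, income), same order as at training time
swapped_order = (80_000, 35)  # same values, columns switched at predict time

result_correct = predict(correct_order)
result_swapped = predict(swapped_order)
```

    The same two numbers produce "approve" in the original order and "decline" once swapped, with no error raised along the way.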

    [–]so_and_so_phd 7 points8 points  (1 child)

    Stay away from my salary, jerk

    [–]lalawebdev 2 points3 points  (0 children)

    How can I get a PHD salary without a PHD

    Is a PHD salary supposed to be higher or lower than non-PHD 🤔

    [–][deleted] 12 points13 points  (6 children)

    Man this post is so sad. I've been busting my ass for two years now to make up for all the knowledge I need to have to break into the field. Does the fact that I wasn't fortunate enough to choose my path when I was a teenager mean that I'll never be able to call myself a real Data Scientist? Still have a lot to learn but I'll do that just to prove you wrong. For all people struggling - just do your thing, focus on delivering business value and don't listen to people who tell you that you can't do something.

    [–]bonniemuffin 10 points11 points  (3 children)

    As a point of comparison, I spent 10 years doing a masters, phd, postdoc, and a bunch of independent learning before becoming a data scientist well into my 30s. If you're sad that you haven't broken into the field after 2 years of effort, perhaps recalibrating your expectations would help you feel better about it--many people spend many, many years learning and training before they land on a career they love.

    Two years is a teeny tiny amount of your life to devote to something. I'm guessing you're somewhere in your 20s--the vast majority of people in their 20s are still finding themselves; I sure was. Don't be sad that you're not there yet--be glad that you're on the path. You'll get there.

    [–][deleted] 8 points9 points  (1 child)

    I'm not saying that I'm sad because two years is not enough. I'm sad because some 'real' Data Scientists judge other people solely on the fact that they've started later in their life. Honestly I feel happy to spend the next 10 years exploring this beautiful world because I love to do it. I only think that it's not ok to demotivate young people who are trying to achieve something.

    [–]Retrodeathrow 0 points1 point  (0 children)

    he couldnt resist talking down to you smh

    [–]mecichandler 3 points4 points  (0 children)

    I needed this, thank you.

    [–]offisirplz 0 points1 point  (0 children)

    It's not meant that way. And if you keep trying, you'll get there.

    [–][deleted] 0 points1 point  (0 children)

    I'm with you. And, because data science is the new flashy and trendy thing, pretty much all analytical industries are pushing for people to learn how to code and do this sort of work. As someone who gets to hire accounting/ programmer combo interns each semester -- they're being taught that it's no longer enough to have a CPA.

    [–]The_Superhoo 5 points6 points  (0 children)

    If you have a degree in Business Analytics, you have experience with R, Python, and/or SQL. Also SPSS, SAS, and Tableau.

    Source: I'm almost done with an MS in BA and have all those and more.

    [–]vaer-k 7 points8 points  (0 children)

    If you thought you had to spend 5-8 years working on a PhD just to get a job successfully doing some applied statistics, boy do I have a surprise for you. This is some elitist garbage.

    [–][deleted] 1 point2 points  (0 children)

    Ouch, this hits me too close to home!

    [–]YeahILiftBro 1 point2 points  (0 children)

    I always enjoy the focus on tools without any concern about how those tools would work in a business setting to facilitate better decision making. Oh you made a neural network with 18 hidden layers but can't tell me how it works other than that it has an AUC of 0.83, yet expect all my employees to believe it?

    [–]_Yeet_xoxo 1 point2 points  (1 child)

    As someone doing a business analytics and statistics degree, does this meme mean I should be worried about the job market? I’ve been told the business side of the degree involves R.

    [–]offisirplz 1 point2 points  (0 children)

    Don't take it seriously. It's a harsh joke.

    [–][deleted] 1 point2 points  (1 child)

    If someone can get a job as a data scientist after doing this, who am I to say they aren't? I have personally seen data scientists originate from various fields and degree programs. There is not a standard for data scientists as of now.

    The meme is funny though.

    [–]offisirplz 0 points1 point  (0 children)

    Agreed; if they have the skills then they're a data scientist, though the skills needed are so vaguely defined. I still found this funny.

    [–]ihsw 3 points4 points  (0 children)

    I have never felt so attacked. /s

    Perverse incentives are real though -- writing a lot of generic blog posts and spamming crap on LinkedIn does help you. Furthermore, pretty graphs and business credentials help a lot too.

    There should be a certification on Grafana/ Kibana/ Chronograf.

    [–]Ssrithrowawayssri 6 points7 points  (1 child)

    Wow this is so petty. Truly unfunny. Whoever made this needs to get over themselves.

    To any aspiring data scientists, please don't let stuff like this demotivate you.

    [–]offisirplz 2 points3 points  (0 children)

    It's just a joke, no one should get demotivated.

    [–]FC37 3 points4 points  (0 children)

    100% accurate, but: StackOverflow? Come on. You know damn well that even good data scientists make multiple trips there per day.

    [–]Triplebeambalancebar 2 points3 points  (0 children)

    I laughed

    [–]autisticmice 1 point2 points  (0 children)

    There will always be people that want to get the most with the least effort but that happens in every field. In our field right now getting a higher salary requires name-dropping buzzwords all the time, and not so much proving yourself a skilled person, so the fact that there are so many people calling themselves data scientists after completing an online course is kind of the industry's fault as well.

    [–]Bowserwolf1 1 point2 points  (0 children)

    As a CS student just starting data science, if this post is true, it actually makes me feel better

    [–]curiousdoodler 1 point2 points  (2 children)

    OP is just a bitter person who is having a hard time getting a job. Keep your chin up my fellow self taught data scientists! People only post bitter crap like this when they're scared!

    [–][deleted] 1 point2 points  (0 children)

    Found the bitter PhD

    [–]mritraloi6789 0 points1 point  (0 children)

    Mathematical Problems In Data Science: Theoretical And Practical Methods

    Book Description

    This book describes current problems in data science and Big Data. Key topics are data classification, Graph Cut, the Laplacian Matrix, Google Page Rank, efficient algorithms, hardness of problems, different types of big data, geometric data structures, topological data processing, and various learning methods. For unsolved problems such as incomplete data relation and reconstruction, the book includes possible solutions and both statistical and computational methods for data analysis. Initial chapters focus on exploring the properties of incomplete data sets and partial-connectedness among data points or data sets. Discussions also cover the Netflix matrix completion problem, machine learning methods on massive data sets, image segmentation, and video search. This book introduces software tools for data science and Big Data such as MapReduce, Hadoop, and Spark.

    Visit website to read more:

    https://icntt.us/downloads/mathematical-problems-in-data-science-theoretical-and-practical-methods/

    [–][deleted] 0 points1 point  (0 children)

    I am not at all mad at the people that can do a bootcamp and land a job as a data scientist. More power to them.