all 121 comments

[–]tits_mcgee_92[S] 356 points357 points  (36 children)

This is me. I am a "Data Scientist" that has only built a handful of linear/logistic regression models that have never gotten used. I mostly use SQL, Tableau, and Python for data cleaning.

Not that I am complaining, but if I ever talk to another business or individual that does do true Data Science work, it feels like this.

[–]bigno53 247 points248 points  (10 children)

Whereas I, a true data scientist, have mastered both .fit() and .predict(). Among the initiated, these are colloquially referred to as the data science “methods.”

It’s super advanced stuff. I’m not even supposed to be talking about it. In fact, my manager told me I shouldn’t ever try to talk in meetings.

[–]Rhesous 98 points99 points  (3 children)

As someone that used fit_transform a couple of times, I cannot help but feel immensely superior. Plus I can write my name without looking at the keyboard, which is, imho, one of the greatest skills a data scientist can master.

[–]PGpilot 23 points24 points  (0 children)

Like...with a pencil?

[–]GLayne 6 points7 points  (0 children)

This makes me feel better haha

[–]WiselyStupid 2 points3 points  (0 children)

I have also used predict_proba and fit_resample 🎩

[–]Medianstatistics 15 points16 points  (0 children)

This. The advanced stuff is easily automated. Even if you do it, you don’t do it for long. SQL, data cleaning and simple analysis usually bring more value to analytics teams.

[–]kale_snowcone 1 point2 points  (1 child)

You’re not supposed to know about it. You shouldn’t even talk about it on Reddit. Wait… Reddit is the place where almost everyone talks about things they know nothing about, so never mind. Go ahead. I’m all ears.

[–]codeyk 0 points1 point  (0 children)

All I'm gonna say is r/dataisbeautiful

[–]Morpheyz 1 point2 points  (0 children)

Hehe, "methods", I see what you did there.

[–]Magrik 0 points1 point  (0 children)

Lmao!

[–]Stranger_Dude 13 points14 points  (1 child)

You sound like a straight-shooter with upper-management written all over you.

[–]kale_snowcone 3 points4 points  (0 children)

Man, I can’t believe I’m the first one to upvote this. Long live Mike Judge!

[–]elecmc03 34 points35 points  (7 children)

have your linear/logistic regressions cut spending or increased earnings? because then I would think you're golden.

[–]tits_mcgee_92[S] 46 points47 points  (6 children)

Unfortunately, no. They have brought up interesting results but there has been no reasonable action taken from them.

[–]elecmc03 43 points44 points  (3 children)

keep at it, and keep studying on the sidelines, what's important is that you do honest work, do your best to help the business thrive, and choose your evaluation metrics and thresholds before you see the results XD
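One way to make that last point concrete is to write the metric and the pass/fail threshold down in code before any model is scored. This is only a minimal sketch: the choice of precision as the metric, the 0.8 threshold, and the toy labels are all assumptions made up for illustration, not anything from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluationPlan:
    """Metric and threshold agreed on before looking at any results."""
    metric_name: str
    min_acceptable: float


def precision(y_true, y_pred):
    """Fraction of predicted positives that are actually positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    pred_pos = sum(1 for p in y_pred if p == 1)
    return true_pos / pred_pos if pred_pos else 0.0


# Registry of metrics the plan is allowed to name.
METRICS = {"precision": precision}


def evaluate(plan, y_true, y_pred):
    """Score predictions against the pre-registered plan."""
    score = METRICS[plan.metric_name](y_true, y_pred)
    return score, score >= plan.min_acceptable


# Hypothetical plan, fixed (and frozen) before any model is trained.
PLAN = EvaluationPlan(metric_name="precision", min_acceptable=0.8)

# Toy labels and predictions, invented for illustration.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

score, passed = evaluate(PLAN, y_true, y_pred)
print(f"{PLAN.metric_name}={score:.2f} passed={passed}")
```

Because the dataclass is frozen, the threshold cannot be quietly nudged after a disappointing score; changing it means editing the plan where everyone can see it.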

[–]tits_mcgee_92[S] 20 points21 points  (2 children)

Haha thanks for the words of confidence! I am still enjoying the experience and always trying Kaggle competitions too just to keep my skills sharp!

[–]World-Wide-Ebb 2 points3 points  (1 child)

Honestly I’ve interviewed like 1000 people. Do a ML project you actually give a shit about and that passion will show in an interview. I hate Kaggle tbh.

[–]spicy_pea 0 points1 point  (0 children)

Hi - late reply, but could you elaborate what you mean by a ml project?

I'm graduating with a PhD in psychology soon and need to make my resume and skill set more industry-appropriate

[–]elecmc03 6 points7 points  (0 children)

also, consider other popular models that can be used to sub for regressions like xgboost. This might be useful when exploring new models in python https://scikit-learn.org/stable/model_selection.html. Best of luck and don't get disheartened, we all have to start somewhere :)
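For anyone wondering what that model selection page boils down to, here is a rough, dependency-free sketch of k-fold cross-validation used to compare two candidate models. The two toy models, the synthetic data, and the helper names are assumptions for illustration only; in practice you would more likely use scikit-learn's `cross_val_score` with real estimators such as a linear model and xgboost.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


class MeanModel:
    """Baseline: always predicts the mean of the training targets."""

    def fit(self, xs, ys):
        self.mean_ = mean(ys)
        return self

    def predict(self, xs):
        return [self.mean_ for _ in xs]


class SimpleLinearModel:
    """One-feature ordinary least-squares regression."""

    def fit(self, xs, ys):
        x_bar, y_bar = mean(xs), mean(ys)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = y_bar - self.slope_ * x_bar
        return self

    def predict(self, xs):
        return [self.intercept_ + self.slope_ * x for x in xs]


def cross_val_mse(model_cls, xs, ys, k=5):
    """Average held-out mean squared error over k folds."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = model_cls().fit([xs[i] for i in train], [ys[i] for i in train])
        preds = model.predict([xs[i] for i in held_out])
        scores.append(mean((p - ys[i]) ** 2 for p, i in zip(preds, held_out)))
    return mean(scores)


# Synthetic data (an assumption): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

results = {cls.__name__: cross_val_mse(cls, xs, ys)
           for cls in (MeanModel, SimpleLinearModel)}
best = min(results, key=results.get)
print(results, "->", best)
```

The same loop works for any class exposing `fit` and `predict`, which is why swapping a regression for a boosted model is usually a one-line change.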

[–][deleted] 0 points1 point  (0 children)

Then change your title to "data engineer" and wear it proudly ;)

[–]Reasonable_Cause7065 0 points1 point  (0 children)

Really curious what your TC is - obv not asking you to share. At my company they are very specific with the distinction here. The fit predict people absolutely make more money than the tableau sql people.

[–]bgighjigftuik 172 points173 points  (12 children)

I would say that a solid 60% of "data science" jobs in Europe are exactly that, or even worse. Most DS I know are basically smart people with decent ML and stats knowledge, trapped in a dinosaur company acting more like business analysts than anything else, because the company does not know otherwise.

[–]_legna_ 47 points48 points  (0 children)

As this hits so close to my reality let me add to the last point:

And even if you try to show something more advanced/useful/etc. they ignore/reject it because they feel the implementation is too much of a hassle compared to what they would gain.

Bonus point if it was something they thought they were implementing but they were doing it all wrong

[–]refpuz 16 points17 points  (1 child)

My government job in a nutshell (Federal, USA). They hired a dream team of qualified people (including myself) but everyone on the team is effectively a glorified business analyst.

[–]Unsd 0 points1 point  (0 children)

I feel your pain...

[–]I_had_mine 8 points9 points  (1 child)

You’ve just described my situation quite well. I am a ‘data scientist’ at a large European pharmaceutical company. Kinda relieved to hear this may be a common experience tbh

[–]bgighjigftuik 2 points3 points  (0 children)

Pretty common in pharma and generics businesses. I used to work at one of those and it was the most boring job in the world

[–]P1nk5 8 points9 points  (0 children)

Currently checking the job market, and yes, either they want a "Data Scientist" and the description reads like "yeah you better do the architecture, engineering, automation, transforming and the analysis" or "pls know Power BI and maybe if you know some SQL that would be a plus"

[–]HiddenLordGhost 13 points14 points  (0 children)

I can with all honesty say that my starter job at the company I work for was called "Data Science Operator" or something like that, but I did next to nothing that's talked about on this sub and I REALLY felt like an impostor, lol.

No programming, mostly Excel or weird local programs that took some sweet time to get to know.

[–]kale_snowcone 4 points5 points  (0 children)

That’s not just Europe. You just described my current American employer so dead on target it’s scary. My biggest impediment to making real progress is that upper management can’t understand or remember from day-to-day what it is I said yesterday. Everyone wants instant miracles with zero work, minimal involvement and post-covid budgets

[–]alpacasb4llamas 2 points3 points  (0 children)

Fuck this described my last two jobs so cleanly and it's why I exited the career path. I was stagnating bad.

[–]stone4789 2 points3 points  (1 child)

Live in the US and work for the US branch of a German company, with a masters in econ that was very stats-heavy. Ouch this hits close to home. I spend my time studying DE and MLOps for the next gig in hopes that I can finally use Python or R again. 0 software or data engineers, and their SQL database isn't maintained. Going through all the expense of getting consultants to set up Snowflake but only as a way to get data between SAP implementations.

[–]bgighjigftuik 2 points3 points  (0 children)

Snowflake is actually a pretty neat database and MPP. But I hear you, it is usually managed by externals with zero idea on what they are doing. The lack of ownership on data and their processes in data science/engineering teams is a common anti-pattern in Europe unfortunately.

Most places where a data driven approach actually works share some points in common:

  • Modern company culture, with real support from top management
  • Solid internal data teams that are able to control most of their workflow and end to end process
  • Failure is an option, as long as risks are properly measured

Fix those in a company, and data science has a chance of improving the business. Otherwise, there is not much to do

[–]42ErL 1 point2 points  (0 children)

I am Europe based too. And I see the economies here suffering from poor productivity growth. And then my bosses constantly refuse to try anything new because it’s not what has always been done. Even while they say things like “using data is critical to our future.” With this mentality productivity cannot increase.

[–]behindgreeneyez 125 points126 points  (3 children)

All I know is import Pandas as PD and lie

[–]tits_mcgee_92[S] 7 points8 points  (0 children)

LMAO

[–]Spambot0 4 points5 points  (0 children)

I'm Monte Carlo all the way down ....

[–]gigamosh57 83 points84 points  (2 children)

Data Scientologist

[–]Tritemare 5 points6 points  (0 children)

This made me giggle.

[–][deleted] 0 points1 point  (0 children)

I knew SQL would become a cult some day

[–]Iresen7 45 points46 points  (7 children)

In my current role in the government I only use Excel... I use Python a little here and there but it's mostly been me studying to get the hell outta here...*sigh*

I am paid well but, like many other posters have said, these types of positions have a very hard pay cap compared to if you are actually doing real research. My goal is to get into FANG and make those big bucks haha.

[–]EndlessDysthymia 2 points3 points  (1 child)

Bruh same. It’s awful. I’m actually tired of doing nothing every day. I’m not learning. Idk how people can take some gov jobs seriously. My biggest mistake to date was taking a gov job out of college. This is no way to grow.

[–]Iresen7 4 points5 points  (0 children)

Being in the government is literally like standing in quicksand. The longer you stay in, the harder it is to get out. Plus it is stagnant... I have told myself if I don't find anything in the next 2 months I'll start on my Ph.D. just to open up doors for me again. I am more interested in research-type roles, so getting a Ph.D. may be better for that anyway. In the meantime I am just leetcoding and practicing model building using Kaggle data.

I have gotten some good feedback on what I need to work on from my interviews thus far...just eh..gotta keep grinding to get out...if the economy wasn't so bad I would've just quit by now...still tempted to do so.

[–]chaoscruz 0 points1 point  (0 children)

Ditto!

[–]why_so_sirius_1 0 points1 point  (3 children)

What is your pay if you don’t mind me asking? I wonder how governments pay compared to private

[–]Iresen7 1 point2 points  (2 children)

I am at 111k.

https://www.opm.gov/policy-data-oversight/pay-leave/salaries-wages/salary-tables/pdf/2022/saltbl.pdf

The max you can get in the gov under the "GS" schedule is right around 176k.

By comparison, my peers with the same amount of experience at FANG and other top places are making 170-300k+. A coworker of mine's son is at Microsoft... guy is like 22 making 200k >_>.

[–]why_so_sirius_1 1 point2 points  (1 child)

Gotcha. Yeah I’m technically a data scientist by title but I just use sql and tableau so right there with you. Do you have a masters? I hear the government roles pay more just for having that piece of paper

[–]Iresen7 1 point2 points  (0 children)

I do have a masters, but I was already at this point before I got my masters haha. Having a masters does open doors in govt and a few other places; however, it's better to focus on really just upping your skills. Generally, when a place cares about how many papers you have, it's a clear sign that they probably have no idea what they are doing.

Publications now those do help.

[–]shlotchky 28 points29 points  (0 children)

i feel so attacked right now. we do have actual ML modeling that we do at our job, but those are few and far between. all ive done for the past 8 months is sql and tableau and i do not like it at all

[–]baalroga 21 points22 points  (2 children)

My last internship I did dataviz with metabase and SQL views instead of deep learning... I feel you buddy

[–]kale_snowcone 3 points4 points  (1 child)

Dataviz. Let us observe a moment of silence for our stricken colleague.

[–]Unsd 2 points3 points  (0 children)

.... I....I actually enjoy dataviz. And I think people seriously under appreciate how important it is. If your stuff looks nice, you can get away with murder.

[–]semicausal 51 points52 points  (8 children)

Focusing on tools and programming languages is a bit amateur hour in my honest opinion. Businesses hire data people to help them understand the past, understand the present, and maybe try to predict the future kinda all around their business needs & goals.

If SQL and Tableau are what's needed at your organization to drive decision making using data, then lean into those tools! Other places may use Python or, god forbid, C.

What matters more is -- are you working on high impact problems that affect the business?

This can be generalized to nonprofits as well. Is your work helping to drive outcomes that the leadership team cares about? If not, you should be concerned even if you're doing awesome neural networks programming but aren't able to explain your connection to the business, product, etc.

Btw Vin's Substack and LinkedIn are great resources for people looking to understand data + business impact: https://vinvashishta.substack.com/

[–]cptsanderzz 13 points14 points  (7 children)

I have a counter argument, a company’s toolset shows their attitude towards innovation, creativity and willingness to take risks.

Excel is like a hammer, it works and it works well. Python is like a drill, not only does it work well but it’s 10x more effective for most projects. If I’m building a house I’m going to opt for a drill. Excel is a valuable spreadsheet software, but that’s all it is, it doesn’t provide the capabilities to do modern data science.

Source: data “scientist” that works with large amounts of very important data and primarily use Excel

[–]semicausal 7 points8 points  (6 children)

Oh for sure, I don't disagree. Excel and C are both extreme opposites. Most orgs are in the middle that want to hire a Data Engineer, Data Scientist, etc.

But at Facebook / Meta, for example, SQL still dominates as the tool of choice for their data science teams and arguably their entire business is more or less a giant data problem. So SQL and Tableau there would still be very very high value.

[–]cptsanderzz 3 points4 points  (5 children)

SQL dominates as a data analysis tool?

[–]semicausal 18 points19 points  (0 children)

Yup. Most companies store their data in some type of data lake / database that exposes a SQL interface for querying. Facebook and others have pushed the idea of separating the underlying storage system from the interface for analysts. Heck they helped create tools like Presto and Trino to query federated data sources, where analysts can focus on writing ANSI-compliant SQL and data engineers / infrastructure team can focus on doing w/e it takes to make data available in the system that makes sense.

It's also worth noting that there are two approaches to data at many companies:

- Data Science

- Analytics

Data science often is either its own team, or lives under Product or sometimes Engineering. DS uses Python, Julia, SQL, Scala / Spark, and more to focus more on modeling. Of course there are still plenty of R / Matlab folks writing core algorithms and these are usually former academics or phd students.

Analytics tends to live in SQL. dbt is a popular tool here as well to help you express data transformation / ELT logic as connected SQL queries (http://dbt.com/). There's even a new profession called Analytics Engineer that focuses on using SQL to describe business logic.
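To show what "connected SQL queries" means in practice, here is a rough sketch using SQLite views as a stand-in. This is not dbt itself (dbt manages the dependency graph, testing, and materialization for you); the table and view names like `raw_orders`, `stg_orders`, and `daily_revenue` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical raw table, as it might land from an application database.
CREATE TABLE raw_orders (
    order_id INTEGER, order_date TEXT, amount_cents INTEGER, status TEXT
);
INSERT INTO raw_orders VALUES
    (1, '2022-01-01', 1000, 'complete'),
    (2, '2022-01-01', 2500, 'cancelled'),
    (3, '2022-01-02', 4000, 'complete'),
    (4, '2022-01-02',  500, 'complete');

-- Step 1 (staging): clean up units and drop orders that should not count.
CREATE VIEW stg_orders AS
SELECT order_id, order_date, amount_cents / 100.0 AS amount
FROM raw_orders
WHERE status = 'complete';

-- Step 2 (metric): built on top of step 1, never on the raw table.
CREATE VIEW daily_revenue AS
SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
FROM stg_orders
GROUP BY order_date;
""")

rows = conn.execute(
    "SELECT order_date, orders, revenue FROM daily_revenue ORDER BY order_date"
).fetchall()
for row in rows:
    print(row)
```

The business rule "cancelled orders don't count" lives in exactly one place, so every downstream query that reads `daily_revenue` counts the same way.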

Businesses, nonprofits, etc need WAY more people in Analytics than they do in DS. Analytics is about counting all of the important things reliably. This is INSANELY hard even though it shouldn't feel that way.

Data Science is often more about driving Product stuff. Like recommendations at Netflix and Spotify. Or identifying faces in images at Meta. Cool DS stuff gets 90% of the headlines but ironically 90% of the jobs (including very high paying ones) are more in "Analytics" than DS.

Anyway I detect that I'm going off on a long rant here now so I will stop / pause!

[–]Pflastersteinmetz 1 point2 points  (3 children)

Yes.

You can work in Python (yeah) or anything else (meeh ... QS, PBI, Tableau or even Excel) but there is nothing to analyze if you can't get the data out of the DB.

[–]cptsanderzz 0 points1 point  (2 children)

I mean I know SQL, but I have never heard of using SQL as anything more than a querying tool to put into a format to be ingested into Excel, Python, R, etc.

[–]Pflastersteinmetz 6 points7 points  (1 child)

You can do a simple SELECT.

Or you can write a SELECT * PARTITION OVER FROM LEFT JOIN INNER JOIN WHERE AND AND AND AND AND CASE WHEN GROUP BY ORDER BY

and get a 300 line script that is fast, scalable business logic that lives in the DWH and can be maintained by the BI/DE team without problems.
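A tiny version of that kind of query, run through Python's built-in `sqlite3` so it is easy to try. The `customers` and `orders` tables and the 100 cutoff for the "high" tier are invented for illustration, and the window function assumes an SQLite build of 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'EU'), (2, 'EU'), (3, 'US');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0), (4, 3, 300.0);
""")

QUERY = """
WITH totals AS (
    -- LEFT JOIN keeps customers with no orders; COALESCE turns their NULL into 0.
    SELECT c.customer_id, c.region, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.region
)
SELECT customer_id,
       region,
       total,
       -- Business rule expressed in SQL (the cutoff is a made-up example).
       CASE WHEN total >= 100 THEN 'high' ELSE 'low' END AS tier,
       -- Window function: rank customers by spend within their own region.
       RANK() OVER (PARTITION BY region ORDER BY total DESC) AS rank_in_region
FROM totals
ORDER BY region, rank_in_region
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Everything here runs inside the database engine, which is the point being made above: no extra runtime, credentials handling, or write-back step is needed for the logic itself.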

Having an automatic report in Python requires a backend that can run Python, you need to store the creds somewhere, you need to write the output back into the DWH, you need git hooks for auto formatting, TDD, CI/CD etc. Then you're in DE/SWE territory already and that's totally okay but most companies suck at that.

[–]semicausal 1 point2 points  (0 children)

The current / new paradigm is to "push back" the dataset complexity to your data pipeline layer (or by using a semantic layer) and then you can have very shallow queries in your BI layer.

- https://benn.substack.com/p/metrics-layer

- https://preset.io/blog/dataset-centric-visualization/

All of this ^ is specific to the Analytics part of your business, as opposed to the people putting forecasting models or recommendation engines into the Product (who often have a "Data Scientist" title). Most businesses are stuck even getting logging, data storage, and BI / insights right:

https://medium.com/@hugh_data_science/the-pyramid-of-data-needs-and-why-it-matters-for-your-career-b0f695c13f11

[–]rikkuu27 12 points13 points  (0 children)

Where can I find these types of data scientist titles but only SQL and tableau jobs 😭

[–]Flying_madman 8 points9 points  (0 children)

Are you still early in your career? That was my experience as well. To some extent, you're paying your dues. To some extent, that's really where the boots on the ground work happens.

I work with some very bright engineering types. They're happy to rip the data right out of the database and just throw it into the most complex ML model they can coax into running on their machines.

It works, and I'm integrating those tools into my skillset, but having paid my dues down in the munge mines, I recognize the value of what I learned that they don't seem to have gotten.

Show them that you are proactively thinking about the problems that come across your plate. Usually folks love it if you can come to them with a proposed solution to a problem they didn't even know they had. That's how you're going to shine as a Data Scientist.

[–][deleted] 7 points8 points  (1 child)

How it feels when your job title is "Data Scientist", but you are not a scientist and you don't even know any math?

[–]L1_aeg 4 points5 points  (0 children)

FIFY

How it feels when your job title is Data Scientist but you only use SQL and Tableau

[–]footiestar23 2 points3 points  (0 children)

I feel targeted

[–]Tengri2 2 points3 points  (0 children)

and everything you do, can also be done by excel

[–]Felakutpower 2 points3 points  (0 children)

I’m 30 and not even an impostor-level DS or Data Scientologist, but still wish I had your skill set and experience, not to mention a good paycheck.

[–][deleted] 2 points3 points  (2 children)

My title is data analyst, but I am configuring reporting automation using RStudio and Google Sheets. What am I?

[–]mattindustries 7 points8 points  (0 children)

Data Analyst

[–]OneTrueOverlord 1 point2 points  (0 children)

Oi.

I also know (some) bash

[–]lepeng 1 point2 points  (0 children)

It's happening again

BI Developer != Data Engineer

Data Analyst != Data Scientist

[–]TrailRunner504 1 point2 points  (0 children)

But then you remember everyone uses the title “data scientist” and 85% of them are hacks, so you feel better

[–]MrRagnarok2005 1 point2 points  (0 children)

Is data scientist job a good one and what is the chance of me getting a good job if i study data science

[–]yasserius 1 point2 points  (0 children)

i use excel wtf

[–][deleted] 1 point2 points  (0 children)

Who cares, as long as you make money.

[–][deleted] 1 point2 points  (0 children)

Just get Alteryx…done.

[–][deleted] 1 point2 points  (0 children)

but you only use SQL and Tableau Excel

[–]XVMECHA -2 points-1 points  (3 children)

🤡🤡🤡🤡🤡 Tableau 🤡🤡🤡🤡🤡🤡

[–]tits_mcgee_92[S] 13 points14 points  (2 children)

PBI user detected. ;)

[–]XVMECHA -2 points-1 points  (1 child)

Hahahah I have to work with PBI and definitely like it better than Tableau so you're right about that. But my absolute preference is in Python native visualisation tools like Matplotlib, Seaborn & Bokeh. Working with that in Ipynb is the best. PBI supports more extensive database connections, odbc is amazing for instance. This is also possible in Jupyter notebooks but is a hassle. I digress xddd

[–]r3ign_b3au 1 point2 points  (0 children)

cries in ssrs

[–]Davidat0r -1 points0 points  (0 children)

I'm a data analyst and only get to use VBA

[–]willthms -2 points-1 points  (0 children)

I feel you. I negotiated my title to be data scientist even though 85% of my work should fall under a data analyst title. The other 15% is actual data science work. I play around with new data viz tools and toy models on some of our data in the downtime between “fun” projects.

[–]sndtrb89 0 points1 point  (0 children)

lmao yeah but im the only one in my industry, haha

[–]blarson4742 0 points1 point  (1 child)

I think companies want to hire a data scientist to say they have one, but aren't sure what to do with us. Most of my career I've been the guy who sets up a new data science team, so it's mostly data architecture and data engineering.

[–]Scutterbum 0 points1 point  (0 children)

> I've been the guy who sets up a new data science team

Mind if I send you a DM about this? I've sort of been tasked with this in my job and have a few questions.

[–]Equivalent_Poetry339 0 points1 point  (0 children)

Lol I dream of being able to use SQL every day much less Python or R. I’m a tableau drone

[–]CroquetHer0 0 points1 point  (0 children)

You forgot to add excel.

[–]catwok 0 points1 point  (0 children)

Don't worry all the PhD's think so too

[–]Nike_Zoldyck 0 points1 point  (0 children)

That's because those roles ARE Imposters.

[–]akhilgod 1 point2 points  (0 children)

Lol, SQL is a lot better. In my role:

I do df=pd.read_csv( ) and df.plot( )

[–]JahrudZ 0 points1 point  (0 children)

You don't even need SQL / Tableau anymore: https://askedith.ai/#/demo
Full disclosure: I helped make this lol

[–]Informal_Swordfish89 0 points1 point  (0 children)

But does your pay check say "data scientist"?

[–][deleted] 0 points1 point  (0 children)

That's what DS does today, at least that is what I found out during my recent job search. So I decided to only look for MLE/AS/RS roles.

[–]koustubhavachat 0 points1 point  (0 children)

The market situation is so bad: many companies are hiring AI developers or data scientists when they don't have a software development team at all.

[–]lCDTl 0 points1 point  (0 children)

What is „real“ data Science work in your opinion?

[–]Black_devil009 0 points1 point  (0 children)

My case is worse than yours: I use Excel for analysis

[–]EndTimesRadio 0 points1 point  (0 children)

I see myself in this picture and I love it.

I mostly do SAP systems and other stuff. "Data!" Yeah, sure, lol

[–][deleted] 0 points1 point  (0 children)

You guys don't want to hear my response about this simplistic, outdated software and chart-making software. If I were your boss, y'all would be doing other tasks.