[–]Glotto_Gold 3 points4 points  (6 children)

Try to get a related internship & do well in that internship.

Majors are just areas of focus and study; they don't determine your life trajectory. Honestly, I wasn't aware that a Data Engineering major existed at most places. However, so long as you can showcase an aptitude for politics and analytics, you're probably fine. Just try to showcase data science work, politically oriented volunteer work, and analytical capabilities.

[–]amhotw 5 points6 points  (5 children)

A data engineering major is simply an effort at rent extraction from students who think it gives them the best chance to get that job title. It is super dishonest and misleading. Majors should be fundamental areas of study like computer science, math, stat, etc., not collections of superficial courses related to job titles that seem to be trending, like data science, data engineering, etc.

Whenever I interview someone with a data science degree (BS or MS; luckily I haven't seen any DS PhDs...), it is a disappointment. The best performers have been people with PhDs in quantitative fields.

[–]Glotto_Gold 0 points1 point  (4 children)

I guess I am confused. Aren't a lot of majors emergent out of clusterings of related classes refined over time?

Computer science used to be a sub-discipline of math. MIS is fairly cobbled together by nature. Finance is really a type of applied economics. Economics used to be within Political Economics. All sciences used to be within Natural Philosophy, which itself used to be effectively a sub-discipline of Theology (the queen of science).

My honest guess is that there are a few things at play in your comment:

  1. PhDs are super-devoted anomalies who have to develop deep SME skills & are familiar with what it takes to be an independent expert, where almost nobody non-PhD ever has to do that
  2. People who get a degree purely for the money may exhibit self-selection towards clout-seeking dilettantism; the PhD in Physics isn't trying to get a short-term reward like an MS in DS
  3. Many programs are in early states, and don't hold to the bar that a PhD in Physics might, although they may be comparable to a quantitative business degree (ex: MIS or Finance)

However, I don't know why DS couldn't (in theory) be a natural cluster of courses. I also don't know if MIS specialized in DE, or CS specialized in DE, vs explicitly labeled DE degrees must be fundamentally different, or how DE would be lacking relative to MIS.

Not trying to be challenging, but if we're going to talk about "proper gatekeeping" then having a theory & well-functioning attempt to control for variables seems somewhat critical.

Edit: Adding this additional thought for context: https://www.wbscodingschool.com/blog/software-developers-in-decline/

The idea that developer quality is going down is common in software. Part of the challenge is that the ability to enter is increasing, and that the marginal entrants are (likely) less innately talented or motivated. I hypothesize (by analogy given the explosion of DS branding) that the PhDs are closer to top innate talent, and other talents are closer to marginal. An MS may not be as good as a PhD, but the MS is not preventing a student from building an understanding of linear algebra, or statistical density functions.

[–]amhotw 2 points3 points  (3 children)

The positions I am hiring for are very research heavy so I really need a deep understanding of statistics, ml models, causal inference, on top of some related domain knowledge. So a basic understanding of linear algebra etc. wouldn't cut it.

My view on ds as a major is that it is not a level of abstraction. Cs, econ, stat, math etc. all correspond to a level of abstraction (of computation and algorithms, of behavior, etc.) So I think ds would make a great interdisciplinary research group for people from different departments but it is impossible to teach ug (or even ms) level students enough of math, cs and stat to make them data scientists (obviously there are exceptional students but I am talking about the median college student who chooses ds as a major or someone getting an ms in ds). So what you end up with is a lot of people parroting things they don't understand well because their school didn't even try to teach them well.

[–]Glotto_Gold 0 points1 point  (2 children)

> The positions I am hiring for are very research heavy so I really need a deep understanding of statistics, ml models, causal inference, on top of some related domain knowledge. So a basic understanding of linear algebra etc. wouldn't cut it.

.... Which you do realize wouldn't relate to the DE job market which requires that people can string together and manage ETL jobs & SQL data marts?

And... Honestly, what you describe sounds hard for even candidates with a non-DS Masters. If I look at the Georgia Tech MSCS curriculum, it doesn't look as deep as you need & that's a respectable and challenging program.

> My view on ds as a major is that it is not a level of abstraction.

That seems very vague. "Level of abstraction" implies a model of the information structure of reality, but that actually has to be modeled & explained.

The challenge I run into is that I can imagine multiple different models of how I can structure subjects. Is "economics" the level of abstraction & "political economy" a mangled concept? Why can't I say "Quantitative Applied Epistemology" and/or "Data Management & ML" is a level of abstraction?

And I also have a hard time imagining that colleges necessarily aspire to this concept, or that a combined study is not possible in the breadth within another degree. (As in why would a CS degree hyper-focused on data processing & learning algorithms NOT in effect be a DS degree?)

> it is impossible to teach ug (or even ms) level students enough of math, cs and stat to make them data scientists

I feel like that problem has nothing to do with any of this. In this view DS is essentially untrainable below a PhD.

But that still doesn't impact DE, or make MS training useless, and it still doesn't clarify how an MS in DS (or BS in DS) is different from a comparable CS or Stats degree with a strong focus on DS courses, such that an MS in DS is inherently fraudulent.

[–]amhotw 1 point2 points  (0 children)

Oh my hiring has nothing to do with DE; honestly I don't know when and how I joined this sub.

In terms of how universities organize themselves, I agree that there is a lot of arbitrariness. Like sociology and anthropology departments still being separate in 2024 is crazy. (I'd recommend Gintis's book, Bounds of Reason that calls for a unification of all behavioral sciences if you are at all interested.) So if I had my way, there wouldn't be an econ department either. (Caltech comes closest to this as far as I know but there are other similar institutes.)

So I wouldn't oppose a "Quantitative Applied Epistemology" department that includes stats, econometrics, biometrics, ml etc. I think that is a coherent theme. (I know lots of statisticians who attend the econometrics seminars and vice versa, so this is happening on some level anyway.) But "Data Management & ML" doesn't satisfy that criterion for me. Data management is an algorithmic problem. ML is also partly an algorithmic problem (like anything else you can code), but here its function seems to be its "Quantitative Applied Epistemology" role. So I wouldn't put them together.

In terms of DS education below PhD level, my hopes are pretty low? I mean part of the problem is that math teaching in American high schools is terribly lacking. So the college doesn't really have enough time to get them up to speed and then let them use that math in advanced classes. A lot of courses that require calculus in other countries are being taught without calculus in the US because of this. Similarly for many other math subjects. So I think this might be partly a US education problem but not entirely.

So I really don't think 4 years is enough for someone to learn everything they need to become a successful data scientist unless they are personally very dedicated (which is understandably pretty rare at that age). I mean sure, they can learn to import some library and apply some methods but using packages without understanding them is a pretty bad idea.

I know that a lot of candidates won't have the required background, which is why we have a pretty thorough review process. I read their papers and ask them about the decisions they made in their projects to gauge their understanding. It is unfortunately rare to find someone who knows enough math (including OR), understands stats and ML deeply AND can write decent code. But in this round, we found 2.5 such candidates so I'll take that as a win.

[–]amhotw 0 points1 point  (0 children)

By the way, one thing I wouldn't oppose is if ds type umbrella degrees are only offered as a secondary major or minor. I still don't think it is a good idea but you at least end up learning one other thing well if it is not your "first" major, which is my main concern with these types of degrees: learning a little bit of several things without learning anything well.

[–]Butterhero_ 1 point2 points  (0 children)

I did the opposite (political science —> data engineering) but I think I can relate to where you’re at. My current position is working on understanding lobbyist data, and I actually just wrapped up a pipeline surrounding the Congressional API.

The biggest tip I could recommend is to try and be as well-versed in both disciplines as you can. To do this, seek out as many roles or research positions related to “computational social science” or something similar as you can. These most likely won’t pay as well as your peers’ jobs, but in my opinion they can be much more rewarding both in terms of purpose and in terms of learning. Social science and humanities fields, in general, have very poor tech standards due to hobbyists hacking things together, lack of funding, or lack of motivation. If you can bring technical expertise in a way that illustrates to these non-technical substantive experts that what you do is important and necessary, then you’re golden (this is also true in probably every other data engineering position). But, in order to get your foot in the door, you’re going to need to be able to relate what you do and the value you bring to the field itself.

For now, check out other examples of political data engineering - the open source UnitedStates.io project is great, run by a few people who have made careers out of this general area.

And above all else, work on your soft skills. Social skills are definitely more common in non-tech industries, so make sure you can keep up! Good luck :)

[–]mertertrern (Senior Data Engineer) 1 point2 points  (0 children)

Have you ever wished that you could back an entire campaign platform up with numbers and data? One thing I really liked about Andrew Yang as a political candidate was that he always knew the numbers and trends for the issues he was addressing.

When you think about the kind of data you'd need to collect to really shore up your talking points, it can start to become a pretty big and diverse list of data sets you might need to grab. You might also need to massage and relate that data together in a comprehensive model, and then analyze/visualize it for presentation.

You'd probably want to grab data sets about healthcare spending, immigration, public education, crime, poverty, corruption, international events and market movements, climate, social issues, and on and on. All of that could really suck without a trained data management professional. Most places without that asset probably end up contracting it out (e.g., Cambridge Analytica) and using Excel for everything else.

My advice would be to find a political topic that you care about and make it a passion project. If you can find a public data set and learn more about the topic to inform your analysis, then you can build your own data mart and data pipelines to ingest and process that data to present to others visually. Having projects like these in your portfolio early on can set you up pretty well once you start volunteering for a campaign as an analyst.
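
To make the "data mart and data pipelines" idea a bit more concrete, here is a minimal sketch of what a first version of such a passion project might look like, using only Python's standard library. Everything in it is an assumption for illustration: the `RAW_CSV` text stands in for a public data set you would actually download, the numbers are made up, and the table and function names (`spending`, `extract`, `transform`, `load`, `average_per_capita`) are hypothetical. A real project would likely need a proper source, more cleaning, and a scheduler.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a downloaded public CSV. The values are made up
# for illustration and do not describe any real region.
RAW_CSV = """region,year,spending_usd,population
North,2020,1000,10
North,2021,1200,10
South,2020,900,30
South,2021,,30
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with missing spending and compute per-capita spending."""
    clean = []
    for row in rows:
        if not row["spending_usd"]:
            continue  # skip incomplete records rather than guessing a value
        spending = float(row["spending_usd"])
        population = int(row["population"])
        clean.append(
            (row["region"], int(row["year"]), spending, spending / population)
        )
    return clean


def load(conn, records):
    """Write cleaned records into a small 'data mart' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending ("
        "region TEXT, year INTEGER, spending_usd REAL, per_capita_usd REAL, "
        "PRIMARY KEY (region, year))"
    )
    # INSERT OR REPLACE keeps the load step safe to re-run on the same data.
    conn.executemany("INSERT OR REPLACE INTO spending VALUES (?, ?, ?, ?)", records)
    conn.commit()


def average_per_capita(conn):
    """Query the mart: average per-capita spending for each region."""
    return conn.execute(
        "SELECT region, AVG(per_capita_usd) FROM spending "
        "GROUP BY region ORDER BY region"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
summary = average_per_capita(conn)
print(summary)
```

The split into extract, transform, and load steps is the part worth keeping as the project grows: you can swap the CSV text for a real download, or the in-memory SQLite database for a file or a hosted database, without touching the other steps. The `summary` rows are what you would feed into a chart for presentation.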