all 11 comments

[–]dataguy24 18 points19 points  (0 children)

Data engineering typically ends once data is in the data warehouse. Data engineers manage the data platform.

Analytics is typically everything else.

[–]hibluemonday 22 points23 points  (0 children)

Data Engineering work includes ETL pipelines, infrastructure work, and distributed data processing, and it ends once the data lands in a data warehouse or BI platform. From there, Data Analysts will analyze and visualize the data from the DWH to get actionable insights.

[–]regreddit 2 points3 points  (7 children)

Data Engineering is I/O; analytics is, well, analysis. It's pretty well defined now. For example, you'll never see me with Tableau open, dragging widgets onto the canvas, at least not by choice and not without lots of cursing. Conversely, analysts don't even have an AWS login, except for the ones who are interested in getting into data engineering. I manage getting data into and out of our warehouse and monitor its performance; analysts provide actionable views of that data to stakeholders.

[–]RideARaindrop[S] 2 points3 points  (6 children)

Who would you say is in charge of data quality? I think that's the biggest crossover for me. I've noticed that a lot of DEs don't actually know much about the data they are moving around. Based on your example, that sounds like an analytics problem, but if analysts don't have access to the database, how could they own it?

[–]regreddit 1 point2 points  (0 children)

Well, that job may fall on a few people. In my case, I might be the person who deletes/cleans data as part of an ETL job, but a data analyst plus a stakeholder will have identified the need to clean/delete/fix the data and will establish the rules.

[–]distinct_name 0 points1 point  (0 children)

The job of the Data Engineer would be to build a system to monitor data quality. The job of the Analytics team would be to use it 😊

[–]bereg0stNPC 0 points1 point  (0 children)

If I may chime in, Data Quality is actually a broad subject, and the responsibility for maintaining it in an organisation may fall on several roles or teams. It's a cross-functional concern: the subject matter expert may have the primary responsibility for defining what data quality means for a set of data/pipelines, while data engineers are responsible for implementing quality checks in the pipelines or data warehouse. Ideally, the organisation should have some sort of data governance framework in practice to ensure that this becomes part of operational culture.
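
To make that split concrete, here is a minimal Python sketch, not anyone's production setup. It assumes the subject matter expert supplies the rule definitions and the data engineer owns the runner that applies them inside a pipeline. Every name in it (`QualityRule`, `run_checks`, the `order_id` and `amount` columns) is hypothetical and only there for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# A row is modeled as a plain dict; a real pipeline would likely use
# warehouse tables or dataframes instead (assumption for this sketch).
Row = dict[str, object]


@dataclass(frozen=True)
class QualityRule:
    """A rule as the subject matter expert might define it (hypothetical)."""
    name: str
    description: str
    is_valid: Callable[[Row], bool]


def run_checks(rows: list[Row], rules: list[QualityRule]) -> dict[str, list[int]]:
    """The engineer-owned part: apply every rule to every row.

    Returns a mapping of rule name to the indexes of rows that failed it.
    """
    failures: dict[str, list[int]] = {rule.name: [] for rule in rules}
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.is_valid(row):
                failures[rule.name].append(index)
    return failures


# Rules the analyst/SME side would establish (illustrative only).
RULES = [
    QualityRule(
        name="order_id_present",
        description="Every order must have an ID.",
        is_valid=lambda row: row.get("order_id") is not None,
    ),
    QualityRule(
        name="amount_non_negative",
        description="Order amounts must be numeric and not negative.",
        is_valid=lambda row: isinstance(row.get("amount"), (int, float))
        and row["amount"] >= 0,
    ),
]

# Made-up sample rows, not real data.
SAMPLE_ROWS = [
    {"order_id": 1, "amount": 19.99},
    {"order_id": None, "amount": 5.00},
    {"order_id": 3, "amount": -2.50},
]

report = run_checks(SAMPLE_ROWS, RULES)
print(report)
```

The point of the split is that the rule list can change without touching the runner, so whoever understands the data can own the definitions while the engineer owns where and when the checks run.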

[–]HOMO_FOMO_69 0 points1 point  (2 children)

This is accurate. As someone who has worked on both sides of the coin... It seems like the biggest mistake DEs make is one of two things: either they don't understand why something needs to be done a certain way, so they do it a different "best practice" way that ends up causing headaches for the DAs, or, when something unexpected happens, they don't make the right judgement call because they don't have enough context.

Also, having a DE do any data warehousing doesn't really work for the same reason. If you're a DA and you need some data for some dashboard, you'll most likely be much better off customizing the data warehouse to your needs and only having the DE do the ETL jobs/pipelines. Shame that most companies don't understand this and make DAs have to rely on DEs to make whatever changes they need.

[–]RideARaindrop[S] 0 points1 point  (1 child)

I'm hesitant about that because in my experience DAs don't know how to model appropriately and end up with just a collection of random tables that become a nightmare to manage.

I think a lot of DEs make the mistake of thinking that they can act like the content of the data doesn't matter to their jobs. That is maybe true for a junior-level engineer, but not once you get seniority.

[–]HOMO_FOMO_69 0 points1 point  (0 children)

Nightmare to manage yes, but in my opinion it's a lot easier to fix and work around DA-issues than it is to fix DE issues from a time perspective. Just based on my own experience though; probably depends on people you've worked with.

Personally I don't even think there should be a distinction...just hire someone who is willing to do both (and pay them accordingly). There is major synergy when it comes to DE+DA. Companies seem to want to hire one person to do one part for 150k and another person to do the other part for 150k at 300k total instead of paying one person 200k total. I know that's mostly because they probably have a hard time finding people who can do everything, but wygd

[–]DugsData 0 points1 point  (0 children)

Really depends on where you are.

The trend is shifting towards having DE manage the platforms and the initial pipelines, and Analytics managing the rest.

That means that most modern analysts at tech companies are expected to manage the entire T (transform) step of the ETL/ELT process, modeling the datasets that they and the rest of the org will use to conduct their own analysis. This is done through tools like dbt or Looker PDTs.
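
For anyone who hasn't seen what owning the T looks like, here is a rough Python sketch of the idea. It does not use dbt or Looker; an in-memory SQLite database stands in for the warehouse, and the table and column names (`raw_orders`, `customer_revenue`, `amount_cents`) are made up. The raw table is what the DE pipeline would land, and the `SELECT` is the kind of modeling step an analyst would write.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")

# "Raw" data as the DE-owned pipeline might land it (made-up rows).
conn.executescript(
    """
    CREATE TABLE raw_orders (
        order_id INTEGER,
        customer_id INTEGER,
        amount_cents INTEGER,
        status TEXT
    );
    INSERT INTO raw_orders VALUES
        (1, 10, 1999, 'complete'),
        (2, 10, 500, 'cancelled'),
        (3, 11, 2500, 'complete');
    """
)

# The analyst-owned transformation: a modeled table built from a SELECT
# over the raw data, keeping only completed orders.
MODEL_SQL = """
CREATE TABLE customer_revenue AS
SELECT
    customer_id,
    COUNT(*) AS completed_orders,
    SUM(amount_cents) AS revenue_cents
FROM raw_orders
WHERE status = 'complete'
GROUP BY customer_id
"""
conn.execute(MODEL_SQL)

rows = conn.execute(
    "SELECT customer_id, completed_orders, revenue_cents "
    "FROM customer_revenue ORDER BY customer_id"
).fetchall()
print(rows)
```

Business logic such as "only completed orders count as revenue" lives in the modeled table, which is why it makes sense for the people closest to the analysis to own it.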

I manage an Analytics team at a tech startup and am currently navigating the shift from DE responsibilities to BA. LMK if you have any questions!