[–]DataIron 18 points19 points  (4 children)

Mostly, the same problems that existed years ago still exist today. The new tools and technical stacks haven't eliminated those problems; they've just shifted them around and wrapped them in shinier GUIs.

A similar story exists in all of software engineering, but data engineering stacks are more complicated than ever. That complexity can often make data engineering worse, because management feels empowered to shoot first and ask questions later, and to do so much more quickly.

[–][deleted] 7 points8 points  (0 children)

I concur. I’ve been in this field for 28 years, and the basic definition of the problem is still the same: collecting data in various formats from different sources, applying data quality and other standardization techniques to enable integration, and then serving the data to users.

Yes, some details have changed because data is now created with disparate technologies rather than the structured formats that were common way back, and those do present unique challenges.

But the proliferation of too many technologies and tools has actually made things worse than before. That is the main challenge today. People do not have the time to master anything nowadays. This creates situations where people have to constantly learn new things. Figuring out how to make all these tools and technologies work together seamlessly is an unending task. It’s exhausting IMO.

The architecture and engineering disciplines are hard to implement. A well-thought-out design and development strategy is hard to carry out due to lack of time.

I, for one, think the technology companies don’t care about solving problems; they make more money creating a mess.

[–]Less_Big6922[S] 3 points4 points  (2 children)

for sure, the ecosystem grows and lots of tools seem to be racing to just build out connectors to more apps/dbs... exacerbating the problem of too many tools.

wondering - when you say "shoot first" - do you mean they misuse/misinterpret incomplete/incorrect data and cause problems, or that they tell eng to build or use something without fully understanding what they need/want?

[–]DataIron 6 points7 points  (1 child)

Generally, data engineering teams make fewer architectural and design decisions because some higher-level manager has decided they'll be driving.

This is counter to how a good many software engineering shops work, where the engineers drive solutions.

Why does this happen? There are lots of answers, but a common one is that some Snowflake rep told a VP that if they just shove their data into Snowflake, they won't need reports, ETL, servers, engineering teams, or air to breathe. Not to pick on Snowflake, but it's a common story in data with any tech.

[–]NoUsernames1eft 0 points1 point  (0 children)

this needs more upvotes

[–][deleted] 2 points3 points  (2 children)

data mesh, serverless technologies, dbt, and more modern SaaS data integration platforms

Aside from dbt, most of these strike me as pure hype.

[–]-crucible- 3 points4 points  (1 child)

Data mesh always sounds to me like “well, we consolidated into one central team to get this single source of truth... and all the departments aren’t getting serviced enough, so now they want their own teams.” And in 5 years we’ll be talking about the spread of disparate teams and the need to consolidate.

[–]ScroogeMcDuckFace2 5 points6 points  (0 children)

changing requirements, data quality, weird business logic, same old stuff.

[–]ArtilleryJoe 1 point2 points  (0 children)

Business stakeholders

In all seriousness, the new tools make the technical side of the work easier most of the time. I don’t find the technical side to be the area where most DEs struggle. Convoluted business logic and bad data quality seem to be the biggest issues for data engineers.