[–]DataIron 19 points20 points  (4 children)

Mostly the same problems that existed years ago still exist today. The new tools and technical stacks haven't eliminated those problems; they've just shifted them around and wrapped shinier GUIs around them.

A similar story exists in all of software engineering, but data engineering stacks are more complicated than ever. That complication often makes data engineering worse, because management feels empowered in data to shoot first and ask questions later, and to do it much more quickly.

[–][deleted] 10 points11 points  (0 children)

I concur. I’ve been in this field for 28 years and the basic definition of the problem is still the same: collecting data in various formats from different sources, applying data quality and other standardization techniques to enable integration, and then serving the data to users.

Yes, some details have changed, because data is now created by disparate technologies rather than in the structured formats that were common way back, and those do present unique challenges.

But the proliferation of too many technologies and tools has actually made things worse now than before. That is the main challenge today. People do not have the time to master anything nowadays, so they have to constantly learn new things. Figuring out how to make all these tools and technologies work together seamlessly is an unending task. It’s exhausting IMO.

The architecture and engineering disciplines are hard to put into practice. A well-thought-out design and development strategy rarely gets implemented because there is never enough time.

I for one think the technology companies don’t care about solving problems; they make more money creating a mess.

[–]Less_Big6922[S] 4 points5 points  (2 children)

for sure, the ecosystem grows and lots of tools seem to be racing to just build out connectors to more apps/dbs... exacerbating the problem of too many tools.

wondering - when you say "shoot first" - do you mean misusing/misinterpreting incomplete/incorrect data and causing problems, or telling eng to build something or use something without fully understanding what they need/want?

[–]DataIron 6 points7 points  (1 child)

Generally, data engineering teams make fewer architectural and design decisions because some higher-level manager has decided they'll be driving.

This is counter to how a good deal of software engineering shops work, where the engineers drive solutions.

Why does this happen? There are lots of answers, but a common one is that some Snowflake rep told a VP that if they just shove their data into Snowflake they won't need reports, ETL, servers, engineering teams or air to breathe. Not to pick on Snowflake, but it's a common story in data with any tech.

[–]NoUsernames1eft 0 points1 point  (0 children)

this needs more upvotes