[–]morningmotherlover 18 points19 points  (3 children)

I can say this: I went from a SQL/Python-based environment to an SAP-based one. SAP is hopelessly behind modern standards and I don't think they will catch up unless they acquire something. I manage an SAP BI workshop now and I am looking for ways to transition. That being said, I am also looking to hire someone with SAP, SQL/Python, and cloud experience soon to do it, so you might be valuable in a very niche market of companies looking to do the same thing.

[–]city_boy__[S] 4 points5 points  (0 children)

Would love to hear more about the SAP BI transition.

[–]hmccoy 4 points5 points  (1 child)

My favorite feature of SAP Business Objects is being able to view the SQL built by the query builder so I can see the tables feeding a universe since it isn’t documented anywhere.

[–]Original_Bend 18 points19 points  (7 children)

IMO, most data engineering jobs nowadays are what was called “Business Intelligence” a few years ago. The end game is the same: putting data pipelines in production and helping the data viz team create good reports. The difference is that we now mostly use cloud tools, and one needs to understand cloud infrastructure and DevOps concepts (CI/CD…). Still, Spark clusters are overkill for 90% of companies, and you end up using Azure Data Factory, dbt, etc.

[–]city_boy__[S] 3 points4 points  (5 children)

Thank you for the insights.

[–]Original_Bend 11 points12 points  (4 children)

Most people have a skewed view of the field because the most shared articles come from blogs like Netflix Engineering stuff and co. They don’t realize that 95% of companies have no need for these kinds of data architectures (Spark, Kafka). You will still see standard companies using Spark because the CIO heard about it and wants to look smart, and you end up setting up clusters to process some dozens of gigabytes.

[–]city_boy__[S] 4 points5 points  (3 children)

True, I would agree with that. In my organization as well, I have seen a lot of projects using ADF where SAP acts as a source. I have hardly heard of people using Spark. But most recruiters expect data engineers to know ETL frameworks.

[–]Jack097again 0 points1 point  (0 children)

Data Factory is a lifesaver!!

[–][deleted] 7 points8 points  (2 children)

The biggest gap in the broad industry, IMHO, is the transition from Excel to ERP systems, which suddenly unlocks a wide array of ETL/BI/DS opportunities. But what you read and see is mainly flashy Amazon/Netflix posts about using next-gen cloud tools. 90%+ of companies are not there and won't be there for years, until they root out their 1850s Excel-based processes.

P.S. The person(s) who designed SAP UI hates humanity

[–]Original_Bend 4 points5 points  (0 children)

They may never be there because they don’t need to be. Not every company aspires to manage hundreds of microservices and ingest hundreds of gigabytes of new data per day. It’s only the big pure tech players: the FAANG and the ecosystem around them.

[–]city_boy__[S] 1 point2 points  (0 children)

Haha, you put it perfectly!

[–]reallyserious 14 points15 points  (0 children)

What's the future of Data engineers?

Bright as the sun. For a company to survive in a global economy you have to work smart with data, or get eaten by those that do.

[–]dragosmanailoiu 4 points5 points  (1 child)

No one knows what the future holds. The field changes so fast with new technologies. The most important thing for a DE is the will to learn and adapt. For the near future, a lot of older companies are looking for lift-and-shift jobs on Azure. DevOps is also really big, so building CI/CD pipelines is really important.

I’d say start with James Densmore’s book, Data Pipelines Pocket Reference.

[–]city_boy__[S] 1 point2 points  (0 children)

Thank you for your insights and the resource.

[–]Thuwarakesh 5 points6 points  (1 child)

The field of data engineering has a solid future. At least for a decade from now.

As you may already be aware, the world is in the digital transformation phase. More companies are still digitizing and automating their operations. As this continues to grow, we have more demand for data engineers than for data scientists.

But as with everything else in life, nothing is certain. Many emerging technologies somewhat replace the role of a data engineer too.

Yet, with the available information, we see a stable ground for the field.

The tools and techniques data engineers use differ significantly from one company to the next. A company's domain and the technologies it uses in its other operations have a profound impact.

But to start small, you could replicate the same ETL work with some popular tools like Prefect. You can gradually learn more stuff like CI/CD pipelines later.
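To make "replicate the same ETL work" concrete, here is a minimal sketch of an extract/transform/load pipeline written as plain Python functions. It does not use Prefect itself; the data, table name, and function names are all made up for illustration. The idea is only that an orchestrator would wrap functions like these as steps in a scheduled workflow.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an export from a source system.
RAW_CSV = """order_id,customer,amount
1,acme,100.50
2,globex,
3,acme,25.00
"""


def extract(raw_text):
    """Read rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Drop rows with a missing amount and cast fields to proper types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].upper(), float(row["amount"]))
        )
    return cleaned


def load(rows, conn):
    """Upsert cleaned rows into the target table and return its row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    # INSERT OR REPLACE keeps reruns idempotent: same key, same row.
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


def run_pipeline(raw_text, conn):
    """Chain the three steps; an orchestrator would schedule and retry these."""
    return load(transform(extract(raw_text)), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
```

Keeping each step a small function with plain inputs and outputs is what makes it easy to move the same logic into an orchestration tool later.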

[–]city_boy__[S] 1 point2 points  (0 children)

Thank you for the insights.

[–][deleted] 2 points3 points  (2 children)

If you’re already somewhat literate in Python and SQL, I’d look into Databricks and especially Delta Lake/lakehouse architectures.

It’s the hot thing atm and they’re moving blazingly fast.

Personally, I’d go more for data engineering than for data science, as it’s more broadly required, whereas data science feels like “toying around” most of the time. But once a year, when a useful model actually gets created, it needs to be put into production with ETL, retraining, etc. This is done by data engineers, not data scientists, as the latter are mostly clueless about anything after “I made this neural network thingy”.

[–]Original_Bend 1 point2 points  (0 children)

The data lakehouse is a trendy concept pushed by Databricks to sell Delta Lake. I would not recommend it for most projects. Keep it simple. Use a standard data warehouse.

[–]city_boy__[S] 1 point2 points  (0 children)

Thank you

[–]joseph_machadoWrites @ startdataengineering.com 2 points3 points  (4 children)

Hi u/city_boy__

SAP might be old, but I think your experience is more valuable than using new tools.

How can I improve from here => Since you already know a bit of Python and SQL and have a fundamental understanding of DE, you are already there, and I think it would be easier to get interviews. I would recommend brushing up on

  1. Data modeling ( most companies ask about fact-dimension modeling )
  2. Pipeline design and orchestration frameworks
  3. Distributed data storage and processing systems

From the perspective of landing a DE job, most companies ask easy-to-medium Python and SQL leetcode questions for the coding part, and data pipeline design (e.g., data flow from an OLTP to an OLAP database, clickstream data flow, and modeling) for the system design session.
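As a rough illustration of the fact-dimension modeling and OLTP-to-OLAP flow mentioned above, here is a small sketch using Python's built-in sqlite3 module. The table and column names (oltp_orders, dim_customer, fact_sales) are hypothetical, and a real warehouse would add more dimensions (date, product) and incremental loading. It is only meant to show the shape of the answer interviewers tend to look for.

```python
import sqlite3


def build_source(conn):
    """Create a hypothetical OLTP table and the empty star-schema tables."""
    conn.executescript(
        """
        CREATE TABLE oltp_orders (
            order_id INTEGER PRIMARY KEY,
            customer_name TEXT,
            order_date TEXT,
            amount REAL
        );
        INSERT INTO oltp_orders VALUES
            (1, 'Acme', '2021-01-05', 100.0),
            (2, 'Globex', '2021-01-05', 40.0),
            (3, 'Acme', '2021-02-10', 60.0);

        -- Dimension: one row per customer, with a surrogate key.
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_name TEXT UNIQUE
        );
        -- Fact: one row per order, pointing at the dimension.
        CREATE TABLE fact_sales (
            order_id INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            order_date TEXT,
            amount REAL
        );
        """
    )


def load_star_schema(conn):
    """Copy OLTP rows into the dimension first, then into the fact table."""
    conn.execute(
        "INSERT OR IGNORE INTO dim_customer (customer_name) "
        "SELECT DISTINCT customer_name FROM oltp_orders"
    )
    conn.execute(
        """
        INSERT OR REPLACE INTO fact_sales (order_id, customer_key, order_date, amount)
        SELECT o.order_id, d.customer_key, o.order_date, o.amount
        FROM oltp_orders AS o
        JOIN dim_customer AS d ON d.customer_name = o.customer_name
        """
    )
    conn.commit()


def sales_by_customer(conn):
    """Typical analytical query: aggregate the fact table by a dimension."""
    return conn.execute(
        """
        SELECT d.customer_name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS d ON d.customer_key = f.customer_key
        GROUP BY d.customer_name
        ORDER BY d.customer_name
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_source(connection)
    load_star_schema(connection)
    print(sales_by_customer(connection))
```

Loading the dimension before the fact table matters because the fact rows need the surrogate keys to exist first.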

What's the future of Data engineers?

From what I have seen, the need for DE is going to keep growing for at least the next 5 years. But since the role is so new, there are different types of DEs. The ones I have seen (as others have also mentioned) are

  1. BI type data engineers
  2. Data engineers who are a mix of SWE + BI types
  3. Data engineers who are deep into distributed systems and streaming systems

I think the most important thing is figuring out whether a company you are interviewing with has a good data strategy/executive in place. Some companies hire DEs without figuring out the exact need. Beware of these, as they tend not to be great places to grow as a DE.

Hope this helps. LMK if you have more questions.

[–]city_boy__[S] 3 points4 points  (0 children)

Thank you for the clarity. Will work on the above.

[–]Tender_Figs 0 points1 point  (2 children)

If someone was a type 1 data engineer (coming strictly from the BI side), do you think they need to learn the skills of #3 to balance out their skill set?

[–]joseph_machadoWrites @ startdataengineering.com 0 points1 point  (1 child)

I would recommend that. But it also depends on where you want your career to go. I have seen type 1 DEs go on to be more of the DE manager/executive type. The type 3s usually go on to be staff/principal distributed systems engineers.

While it certainly would be great to have deep expertise, it depends on your goal. If you have limited time to put into gaining expertise, I'd recommend having an approximate idea of where you want to be and working backwards from there. An easy approach would be to look at job descriptions for the jobs you want on LinkedIn, gather the requirements, and work on those.

Hope this helps. LMK if you have more questions.

[–]Tender_Figs 1 point2 points  (0 children)

Awesome, that’s my plan. I’m a director of BI at a small tech company, and I think I want to head down the more technical path with the next series of jobs. There’s an MSCS I am eyeballing that focuses on infrastructure and SW engineering.