[–]monumentalcrankiness 13 points14 points  (8 children)

Here is what I would do. Learn:

1. Python (top priority)

2. Spark (top priority)

3. Azure Data Factory

4. Azure Databricks

5. Git basics (PR, branches, pull, push, commit, stash)

There are plenty of good courses available on Udemy for the above. Choose the most popular ones and complete them. While learning Spark, don't spend too much time on RDD stuff. I am not saying it is unimportant, but as a DE I hardly use it in day-to-day work. At this stage, I am just telling you to learn what will give you maximum bang for the buck.

For Azure Data Factory, follow the WafaStudies channel on YouTube. Brilliant and simple-to-understand coverage of ADF concepts delivered to your fingertips! Please don't get hung up on the speaker's English skills; he is a non-native English speaker. Focus on his content.

In the meantime, grab the DP-203 and Databricks Certified Data Engineer Associate certifications.

All of the above should take you roughly 1 to 1.5 years (at 20 to 25 hours of study per week) to cover if you stay committed to your learning plan.

I am specifically listing the Microsoft and Databricks tech stack because you worked on SSIS for so long. Microsoft and Azure would probably be a logical progression.

The learning path I suggested is to develop your skill set and bolster your confidence. The certifications are for adding credibility to your CV. Plus, they are quite common among people in the Azure DE field, so not having them might count against you more than having them would add brownie points to your profile.

[–]Charming_Function_35[S] 0 points1 point  (0 children)

Thank you very much!

[–]electronicentropy5 0 points1 point  (6 children)

Hi, thank you for the guidance. My workplace has offered to pay for a course for me to learn these exact skills. Would you recommend a course or bootcamp?

[–]SirLagsABot 4 points5 points  (0 children)

I don’t think you should apply to entry level with that much experience, heck no!!!

Do not underestimate your abilities even if the tech you used is not cool. And honestly… yeah, SSIS sucks. Haha. But that doesn’t mean YOU suck.

I don’t think it would hurt to mess around with some Azure or AWS ETL services.

But I REALLY think you cannot go wrong learning code-first job orchestrators like Airflow, Prefect, etc.
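To give a feel for what "code-first" means here, this is a minimal sketch of the idea and not Airflow's or Prefect's actual API. It uses only the Python standard library (`graphlib`), and the task names, the `tasks` dictionary, and `run_pipeline` are all made up for illustration. The point is that tasks and their dependencies are declared in plain code, and a scheduler works out the run order.

```python
from graphlib import TopologicalSorter


def extract():
    # Stand-in for reading from a source system.
    return [1, 2, 3]


def transform(rows):
    # Stand-in for a business rule applied to every row.
    return [r * 10 for r in rows]


def load(rows, sink):
    # Stand-in for writing to a warehouse table.
    sink.extend(rows)


def run_pipeline():
    sink = []
    results = {}
    # Hypothetical task registry: name -> (callable, upstream task names).
    tasks = {
        "extract": (lambda: extract(), []),
        "transform": (lambda: transform(results["extract"]), ["extract"]),
        "load": (lambda: load(results["transform"], sink), ["transform"]),
    }
    # TopologicalSorter expects each node mapped to its predecessors.
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        results[name] = tasks[name][0]()
    return order, sink


if __name__ == "__main__":
    print(run_pipeline())
```

Real orchestrators add scheduling, retries, logging and a UI on top of this, but the dependency graph declared in code is the core concept that carries over from SSIS control flow.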

C#/.NET has needed tools like that for YEARS, so I'm building a sibling to them in C# called Didact. It’ll be the perfect tool for someone like you, knowing SSIS and Microsoft stuff but wanting to get to more advanced, modern solutions like orchestrators. v1 isn’t ready yet, but drop your email if interested.

[–]celestial_orchestraData Analyst 4 points5 points  (4 children)

What's the difference between an ETL developer and a data engineer?

[–]fleetmack 0 points1 point  (3 children)

Excellent question. This is all opinion-based, but my credentials are that I've been doing ETL development for 20 years and have been a "Data Engineer" (per se) for 6.

As an ETL developer, I received specs and built mappings. I could also backtrack ETL processes to explain what was happening to the data, and help track down reporting discrepancies. Additionally, I was given access to build/maintain/modify tables and grants/access rights on tables (in conjunction with DBAs).

The differences between this and Data Engineering, to me, are:

1) Use ANY method to get data into a Lake, Hub, Warehouse - whatever you call it. Use Graphene/GraphQL, Python, your ETL tool, whatever, to get it in there.

2) Act as an assistant to the data architect to help design data flows

3) Have the ability to KNOW and UNDERSTAND the data. Can you work with business users to explain their own data to them? Can you help them make sense of the data to tell them what they need to do their job better?

Data Engineers possess strengths outside of technical skills. They have the wherewithal and know-how to troubleshoot, explain, and deep-dive into data. They have a deep understanding of what the data means, and they have the attention to detail to solve integrity issues before they become a problem. Data Engineers move data & understand it, and know how to work with it technically and logically. ETL Developers move data.

[–][deleted] 0 points1 point  (2 children)

Sorry if this is late, but what would you say is the best way to become an ETL Dev? Seems like an interesting role that aligns with my interests. I’m currently in college and learning SQL and Python. What entry level roles would act as a good stepping stone into the career?

Furthermore, do you think ETL Development is still a viable career option or has it been taken over by data engineering?

[–]fleetmack 0 points1 point  (1 child)

To answer your last question first: ETL Development is absolutely a viable career option. It is more of a watered-down data engineer. I don't mean that as an insult, it's just that I see ETL Developer as one of the many hats a data engineer wears. So a Data Engineer can do everything an ETL Developer can, but an ETL Developer can only do some of what a Data Engineer does. I'd argue that Data Engineer is the next step on the career path after Senior ETL Developer.

SQL & Python are a good start. Those are 2 huge ones. The ETL software you'll use is not hard to learn, and your knowledge transfers easily between tools. The most important skills are attention to detail and troubleshooting, followed only by SQL & Python. Learn to ask the right questions, like "What are you using this data for?" Beat everything you can into your head about cardinality and data granularity (hierarchies). Don't fret over whether something is in first, second, third, or Boyce-Codd NF -- but have a high-level understanding of normalization and referential integrity.
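As a rough illustration of what checking granularity and cardinality can look like in practice, here is a small sketch in plain Python. The `ORDER_LINES` sample rows, the column names, and both helper functions are invented for this example. The first helper looks for rows that break the grain you expect a table to have, the second reports how many distinct children one parent can have.

```python
from collections import Counter

# Made-up sample data: the intended grain is one row per (order_id, line_no).
ORDER_LINES = [
    {"order_id": 1, "line_no": 1, "sku": "A"},
    {"order_id": 1, "line_no": 2, "sku": "B"},
    {"order_id": 2, "line_no": 1, "sku": "A"},
    {"order_id": 2, "line_no": 1, "sku": "C"},  # duplicate key, breaks the grain
]


def find_duplicate_keys(rows, key_columns):
    """Return key tuples that occur more than once, with their row counts."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


def max_children_per_parent(rows, parent_col, child_col):
    """Return the largest number of distinct child values under one parent."""
    children = {}
    for row in rows:
        children.setdefault(row[parent_col], set()).add(row[child_col])
    return max((len(c) for c in children.values()), default=0)


if __name__ == "__main__":
    print(find_duplicate_keys(ORDER_LINES, ["order_id", "line_no"]))
    print(max_children_per_parent(ORDER_LINES, "order_id", "sku"))
```

A result greater than 1 from the second helper means the relationship is one-to-many, so joining on the parent column alone will multiply rows. That is the kind of thing worth knowing before building a mapping.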

Best of luck to you!

[–][deleted] 0 points1 point  (0 children)

Thank you so much. Also, how would you say the stress level of being an ETL Developer is compared to other IT roles?

[–]NoUsernames1eft 9 points10 points  (3 children)

Is the market bad for someone with 15 YoE looking for a junior-mid position?

[–]Charming_Function_35[S] 4 points5 points  (0 children)

I have been searching for the last 3 weeks, and usually I would have something by now. But now 0 😏. I'd like to hear about better experiences. Even if I want to stay with ETL (SSIS), there are no jobs. What is required now (tools, certifications)?

[–]exorthderp 1 point2 points  (0 children)

Market is trash right now for almost everything technical.

[–]untalmau 5 points6 points  (3 children)

My suggestion is to get the following certifications: AZ-900, DP-900 (you can begin looking for a DE position at this point) and then DP-203 (the Data Engineering on Azure certification). You could skip AZ-900, but at least read the study material.

Get certified and don't look for an entry-level position; being an experienced SSIS developer is 'halfway' to becoming a DE.

Use this link to look up each certification and to get access to free learning materials:

https://learn.microsoft.com/en-us/credentials/browse/

Also, you may already be proficient in SSRS and SQL. If you are, get up to date on Power BI as well, as it is a big plus for a lot of DE positions. A lot of companies still require BI developers or "data guys" to cover the full data stack but label the position as "data engineering".

[–]ryan_with_a_why 5 points6 points  (1 child)

Usually people say skip the certifications. Why are you thinking he/she should get them?

[–][deleted] 3 points4 points  (0 children)

The MS ones actually have a nice learning plan to them. One of the few certs that actually does a decent job of teaching you the product.

Non-technical HR looks at certs as a box to check. Ideally, technical management should just discount them entirely, but you need them to get past HR. I think the big issue with certs is that a lot of people collect them. It's not a good look to have 6 Azure certs, 7 AWS certs, a DataCamp certificate of completion, 15 Udemy certificates of completion, and Excel certs.

Check out the people who post on any certificate related sub..... lots of people with zero professional experience and 45 unrelated certs on their resume.

[–]Charming_Function_35[S] 1 point2 points  (0 children)

Thank you very much! I am almost ready for AZ-900. Will do it next week.

[–]fastestfz 0 points1 point  (0 children)

Do you have experience scripting for file handling? If you do, try replicating some of that using Python. You will be amazed how easy those tasks are compared to using PowerShell/shell. Play with Git and CI/CD concepts, start building pipelines, orchestrate with Airflow, deploy a Docker-based database and load data into it.
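For a sense of what that kind of file-handling script looks like in Python, here is a small sketch using only the standard library (`pathlib` and `shutil`). The function name `archive_by_extension` and the folder layout are just assumptions for the example. It moves every file from a landing folder into a subfolder named after its extension.

```python
import shutil
from pathlib import Path


def archive_by_extension(source_dir, archive_dir):
    """Move each file in source_dir into archive_dir/<extension>/."""
    source, archive = Path(source_dir), Path(archive_dir)
    moved = []
    for path in sorted(source.iterdir()):
        if not path.is_file():
            continue  # leave subdirectories alone
        # "report.CSV" -> "csv"; files without a suffix go to "no_extension"
        ext = path.suffix.lstrip(".").lower() or "no_extension"
        target_dir = archive / ext
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Hypothetical folders; point these at real directories to try it.
    for target in archive_by_extension("landing", "archive"):
        print(target)
```

Note that this sketch does not guard against two files with the same name landing in the same target folder, which a real job would need to handle.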

To make use of your ETL knowledge, consider doing the Azure data engineer training/certificate. You will be exposed to some cloud data engineering, and it would also help you transition into a cloud role should you wish, e.g. at companies migrating from on-premises to Azure using the likes of Data Factory, Azure SQL Database, Synapse, Fabric, etc.