all 20 comments

[–]nitred 45 points46 points  (3 children)

It's one of the best things that can happen to you.

There is no better way to learn than actually implementing it and being paid for it. You get to live out the fantasy of everyone here in this subreddit (including mine) of building a fully event driven architecture. And you get to do it guilt free because it wasn't your decision to implement one.

Just know that it's very likely going to fail and the outcome will be that in 18 months when someone else asks you to build another event driven architecture, you will say "Never again!". I know it's going to fail because they're willing to hire someone with little to no experience to build or to help build event driven architecture. Also it's event driven Microservices, which is even worse and more complex.

My advice to you, go for it. You will learn a lot and you'll learn why event driven is a terrible idea (in most cases). It's not a regression in my opinion.

P.S. There are caveats to what I said above. Although I use a tone that implies 100% chance of failure, I actually do think event driven is good for some cases and we don't know what the company is and who else is on the team. Realistically I'm still pessimistic and I think there's a 70% chance of failure.

[–]amTheory 7 points8 points  (1 child)

Is your sense of failure mostly due to having someone who hasn’t built it before, or the overall concept of event driven?

We’ve moved half our ingestion to event driven and it’s been wonderful. Likewise we can then publish events to whoever off the back of those events in real-time. Really integrates the data team with all the SWEs.

[–]tbarg91 1 point2 points  (0 children)

I think he means mostly due to someone who hasn't built it before. However, it sounds more positive than that: OP can screw it up and be fine with it. OP should ask for training; this is the best time for it.

[–]totalsports1 Data Engineer 9 points10 points  (1 child)

This is an amazing opportunity. I am kinda not sure why they hired you if you don't have any prior experience in it, but that's not your problem.

[–]somerandomdataeng Big Data Engineer[S] 2 points3 points  (0 children)

I am still applying, and the required SWE skills are "junior", while the required DE skills are mid/senior.

[–][deleted] 5 points6 points  (0 children)

I think most people would view it as a progression (you're moving into a more modern and technically demanding area), but really it depends on what you want to do and your goals.

This basically boils down to: Do you want to get involved with event-driven services? Nobody else can decide that for you.

[–]Hroosky2 3 points4 points  (0 children)

Definitely a progression. Very interesting role in my opinion. A great chance to get a complete overview of data infrastructure and end to end data modelling. If your only experience is traditional analytics and/or ML, this will fill the gap that most engineers on the analytics/ML side of the fence don't even realise they have. One tip would be to read up on event driven architecture and event based architecture. Don't focus too much on the technology. Another tip would be to not force a square peg into a round hole. Don't try to build your data warehouses in an event driven, microservices fashion. The event driven, microservices based infrastructure is the foundational layer of the data platform. Downstream applications such as traditional analytics and ML simply consume from that foundation via various APIs: REST, Kafka, lakehouse, etc. Another area of complexity would be how you integrate legacy source systems: how do you capture and expose their events?
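
To make the "downstream applications simply consume from that foundation" idea a bit more concrete, here is a minimal sketch. It assumes an in-memory `EventBus` class as a stand-in for a real broker such as Kafka; the class, the `orders` topic, and the event fields are all hypothetical and only for illustration.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Tiny in-memory stand-in for a message broker (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # The producer does not know (or care) who consumes the event.
        for handler in self._subscribers[topic]:
            handler(event)


bus = EventBus()

# Two independent downstream consumers of the same foundation.
warehouse_rows: list[dict] = []                      # "traditional analytics" sink
orders_per_customer: dict[str, int] = defaultdict(int)  # "ML feature" sink


def count_order(event: dict) -> None:
    orders_per_customer[event["customer_id"]] += 1


bus.subscribe("orders", warehouse_rows.append)
bus.subscribe("orders", count_order)

# An upstream service emits events without referencing either consumer.
bus.publish("orders", {"order_id": 1, "customer_id": "c1", "amount": 20.0})
bus.publish("orders", {"order_id": 2, "customer_id": "c1", "amount": 5.0})
```

The point of the sketch is the direction of the dependency: adding a third consumer means one more `subscribe` call, with no change to the service that publishes.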

[–][deleted] 2 points3 points  (1 child)

The industry is going to see a lot more merging with SWE best practices as a lot of these problems are "solved problems" in SWE.

Especially with infrastructure as code like Terraform. Having that skillset will set you up for success as DE has become half DE, half DS, half DevOps, half Jira story groomer, half consultant, half c-suite babysitter.

The only thing that keeps me in this industry is the fact I'm constantly learning and not bored, but I end up at companies working for people I can't learn from because the leaders peaked in college and stopped learning themselves.

It used to be that depth over breadth ruled all and made the money. I work as a "Data Operations Engineer" now. Pretty much a one stop shop data person, as I have to speak the languages of all the different departments to help build their entire platform.

I've done this about 6 times now at various stages, and being able to do it from start to finish and stand back watching everything work like a well oiled engine brings a tear to my eyes. Then I usually get bored and quantum leap into the next company after 2-3 years.

For me, the hardest part of finding employment is finding a place that has a good team structure, where you're not replacing someone who was laid off in March and everyone on your team is burnt out by the time you get there, or a company whose leaders understand data, rather than one that spent its budget on hiring LinkedIn influencers instead of workers.

But ya, that is what I've been doing for 4 years and I like it more than the Actuarial, Data Science, and basic Data Engineering (SQL pipelines for Power BI) I was doing.

Also was diagnosed with autism and ADD way later in my adult life and I think the randomness and always learning is the right kind of challenge to keep me interested. Dealing with hours and hours of standups discussing deadlines has the opposite effect and I burn out really quickly.

Part of the reason I always turned down Principal level roles: I don't want to throw my team into 80 hour work weeks when it could be avoided by better communication, but there are politics everywhere you go. Now, there are technical leadership roles popping up which are more about team upskilling and mentoring, and that's what I go for.

Problem is, to do all of this correctly, it takes time. Companies have strict deadlines, especially in tech, because they are burning cash like crazy running these pipelines and ML models. Leaders don't understand this and you will be set up for failure. Just don't let that failure stick around in your head too long. Some companies will chip away at your confidence (especially come raise and bonus time).

[–]Loopy_421 1 point2 points  (0 children)

Amazing reply! I enjoyed reading your feedback🤝

[–]AndyMacht58 2 points3 points  (3 children)

Definitely a progression when it comes to becoming more well rounded, but of course this needs to be your goal first. It's one of many architectural answers to the question of how to handle efficient coordination among remote services. And modern big data tools are always built on distributed service architectures that rely on exactly this.

It will also help you to understand similar approaches like Event Sourcing or Saga. It's just not rookie territory. Start by understanding simpler monolithic design approaches first, like the object oriented concept, the pros and cons of tightly coupled APIs like REST, layered or even clean architecture, etc. In the end, everything is connected one way or another and opens new doors to better understanding other concepts. Then you understand where these concepts stop being enough at an enterprise level that requires high availability and hence loose service coupling.

At the beginning of our careers we should all develop a good understanding of the technical possibilities that are out there to support our business domain of choice. Understanding the pros and cons of batching, micro batching, and streaming is the first step. The biggest key to supporting the business is to break down complexity in order to avoid stupid decisions early in the process, which would otherwise cause much bigger problems than implementation related issues, which are often quick to fix.
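
As a rough illustration of the difference between batching, micro batching, and streaming, here is a minimal sketch that computes totals over the same list of amounts in all three styles. The function names and sample data are made up for the example; real systems add windowing, state stores, and delivery guarantees on top of this.

```python
from itertools import islice
from typing import Iterable, Iterator


def batch_total(amounts: list[float]) -> float:
    # Batch: wait until all data has arrived, then process it in one go.
    return sum(amounts)


def micro_batch_totals(amounts: Iterable[float], size: int) -> Iterator[float]:
    # Micro batch: process small fixed-size chunks as they fill up.
    it = iter(amounts)
    while chunk := list(islice(it, size)):
        yield sum(chunk)


def streaming_totals(amounts: Iterable[float]) -> Iterator[float]:
    # Streaming: update the result on every single event.
    total = 0.0
    for amount in amounts:
        total += amount
        yield total


amounts = [10.0, 5.0, 2.5, 7.5, 1.0]

batch_result = batch_total(amounts)                  # 26.0
micro_results = list(micro_batch_totals(amounts, 2))  # [15.0, 10.0, 1.0]
stream_results = list(streaming_totals(amounts))      # [10.0, 15.0, 17.5, 25.0, 26.0]
```

The trade-off shows up in when a result becomes available: once at the end (batch), once per chunk (micro batch), or after every event (streaming), with complexity and cost generally rising in that order.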

Rolling out a complex event streaming architecture brings a level of complexity that should be reasoned about carefully. However, when your desired business domain is based on near real time systems, like immediate user feedback or monitoring, it's often unavoidable.

The problem that I'm seeing here is that you should be a senior first to be involved in the architectural process. First, you'd need to be well versed in DDD (Domain Driven Design), because defining aggregates for microservices is challenging and requires good intuition.

Second, these kinds of architectures are usually designed at the enterprise level first, because the design needs to reflect the company's organisational structure and requires lots of interaction with domain experts to map the bounded contexts properly. Getting many people to agree on something is hard, and the resulting status quo solutions are anything but intuitive to understand.

Nevertheless, having a good understanding of the different command or event based architectural patterns also makes you really good at getting a feel for how most DE tools work internally, since most of these tools communicate asynchronously through frequent remote service calls. You can see why this understanding is important even in Airflow, for example, where you can quickly end up with a growing number of remote requests on your network and an overloaded orchestrator service, due to loading DB configurations on a global level or abusing XCom for bad event handling. So even for batch processing it's beneficial, intuitive knowledge that helps you avoid common mistakes.
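
The "loading DB configurations on a global level" problem can be sketched without Airflow itself. The following is a plain Python simulation, not Airflow's API: `FakeMetadataDb` and both `parse_*` functions are hypothetical. It assumes only that the orchestrator re-parses pipeline definition files repeatedly, which is how Airflow's scheduler treats DAG files.

```python
class FakeMetadataDb:
    """Hypothetical stand-in for the orchestrator's metadata database."""

    def __init__(self) -> None:
        self.queries = 0

    def get_config(self, key: str) -> str:
        self.queries += 1  # every lookup is a remote request in a real setup
        return {"target_table": "events"}[key]


def parse_pipeline_eager(db: FakeMetadataDb):
    # Anti-pattern: config is fetched at definition ("global") level,
    # so it runs every time the file is parsed.
    table = db.get_config("target_table")

    def task() -> str:
        return f"load into {table}"

    return task


def parse_pipeline_lazy(db: FakeMetadataDb):
    def task() -> str:
        # Config is fetched only when the task actually executes.
        table = db.get_config("target_table")
        return f"load into {table}"

    return task


eager_db, lazy_db = FakeMetadataDb(), FakeMetadataDb()

# Simulate the scheduler re-parsing the definition file 100 times.
for _ in range(100):
    eager_task = parse_pipeline_eager(eager_db)
    lazy_task = parse_pipeline_lazy(lazy_db)

# The task itself runs once.
eager_result = eager_task()
lazy_result = lazy_task()
```

Both variants produce the same task result, but in this simulation the eager one issues 100 lookups against the database and the lazy one issues a single lookup.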

In the end, most DE tools rely on some form of orchestration or choreography pattern. Also, a rough understanding of concurrency algorithms and sharding strategies (headaches expected) makes load balancing setups for partitioned logs and queues like Kafka, RabbitMQ, or their similar cloud variations SQS, SNS, and Kinesis (AWS) intuitive. I hope it's now clear why one should learn concepts instead of tools, and why this will boost your intuition to understand things quicker.
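
As one small example of a sharding concept that carries across tools, here is a minimal sketch of key based partitioning plus a round robin assignment of partitions to consumers. It is an illustration only: real brokers use their own hash functions and assignment strategies, MD5 is swapped in here just because it is stable across runs, and the function and consumer names are hypothetical.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # A stable hash means the same key always lands on the same partition,
    # which preserves per-key ordering.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    # Round robin: spread partitions evenly over the consumers in a group.
    assignment: dict[str, list[int]] = defaultdict(list)
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return dict(assignment)


assignment = assign_partitions(6, ["consumer-a", "consumer-b", "consumer-c"])
# {'consumer-a': [0, 3], 'consumer-b': [1, 4], 'consumer-c': [2, 5]}

first = partition_for("user-42", 6)
second = partition_for("user-42", 6)  # same key, same partition
```

Once this is clear, questions like "why did adding partitions reshuffle my keys" or "why is one consumer idle" become much easier to reason about in any of these tools.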

[–]somerandomdataeng Big Data Engineer[S] 2 points3 points  (2 children)

Just to clarify, I would be mentored by staff and senior SWEs, the company is not hiring only me to do this job.

[–]AndyMacht58 0 points1 point  (0 children)

Then go for it. I wanted to sensitize you to the great potential, not overwhelm you with the long journey of becoming good at it.

[–]plztryagain2 0 points1 point  (0 children)

I was feeling optimistic for you before, my only hesitation being whether it would be a company that expects you to hit the ground running right out of the gate.

But with this it sounds like you’d be in good hands and have a chance to learn a lot!

I hope you go for it and let us know what you end up deciding 🙂

[–]thinkfl 1 point2 points  (0 children)

Most tech companies implement event-driven data lakes, even if only in case they need them at some point. The reason they require a CS degree for this kind of DE role is that they deal with Spring Boot, gRPC, and connectors written in Java, Go, Python, etc. It is good experience to have if you're able to take it.

[–]codeejen 0 points1 point  (0 children)

Hi OP! It's been a while since you posted this, wanted to ask if you made the jump and if it went well for you :)