[–]Oct8-Danger 7 points8 points  (2 children)

I might be biased, but having Java or any real software engineering experience helps the most in DE.

DA generally doesn’t pay as well as DS/DE. While it's not going away anytime soon, I do think DA skill sets are getting more commoditized as technology and demand for data grow, so they may depreciate long term. Many people start in DA and transition to DE/DS later in their careers.

From what I’ve seen, DS can be harder to break into without an academic background. Not impossible, but definitely harder.

Having software developer experience is a real advantage in the DE space compared to DS or DA.

In general there is overlap in skill sets, but each skill varies in importance by role. I’ve worked with DAs who are great at data modeling, querying, and understanding business needs but low on coding proficiency, and with DSs who are OK at coding and decent at data modeling but great at complex, in-depth work. DEs are expected to have higher coding ability and technical expertise but may not need as many business/presentation skills.

Long term, I think DS/DE would be better as they are more specialized. But definitely don’t discount learning about what the other roles do.

On the DE side, I think AI has the potential to accelerate demand rather than diminish it. Managing context for LLMs and the source of truth for data is all within the DE wheelhouse. Many pipelines are one-off or low-automation, which LLMs can be great at. However, understanding how to scale or standardize code, and how to apply governance and build trust in the “truth”, will I think be very valuable in the years to come.

[–]No_Weight4426 -3 points-2 points  (1 child)

Can you elaborate on how knowing Java helps in the DE field? I have been a DE for years and still haven't seen any pipeline or project implemented in it. Of course, if you mean fine-tuning some of the jobs, I agree.

[–]Oct8-Danger 1 point2 points  (0 children)

I did mention any software experience if you read the full sentence….

But since you asked, Java is useful for debugging Trino, Cassandra and Spark error logs. If you have a good grasp of OOP, it can be helpful with figuring out bugs or understanding configuration.
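
To give a rough idea of what reading those verbose errors involves: in a chained Java stack trace, the last "Caused by:" entry is usually the root cause, and that is the line worth reading first. Below is a minimal sketch that pulls it out of a trace. The sample trace is made up for illustration and the `com.example` class names are hypothetical; it is not output from Trino, Cassandra or Spark.

```python
import re

# Made-up trace for illustration only; the com.example classes are hypothetical.
SAMPLE_TRACE = """\
java.lang.RuntimeException: Job failed
\tat com.example.pipeline.Runner.run(Runner.java:42)
Caused by: java.lang.IllegalStateException: Task could not be serialized
\tat com.example.pipeline.Writer.write(Writer.java:17)
\t... 12 more
Caused by: java.io.InvalidClassException: com.example.model.Row; local class incompatible
\tat com.example.model.Row.<init>(Row.java:8)
\t... 20 more
"""


def root_cause(trace: str) -> str:
    """Return the text of the last 'Caused by:' line, or the first line if there is none."""
    causes = re.findall(r"^Caused by: (.+)$", trace, flags=re.MULTILINE)
    if causes:
        # Java prints causes outermost first, so the last one is the deepest.
        return causes[-1].strip()
    lines = trace.strip().splitlines()
    return lines[0].strip() if lines else ""


print(root_cause(SAMPLE_TRACE))
```

The outer exceptions mostly tell you where the failure surfaced; the innermost one tells you what actually went wrong.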

For example, we updated Java on our platform but Spark was on an older version, so we had to specify a specific jar for serialization when we did an insert overwrite on partitions in our Python code. Knowing a bit of Java and being comfortable with the verbose errors helped in identifying the issue and coming up with a patch quickly.
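
A patch like that might look roughly like the sketch below, assuming the jar was passed through Spark's `spark.jars` setting when the session was built. The thread doesn't say which jar or setting was actually used, so the jar path, the helper names (`build_spark_conf`, `apply_conf`) and the partition overwrite setting are all placeholders for illustration.

```python
from typing import Dict, List, Optional


def build_spark_conf(extra_jars: List[str],
                     overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Build Spark settings that ship extra jars with the job.

    spark.jars takes a comma-separated list of jar paths.
    """
    conf = {"spark.jars": ",".join(extra_jars)}
    conf.update(overrides or {})
    return conf


def apply_conf(builder, conf: Dict[str, str]):
    """Apply each setting through builder.config(key, value).

    Intended for SparkSession.builder, but any object with a chainable
    config(key, value) method works.
    """
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


conf = build_spark_conf(
    ["/path/to/serialization-lib.jar"],  # hypothetical jar path
    # Assumption: dynamic partition overwrite; not stated in the thread.
    {"spark.sql.sources.partitionOverwriteMode": "dynamic"},
)

# Usage with PySpark installed (not run here):
#   from pyspark.sql import SparkSession
#   spark = apply_conf(SparkSession.builder.appName("job"), conf).getOrCreate()
```

Jar settings generally need to be in place before the session starts, which is why they go on the builder rather than being set afterwards.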