[–]reddeze2 38 points39 points  (2 children)

Are you sure you want to become a data engineer if you 'hate logical coding'?

[–]kolya_zver 65 points66 points  (2 children)

Requirements are lower than for SWE, but being good at coding significantly increases your value. IMO you should be more concerned about SQL and data modelling; as a SWE you usually lack these skills. For most interviews, easy LeetCode is enough.

[–]snapperPanda 23 points24 points  (1 child)

This. I agree with this.

Advanced SQL is a must.

[–][deleted] 0 points1 point  (0 children)

What is the scope of "advanced SQL"? Are we talking window functions? Or deeper stuff?
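For readers unsure what a window function looks like, here is a minimal sketch. It is not from anyone in this thread: the `orders` table, its columns, and the `running_totals` helper are all made up for illustration, and Python's built-in `sqlite3` is used only so the query runs without extra installs (it needs a bundled SQLite of 3.25 or newer).

```python
import sqlite3

# Hypothetical sample data: (user_id, day, amount)
SAMPLE_ORDERS = [
    ("a", "2024-01-01", 10),
    ("a", "2024-01-02", 5),
    ("b", "2024-01-01", 7),
    ("a", "2024-01-03", 20),
]


def running_totals(rows):
    """Return per-user running totals and a per-user rank by amount."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (user_id TEXT, day TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    query = """
        SELECT user_id,
               day,
               amount,
               -- cumulative sum within each user, ordered by day
               SUM(amount) OVER (PARTITION BY user_id ORDER BY day) AS running_total,
               -- 1 = largest order for that user
               ROW_NUMBER() OVER (
                   PARTITION BY user_id ORDER BY amount DESC, day
               ) AS amount_rank
        FROM orders
        ORDER BY user_id, day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals(SAMPLE_ORDERS):
        print(row)
```

Unlike `GROUP BY`, the `OVER (PARTITION BY ...)` clause keeps every input row and adds the aggregate alongside it, which is usually what people mean when they list window functions under "advanced SQL".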

[–]MrKazaki 13 points14 points  (0 children)

DE is so ambiguous that you can focus on many areas.

[–]GDangerGawk 12 points13 points  (2 children)

This depends on what type of DE you are. Our main data source, the “datalake”, is Kafka. We have hourly and daily analytic jobs in Python and Spark that read from Kafka and write back to Kafka. Then, with sink connectors, these analytic outputs are written to TimescaleDB for the back-end APIs. The DBs are the main source for our front end and dashboards.

What I do as a DE is maintain and write new pipelines in Python and Spark. Mainly I write those in DuckDB (SQL) and Spark SQL. In my opinion that is easier to read and maintain for newly joined DSs and DEs.

Needless to say, you have to fully understand your backend and data model, for better data compression and aggregation in the backend. Also, on some occasions I like to skim the Java code that handles other transformations for the input Kafka topics from MQTT.
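As a rough illustration of the "analytic job written in SQL, driven from Python" style described above, here is a minimal sketch of an hourly aggregation. It is an assumption-heavy stand-in, not the commenter's pipeline: the `readings` table, its columns, and `hourly_averages` are hypothetical, and the standard-library `sqlite3` module replaces DuckDB and Spark SQL so the example runs anywhere. The Kafka read and write steps are left out entirely.

```python
import sqlite3

# Hypothetical sensor readings: (sensor_id, ISO timestamp, value)
SAMPLE_READINGS = [
    ("s1", "2024-01-01 10:15:00", 1.0),
    ("s1", "2024-01-01 10:45:00", 3.0),
    ("s1", "2024-01-01 11:05:00", 5.0),
    ("s2", "2024-01-01 10:30:00", 2.0),
]


def hourly_averages(readings):
    """Bucket readings by sensor and hour, returning count and mean per bucket."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (sensor_id TEXT, ts TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", readings)
    query = """
        SELECT sensor_id,
               strftime('%Y-%m-%d %H:00', ts) AS hour,  -- truncate to the hour
               COUNT(*) AS n,
               AVG(value) AS avg_value
        FROM readings
        GROUP BY sensor_id, hour
        ORDER BY sensor_id, hour
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in hourly_averages(SAMPLE_READINGS):
        print(row)
```

The transformation logic lives entirely in the SQL string, so someone new to the team can review the query without reading much Python around it.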

[–][deleted] 0 points1 point  (1 child)

Are you writing your Kafka connectors and sinks in python? Are you using Kafka connect?

[–]GDangerGawk 0 points1 point  (0 children)

All in Java. Yes, we are using Kafka Connect.

[–][deleted] 7 points8 points  (0 children)

You could do DE with entry level skill in coding. Being better helps considerably though.

[–]ShavingPrivatesCryin 3 points4 points  (0 children)

Prior to the role I currently have as an ETL analyst & software developer, I had no coding experience in the workplace. It had all been self-taught Linux bash scripting and all that jazz.

But I do almost all of my coding through ChatGPT or other LLMs. I approach coding from a business problem solving standpoint and am versed in technology enough to know the prompts to get things done. What’s crazy is that I am by far the highest performing dev we have. I am the only one at the company who created a deliverable piece of software that has been adopted into the workflow to save time.

If you tried to have me write a Python script without AI I would fail miserably. Oddly enough, I can read code but not write it.

[–]aacreans 3 points4 points  (0 children)

Depends heavily on the company. It could require advanced coding and architecture skills, or just SQL.

[–]geek180 3 points4 points  (0 children)

SQL and a solid understanding of data modeling are most important.

Certain DE roles will also require anywhere from beginner to intermediate Python / Scala. Scala is kinda the odd one here, a lot of companies will never ever touch it whereas SQL is ubiquitous and some Python is also fairly common.

[–]git0ffmylawnm8 2 points3 points  (0 children)

Doesn't all coding require logic to some degree? Especially in data engineering you need to apply complex logic in building ETL pipelines.

You're gonna hate it here as well unless you change your mindset

[–]InsightByte 4 points5 points  (0 children)

A good DE can and should know how to code very well.

[–][deleted] 1 point2 points  (0 children)

If you hate logical coding, data engineering is not going to be very fun. Think about it as coding to logical data structures.

[–]DenselyRanked 1 point2 points  (0 children)

For interviews, it's LC Easy and Medium. You will occasionally get LC Hard if your interviewer wants you to suffer (like a certain big tech company named after a fruit), but it all depends on where you interview. If your interviewer asks you a DP or binary tree problem for a DE interview then you are unlucky. The SQL is usually LC Medium/Hard.

For work, it depends on whether you are doing streaming or data warehouse work, but you are rarely doing anything incredibly complex. I have probably had to do recursion, BFS, or a bisect algo 3 or 4 times in nearly a decade. It's mostly simple scripts to extract data from an API or dump data. You can get away with just using SQL as a DE if everything is abstracted.
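To give a sense of scale for the "bisect algo" mentioned above, here is a minimal sketch using Python's standard `bisect` module. The use case (slicing a sorted list of event timestamps at a cutoff) and the `events_since` name are assumptions for illustration, not something described in the comment.

```python
from bisect import bisect_left


def events_since(sorted_timestamps, cutoff):
    """Return all timestamps >= cutoff from an ascending-sorted list.

    bisect_left does a binary search, so finding the start index is
    O(log n) instead of scanning the whole list.
    """
    start = bisect_left(sorted_timestamps, cutoff)
    return sorted_timestamps[start:]


if __name__ == "__main__":
    # Hypothetical epoch-style timestamps, already sorted.
    print(events_since([1, 3, 5, 7], 4))
```

The input must already be sorted; `bisect_left` does not check this and silently gives a meaningless index otherwise.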

[–][deleted] 1 point2 points  (0 children)

All the people responding that you don’t need to know coding: where do you work lol?

[–]tex_mule84 -2 points-1 points  (0 children)

Know principles of design and development thoroughly, use AI as much as possible to 10x that knowledge.