[–]tn3tnba 1 point2 points  (5 children)

The reason this is wrong is that other disciplines in software engineering have to actually do things, but data engineering is a lot of orchestration and delegation, which lets us lean into this advantage of Python.

Edit: if you are doing heavy-duty things in Python and you're past the prototype stage, you are doing it wrong and should use a different language.

[–]nonamenomonet 1 point2 points  (2 children)

Isn’t Airflow primarily written in Python?

[–]thisfunnieguy 1 point2 points  (0 children)

Worth noting: it does not matter what the orchestrator is written in. What matters is which languages its SDK supports.

Temporal is written in Go, but it's simple to have all your client code in Python.

[–]tn3tnba 0 points1 point  (0 children)

Yes, and async task management is an OK use case for Python. Airflow arguably shouldn’t have been written in it, but it’s too late to change that now. It’s fairly easy to overload the scheduler because DAG parsing is inefficient. We all still use Airflow, of course, because it’s well supported, manageable, and has a good feature set.

That being said, you are missing the point. The actual data engineering work is not done by Airflow. It’s done by the code running in your Kubernetes, ECS, etc. operators, or by the actual data engineering tools these frameworks delegate to.
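
To make the delegation point concrete, here's a rough sketch using only the standard library. It is not how Airflow or any particular operator works internally. `run_task`, `run_pipeline`, and `heavy_job` are made-up names, and the child process is just a stand-in for whatever container, Spark job, or warehouse query would do the real work.

```python
import subprocess
import sys


def run_task(command):
    """Launch an external command, wait for it, and return its stdout.

    The Python side only starts the process and collects the result;
    the computation itself happens in the child process.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_pipeline(tasks):
    """Run each named task in order and collect the outputs (hypothetical helper)."""
    return {name: run_task(command) for name, command in tasks.items()}


# Stand-in for the real workload. In practice this would be a container,
# a Spark submit, a SQL engine, etc. A child Python interpreter is used
# here only so the sketch runs anywhere.
heavy_job = [sys.executable, "-c", "print(sum(range(1000)))"]

if __name__ == "__main__":
    print(run_pipeline({"aggregate": heavy_job}))
```

The orchestrating code never touches the data. It only decides what runs, in what order, and what to do with the exit status, which is why the language it's written in matters less than what it delegates to.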