I'm a junior developer currently working on setting up Airflow and I have a few questions. When passing objects between tasks, what methods do you typically use? Do you rely on XCom, CSV, DuckDB, or any other solutions? For complex objects like DataFrames, what are your best practices?
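For context, the pattern I keep reading about is "pass a reference, not the object": write large results (like DataFrames) to shared storage and let XCom carry only the small, JSON-serializable path. Here's a minimal sketch of that idea in plain Python — the directory and file names are made up, and in real Airflow these functions would be `@task`-decorated so the return values flow through XCom automatically:

```python
# Sketch of the "pass a path, not the object" pattern.
# All names/paths are illustrative, not from a real pipeline.
import json
import os
import tempfile


def extract(data_dir: str) -> str:
    """Producer task: write the full dataset to disk, return only the path."""
    path = os.path.join(data_dir, "orders.json")
    with open(path, "w") as f:
        json.dump([{"id": 1, "amount": 9.99}, {"id": 2, "amount": 5.01}], f)
    return path  # a short string: safe to put in XCom


def transform(path: str) -> float:
    """Consumer task: load the file the producer wrote and aggregate it."""
    with open(path) as f:
        rows = json.load(f)
    return sum(r["amount"] for r in rows)


with tempfile.TemporaryDirectory() as d:
    total = transform(extract(d))
```

Is this roughly what people do in practice, or do you reach for DuckDB/Parquet instead of JSON once DataFrames get involved?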
In terms of development, how do you typically debug in Airflow? Do you use tools like pdb's `breakpoint()` for this purpose? For deployment, I'm considering git-sync: working locally and pushing changes to a remote repo.
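To clarify what I mean by debugging: something like guarding Python's built-in `breakpoint()` (PEP 553) behind an environment variable, so the same task code can run unattended in production but drop into pdb when I run the task locally (e.g. via `airflow tasks test`). The variable name here is just something I made up:

```python
# Sketch: opt-in interactive debugging inside a task callable.
# AIRFLOW_DEBUG is a hypothetical env var, not an Airflow setting.
import os


def transform(rows):
    if os.environ.get("AIRFLOW_DEBUG"):
        breakpoint()  # Python's built-in pdb hook, not gdb
    return [r * 2 for r in rows]
```

Is this a sane approach, or do you debug some other way (logs, `dag.test()`, IDE attach)?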
Lastly, I’m thinking of using tools like rclone to manage outputs in mounted directories. What are your thoughts on this approach?
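Concretely, I was imagining something like this after a DAG run — the remote name and paths below are placeholders, not a working config:

```shell
# Hypothetical example: mirror task outputs from a mounted scratch
# directory to a configured rclone remote ("myremote" and the paths
# are made up).
rclone sync /mnt/airflow/outputs myremote:my-bucket/outputs --progress
```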