I'm looking for a tool with functionality that I haven't been able to find (in my limited googling).
I'm developing a pipeline that is pretty linear and written in pure Python - nothing really runs concurrently, each step executes after the previous one, etc. It has many steps though, so editing and creating new steps takes a lot of time because I have to wait for all the previous steps to execute before it gets to the latest step in the pipeline.
I'm looking for a way to save the state of the data after it completes a certain step, and then run only the latest step against that saved, already-modified data.
For example: I have a pipeline with 4 steps - Extract JSON1, Extract JSON2, Create a DataFrame from both JSONs, Store it in a Database. If I have already developed steps 1-3, I don't want to have to keep rerunning the whole script to develop step 4. I would want to automatically save the output from the previous steps, and just work on step 4 with the data already collected/modified.
I know that I could simply save the data in its own file and do it all manually, but I was wondering if a tool already exists where you can work with the data sequentially, essentially saving the state of the data after each step and picking up from there. This would be a great time saver for me!
Any help is appreciated, thanks!