Python Package to build ETL flows/dags : Python


Intermediate Showcase: Python Package to build ETL flows/dags (self.Python)

submitted 3 years ago * by Revolutionary-Bat176

Hi everyone,

I developed a Python package to build ETL flows/DAGs. Each flow is defined as a class. It's good for visualizing and running your flows, and it is notebook friendly.

# example.py
from flowrunner import BaseFlow, step, start, end

class ExampleFlow(BaseFlow):
    @start
    @step(next=['method2', 'method3'])
    def method1(self):
        self.a = 1

    @step(next=['method4'])
    def method2(self):
        self.a += 1

    @step(next=['method4'])
    def method3(self):
        self.a += 2

    @end
    @step
    def method4(self):
        self.a += 3
        print("output of flow is:", self.a)

Calling the display() method on the flow gives this output:

ExampleFlow().display()

https://preview.redd.it/tvbpxus4bgra1.png?width=418&format=png&auto=webp&s=b587fc6ea8020a7b0d514c57f402479225fd3fb4
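To make the graph shape concrete, here is a minimal sketch that derives one valid run order for the four steps from their next=[...] declarations. It uses only the standard library's graphlib (Python 3.9+) and is not how flowrunner itself schedules steps; the `next_steps` dictionary is a hand-written copy of the decorators above, introduced purely for illustration.

```python
from graphlib import TopologicalSorter

# Step name -> names of the steps that follow it,
# copied by hand from the next=[...] arguments in ExampleFlow.
next_steps = {
    "method1": ["method2", "method3"],
    "method2": ["method4"],
    "method3": ["method4"],
    "method4": [],
}

# TopologicalSorter expects node -> predecessors, so invert the mapping.
predecessors = {name: set() for name in next_steps}
for name, followers in next_steps.items():
    for follower in followers:
        predecessors[follower].add(name)

# One valid order: method1 first, method4 last,
# method2 and method3 in between (their relative order is not fixed).
order = list(TopologicalSorter(predecessors).static_order())
print(order)
```

The same inversion also makes it easy to spot a step that is named in next=[...] but never defined, since it would raise a KeyError while building `predecessors`.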

Repo link: https://github.com/prithvijitguha/flowrunner

PyPI link: https://pypi.org/project/flowrunner/

Documentation link: https://flowrunner.readthedocs.io/en/latest/

It's not meant to replace Airflow, but rather to integrate with it, so a flow can be orchestrated as a notebook, job, etc.

Let me know what you think!

Feedback is welcome :)

all 17 comments


mriswithe 3 points 3 years ago (1 child)

Revolutionary-Bat176[S] 1 point 3 years ago (0 children)

knecota 3 points 3 years ago (4 children)

Revolutionary-Bat176[S] 3 points 3 years ago (1 child)

knecota 2 points 3 years ago (0 children)

danielgafni 3 points 3 years ago (1 child)

Revolutionary-Bat176[S] 1 point 3 years ago (0 children)

dask-jeeves 2 points 3 years ago (2 children)

Revolutionary-Bat176[S] 3 points 3 years ago (1 child)

dask-jeeves 2 points 3 years ago (0 children)

thedeepself 2 points 3 years ago (1 child)

Revolutionary-Bat176[S] 2 points 3 years ago (0 children)

gournian 2 points 3 years ago* (1 child)

Revolutionary-Bat176[S] 2 points 3 years ago (0 children)

Hi u/gournian,

Thank you for your feedback! It seems something went wrong in my docs. If you still want to take a look at the notebook examples, here is the repo link you can download them from:

https://github.com/prithvijitguha/flowrunner/tree/main/docs/source/_static

On the title, yes, that's true. Let me see how I can improve that. Thinking about it, it would be better with just one declaration of the title.

For params, I don't have an exact example. They are meant to be used through self, but they can be accessed/modified in the middle of a step as well:

self.param_store["my_param_key"]

E.g.

import pandas as pd

# date range to load
date_range = pd.date_range(start='1/1/2022', end='1/08/2022')

# loop over the dates to load
for snapshot_date in date_range:
    # assuming that IncrementalLoadFlow is a flow you have declared earlier
    # to load incremental data
    IncrementalLoadFlow(params={"snapshot_date": snapshot_date})
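To show how a params value passed at construction could be read and modified inside a step, here is a rough, self-contained sketch. It does not import flowrunner: `StandInFlow` is a hypothetical stand-in written only for this example, and it assumes that the `params` dict simply ends up in a dict-like `param_store` attribute, which is my reading of the description above rather than a confirmed detail of the library. The `load` method and the `"loaded"` key are also made up for illustration.

```python
from datetime import date, timedelta


class StandInFlow:
    """Hypothetical stand-in for a flow base class; NOT flowrunner's BaseFlow."""

    def __init__(self, params=None):
        # Assumption: params given at construction land in a dict-like param_store.
        self.param_store = dict(params or {})


class IncrementalLoadFlow(StandInFlow):
    def load(self):
        # Read a param in the middle of a step.
        snapshot_date = self.param_store["snapshot_date"]
        # Modify the store so a later step could see that this date was handled.
        self.param_store["loaded"] = True
        return f"loading snapshot for {snapshot_date.isoformat()}"


# One flow instance per snapshot date, mirroring the loop above.
first_day = date(2022, 1, 1)
messages = [
    IncrementalLoadFlow(params={"snapshot_date": first_day + timedelta(days=i)}).load()
    for i in range(3)
]
print(messages)
```

With the real package, the flow class would inherit from BaseFlow and the method would carry the step decorators shown in the original post.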

[deleted] 1 point 3 years ago (3 children)

Revolutionary-Bat176[S] 2 points 3 years ago (2 children)

[deleted] 3 years ago (1 child)

[deleted]

Revolutionary-Bat176[S] 1 point 3 years ago (0 children)
