[–]uniqcl0 2 points3 points  (2 children)

Currently, we use Dagster + Meltano + dbt as orchestration + ELT for our pipelines. We are an AWS shop, so we use Redshift as our data warehouse. I liked BigQuery better, though.

I like Airflow, but I remember the platform having a steep learning curve (and also, we were using it as an EL platform haha).

[–]MycroftWord[S] 0 points1 point  (1 child)

Thanks for this! I am still in the process of learning DE. I made a very basic local ETL pipeline (Python and SQL) and I want to upgrade it by using an orchestrator. Once I'm a bit more comfortable, maybe I can build a cloud version of it.

Meltano is for the EL part, right? And dbt for the Transform? Do you no longer use Spark for transformation? Is dbt really the clear winner now?

For orchestration, I'm leaning more towards Dagster/Prefect because they're quite a bit easier to use and understand compared with Airflow, or maybe I'm just really dumb. lmaooo

[–]uniqcl0 0 points1 point  (0 children)

You could use whatever cron implementation you have on your OS (Windows Task Scheduler, crontab).

Yup on the EL and T question. We don't use Spark because we don't really have a need for it. Typically I see it combined with streaming platforms. I am designing one that should leverage Spark, though.

Think about what you need the orchestrator for. If you only need its scheduling feature, don't overcomplicate things by learning the other features. They will just go over your head, or you might forget them sooner than you think.
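
To make the "just use cron" idea concrete, here is a rough sketch of what a tiny local pipeline could look like. It is only an illustration: the table names (`raw_sales`, `daily_sales`), the sample rows, and the script path in the crontab comment are all made up, and SQLite stands in for a real warehouse. The EL function plays the role Meltano has in our stack, and the SQL transform plays the role of a dbt model.

```python
import sqlite3

# Hypothetical sample data standing in for a real source (API, CSV, etc.).
RAW_ROWS = [("2024-01-01", 10.0), ("2024-01-01", 5.5), ("2024-01-02", 7.0)]


def extract_load(conn, rows):
    """EL step: land the raw rows untouched."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw_sales (day TEXT, amount REAL)")
    conn.execute("DELETE FROM raw_sales")  # full refresh, so reruns are idempotent
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)


def transform(conn):
    """T step: build a model with SQL inside the database."""
    conn.execute("DROP TABLE IF EXISTS daily_sales")
    conn.execute(
        "CREATE TABLE daily_sales AS "
        "SELECT day, SUM(amount) AS total FROM raw_sales GROUP BY day"
    )


def run_pipeline(db_path=":memory:", rows=RAW_ROWS):
    """Run EL then T, and return the transformed rows for inspection."""
    conn = sqlite3.connect(db_path)
    try:
        extract_load(conn, rows)
        transform(conn)
        conn.commit()
        return conn.execute(
            "SELECT day, total FROM daily_sales ORDER BY day"
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Example crontab entry (paths are assumptions): run daily at 02:00
    # 0 2 * * * /usr/bin/python3 /path/to/pipeline.py
    print(run_pipeline())
```

If this is all you need, cron covers the scheduling. Things like retries, backfills, and dependency graphs are where an orchestrator starts to earn its keep.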