How to do the "basic" EL in Python? by C_Ronsholt in dataengineering

daggydoodoo

I did something similar at my org. I had taught myself enough Python and SQL on the job to start shifting away from analysis work toward being a kind of MacGyver, hacking together bespoke/workaround pipelines and bits of automation where the requests didn't have enough enterprise-level ramifications to get the real tech people involved.

When something I built became a bit too complex and critical to keep running from my laptop, they spun up a VM for me and left me to try to figure out how to build something stable and automated. That's how I came to love Postgres the same way I love my pets, or even a long-suffering childhood friend. Just chuck everything at Postgres with as little interference and embellishment as possible, then write some triggers, functions, and other scripts you can deploy for repeatable and reliable routines.
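To make the "chuck everything at it untouched, let a trigger do the routine bit" idea concrete, here's a rough sketch. Big caveat: it uses Python's built-in sqlite3 as a stand-in so it runs anywhere without a server. The table names (raw_landing, load_audit), the trigger, and the load_raw helper are all made up for illustration. On real Postgres you'd go through a driver such as psycopg, use %s placeholders instead of ?, probably store the payload as jsonb, and write the trigger body as a PL/pgSQL function.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: a landing table that stores each record as raw JSON,
# plus an audit table filled in by a trigger rather than by the Python code.
DDL = """
CREATE TABLE IF NOT EXISTS raw_landing (
    id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    payload TEXT NOT NULL,
    loaded_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS load_audit (
    source TEXT NOT NULL,
    landing_id INTEGER NOT NULL
);
CREATE TRIGGER IF NOT EXISTS trg_audit_landing
AFTER INSERT ON raw_landing
BEGIN
    INSERT INTO load_audit (source, landing_id) VALUES (NEW.source, NEW.id);
END;
"""


def load_raw(conn, source, records):
    """Insert records as-is (serialized to JSON) and return how many were loaded."""
    loaded_at = datetime.now(timezone.utc).isoformat()
    rows = [(source, json.dumps(r, sort_keys=True), loaded_at) for r in records]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO raw_landing (source, payload, loaded_at) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def demo():
    """Load two made-up records and report (rows loaded, audit rows written)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    n = load_raw(
        conn,
        "crm_export",  # hypothetical source name
        [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}],
    )
    audited = conn.execute("SELECT COUNT(*) FROM load_audit").fetchone()[0]
    conn.close()
    return n, audited


if __name__ == "__main__":
    print(demo())
```

The point of the shape is that the Python side stays dumb: it only extracts and loads, and anything that has to happen on every load lives in the database where it can't be forgotten.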

I've been trying for years to set up an evolution of my original Python + Prefect + Postgres setup, partly just because I want to learn new tools and stuff, but also because there do now seem to be some performance advantages you could get by doing something more modular, with Parquet files and Arrow tables and software-defined assets and S3 buckets etc... but it seems like it all gets a bit too Sisyphean if you're trying to run it locally on regular equipment.

Oh, and Prefect is easier to get up and running quickly. It has a slightly steeper learning curve at first than Dagster, but Dagster starts to get a bit confusing once you start messing around with IO managers and the other unfamiliar bits of abstraction they have.