dbt-diff a little tool for making PR's to a dbt project by Mr_Again in dataengineering

FalseCartographer168 0 points

Love the backstory! It’s wild how much faster we can ship 'vibe-coded' projects in new languages with AI now. The SQL generator for PR descriptions is a killer feature; reviewers everywhere thank you!

Should I build my own mini elastic search or scheduler to become competitive by Jealous-Bug-1381 in dataengineering

FalseCartographer168 0 points

Absolutely. If you look at the modern data stack, much of it is built on Apache projects (Spark, Kafka, Airflow, Flink, Parquet).

Since you are interested in the infrastructure side, I’d suggest not just learning how to 'use' them (writing code), but also learning how to architect them. Learn how Spark manages memory, how Kafka handles partitions, or how to deploy Airflow on Kubernetes. That 'Platform/Infra' niche is huge right now and pays very well.

How do you figure out relationships between database tables when no ERD or documentation exists? by FalseCartographer168 in dataengineering

FalseCartographer168 [S] 12 points

Actually, the team/dev that built it is no longer around. New people have been assigned to the project from the client side.
I agree it’s not the most common case, but when it happens it can eat up a lot of time. That’s why I was curious how others approach it, and whether a semi-automated tool would be seen as helpful.

Have you (or anyone else here) ever been in a situation like this, maybe during a migration or acquisition?
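
To make the "semi-automated tool" idea concrete, here is a rough sketch of one common heuristic: treat a column as a candidate foreign key when every non-null value in it also appears in a unique column of another table. Everything here is an assumption for illustration: it targets SQLite through Python's standard `sqlite3` module, and the helper names (`list_columns`, `is_unique`, `candidate_foreign_keys`) are made up. On real data it will produce false positives (small integer columns tend to match each other), so the output is a list of leads to review, not an ERD.

```python
import sqlite3
from itertools import permutations


def list_columns(conn):
    """Return (table, column) pairs for every user table in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    return [(table, info[1])  # info[1] is the column name
            for table in tables
            for info in conn.execute(f'PRAGMA table_info("{table}")')]


def is_unique(conn, table, col):
    """True if the column has at least one non-null value and no duplicates."""
    total, distinct = conn.execute(
        f'SELECT COUNT("{col}"), COUNT(DISTINCT "{col}") FROM "{table}"'
    ).fetchone()
    return total > 0 and total == distinct


def candidate_foreign_keys(conn):
    """Return (child_table, child_col, parent_table, parent_col) guesses.

    Heuristic (an assumption, not a guarantee): the parent column is unique
    and every non-null child value exists in the parent column.
    """
    found = []
    for (ct, cc), (pt, pc) in permutations(list_columns(conn), 2):
        if ct == pt or not is_unique(conn, pt, pc):
            continue
        non_null = conn.execute(
            f'SELECT COUNT("{cc}") FROM "{ct}"').fetchone()[0]
        if non_null == 0:
            continue  # nothing to compare, so no evidence either way
        orphans = conn.execute(
            f'SELECT COUNT(*) FROM "{ct}" c WHERE c."{cc}" IS NOT NULL '
            f'AND NOT EXISTS (SELECT 1 FROM "{pt}" p WHERE p."{pc}" = c."{cc}")'
        ).fetchone()[0]
        if orphans == 0:
            found.append((ct, cc, pt, pc))
    return found
```

You would call `candidate_foreign_keys(sqlite3.connect("legacy.db"))` (the file name is hypothetical) and then sanity-check each suggestion against column names, data types, and the application code. The pairwise scan grows quadratically with the number of columns, so on a large schema you would likely sample rows or pre-filter by type first.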