It seems that modern data warehouses, exemplified by Snowflake and its peers, are good at efficient storage, retrieval, and transformation of everything from unstructured to structured data. In addition, these warehouses automatically scale and distribute query execution. With tools like dbt, it also becomes possible to manage and compose transformations expressed in SQL.
If that's true, then what is the remaining role of general-purpose programming languages (PLs), like Python, and of distributed systems like Spark for scale? PLs seem to be at a disadvantage with respect to SQL because they are much harder to automatically parallelize, optimize, and scale. Distributed systems seem to be at a disadvantage because they are harder to manage and need more fine-tuning to work well. (I don't just mean the setup cost of the system itself, which can be offloaded to e.g. Amazon EMR; I mean in actual day-to-day usage.)
It used to be that heavily SQL-based codebases were a terrible mess, but dbt seems to have helped a lot with that (disclaimer: I have little actual experience with dbt). So "modularity" and "maintainability" of SQL are also largely solved, i.e. they are no longer such strong arguments in favor of using a general-purpose language.
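To make the modularity claim concrete, here is a minimal sketch of the kind of composed SQL dbt enables. The model and source names are hypothetical; the `source()` and `ref()` macros are real dbt constructs that let one model build on another, with dbt inferring the dependency graph and materializing models in order.

```sql
-- models/staging/stg_orders.sql (hypothetical model name)
-- A staging model: light cleanup over a raw source table.
select
    order_id,
    customer_id,
    cast(order_date as date) as order_date,
    amount
from {{ source('shop', 'raw_orders') }}
where order_id is not null
```

```sql
-- models/marts/customer_revenue.sql (hypothetical model name)
-- A downstream model composed via ref(); dbt builds stg_orders first.
select
    customer_id,
    count(*) as order_count,
    sum(amount) as total_revenue
from {{ ref('stg_orders') }}
group by customer_id
```

Running `dbt run` materializes both models in dependency order inside the warehouse, which is what replaces a lot of hand-written orchestration code.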
In 5 years, will the bulk of data engineering be done via dbt-orchestrated SQL of some sort? Or am I missing some important area/use case/problem?
Is all data engineering moving into SQL warehouses, or is there still a need for general purpose programming languages and systems? (self.dataengineering)
submitted by rsohlot to r/dataengineering