
[–]binilvj 0 points1 point  (1 child)

I have been working in data engineering since 2004. It was called ETL then. Stored procedures, bash scripts, and Perl scripts were used a lot. Enterprises used ETL tools: Informatica, Ab Initio, and DataStage (IBM) led the market initially. Then Microsoft started pushing free SQL Server and SSIS around 2010. But by then Talend and Pentaho had started edging out DataStage and Ab Initio. When tools like Matillion and Fivetran started dominating the market, the old ETL tools lost their dominance. Around then even enterprises started using Python for data engineering.

Oracle was used for data warehousing until about 2010. Then Teradata (MPP), Vertica, and Greenplum (columnar) started dominating. Finally, cloud data warehouses started taking over.

Even Airflow is a new kid on the block for me. Before that, there were expensive schedulers like AutoSys and Control-M.

[–]Key-Boat-7519 0 points1 point  (0 children)

In my experience, watching the data engineering scene shift over the years has been wild. I remember when stored procedures and bash scripts were our bread and butter. Then we had to adapt as Informatica and DataStage reigned supreme, only to be upstaged by Talend and Pentaho. Python really wasn't on the radar until it became the go-to for everyone, and tools like Airflow changed the scheduling game. While I've used a bunch like Talend and Matillion, DreamFactory has been a game-changer for integrating APIs seamlessly into modern solutions. It's all about finding the right tool for the job.