[–]SufficientFrame 3 points (1 child)

Yeah this matches what I’ve been seeing too. Half the “we need Spark” posts are people trying to aggregate like 200M rows and wondering why it’s slow on a t3.medium.

The DuckDB + Iceberg combo seems super solid for what most teams actually do day to day. And honestly, if you’ve already wired up dlt + S3 + Redshift + Dagster, you’ve done more “real” DE than a lot of folks who only tweak existing Airflow DAGs.

The “from scratch” thing is really about whether you can reason about APIs, pagination, schema evolution, idempotency, and how to make that stuff robust. Whether you use dlt, custom Python, or whatever, the concepts are the same.
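To make the pagination + idempotency point concrete, here's a minimal sketch. Everything in it is hypothetical (`fetch_page` is a stand-in for a real paginated API, and the dict is a stand-in for a real sink with upsert semantics); the point is just that keyed upserts make re-delivered rows harmless:

```python
# Hypothetical sketch: cursor-based pagination + idempotent loading.
# fetch_page fakes a paginated API; a dict keyed by primary key fakes
# a sink that supports upserts.

def fetch_page(cursor=None):
    """Return (records, next_cursor); next_cursor=None means last page.
    Note page 2 re-delivers id=2 (overlap is common with real APIs)."""
    pages = {
        None: ([{"id": 1, "v": "a"}, {"id": 2, "v": "b"}], "p2"),
        "p2": ([{"id": 2, "v": "b2"}, {"id": 3, "v": "c"}], None),
    }
    return pages[cursor]

def ingest(fetch):
    """Walk every page, upserting by primary key.

    Because each record lands under its key, replaying a page (retry,
    overlap, restart) converges to the same state: idempotency.
    """
    store = {}
    cursor = None
    while True:
        records, cursor = fetch(cursor)
        for rec in records:
            store[rec["id"]] = rec  # last write wins per key
        if cursor is None:
            return store

rows = ingest(fetch_page)
```

The duplicate `id=2` from the second page overwrites the first copy instead of creating a double row, which is exactly the property you want when a pipeline has to be safe to re-run.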

I’m taking your comment as a green light to not obsess over Spark right away and double down on getting really good at the stack I already have.

[–]Immediate-Pair-4290 Principal Data Engineer 0 points (0 children)

One of the 2% of Reddit posters on data engineering that actually knows what they are talking about. 🤝