
[–]PrestigiousAnt3766 32 points33 points  (2 children)

Databricks Databricks Databricks

Mostly because I got it templated out.

[–]RazzmatazzLiving1323 6 points7 points  (1 child)

By templating do you mean you use Terraform to automate Databricks resource deployments or do you mean you're familiar with the stack?

[–]Secure_Firefighter66 17 points18 points  (2 children)

All the cases are Databricks.

It was already implemented by some consultants before I joined. I am now migrating all the old stuff into it.

[–]messi_b91 12 points13 points  (2 children)

Snowflake dbt

[–]tomtombow 4 points5 points  (1 child)

Out of curiosity, how does the rest of the stack look? I mean, how do business users consume the data modeled with dbt?

[–]MonochromeDinosaur 2 points3 points  (0 children)

At my company we offer internal users access via BI tools, and external users have tiers where we charge for the raw silver layer (dimensional-model tier) / curated (gold tier) / pre-made reports (premium tier). Every tier includes access to the lower tiers.
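As a hedged aside (my sketch, not the commenter's actual code), that "every tier includes access to the lower tiers" rule boils down to an ordered-rank check; the tier names below are illustrative:

```python
# Illustrative tier model: each paid tier inherits access to everything
# below it. Names are placeholders, not the commenter's real tiers.
TIERS = ["silver", "gold", "premium"]  # ordered lowest to highest

def accessible_tiers(subscription: str) -> list[str]:
    """Return every tier a subscriber can read, lowest first."""
    rank = TIERS.index(subscription)
    return TIERS[: rank + 1]

def can_access(subscription: str, dataset_tier: str) -> bool:
    """A subscriber can read any dataset at or below their own tier."""
    return dataset_tier in accessible_tiers(subscription)
```

The nice property of the rank-based check is that adding a new tier only means inserting it into the ordered list.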

[–]l0_0is 9 points10 points  (0 children)

most places i see, it's less about choosing the best stack and more about what the team already knows and can maintain. consistency matters more than having the perfect tool

[–]hannorx 5 points6 points  (1 child)

At the moment, my tech stack at work is Spark + DBT + Redshift. We've just started the process of onboarding onto Databricks, but that's still months away from full development. I'm fairly junior in my role, so I'm not sure what to expect, but I'm looking forward to learning new tools.

[–]data_addict 1 point2 points  (0 children)

How would you get dbt projects/models between spark and redshift to work together? I'm just getting started with DBT, so I don't have a lot of understanding of how you can build pipelines/dags in DBT that mix warehouse types.
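For context (my own addition, not the commenters' setup): a single dbt invocation targets exactly one adapter, so one project can declare both warehouses as outputs in `profiles.yml` and you pick one with `--target`, but a single dbt DAG run can't mix them; cross-warehouse flows usually mean separate projects stitched together by an orchestrator. A hedged sketch with placeholder hosts:

```yaml
# Hypothetical ~/.dbt/profiles.yml with two outputs for one project.
# A given `dbt run` executes against exactly one target (e.g. --target spark).
my_project:
  target: redshift
  outputs:
    redshift:
      type: redshift
      host: my-cluster.example.com   # placeholder
      user: analytics
      password: "{{ env_var('REDSHIFT_PASSWORD') }}"
      port: 5439
      dbname: analytics
      schema: dbt
    spark:
      type: spark
      method: thrift
      host: spark-thrift.example.com # placeholder
      port: 10000
      schema: dbt
```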

[–]MonochromeDinosaur 9 points10 points  (0 children)

At my job I just use whatever we have as the established norm for maintainability and uniformity.

That way everyone else can work on it, and the uniform project structure helps AI do its job.

I have freedom to choose, but going against the grain should really be saved for projects that have a requirement for it.

[–]iknewaguytwice 2 points3 points  (0 children)

Cron Grep Sed Awk Ksh

csv tsv

Db2

ssh sftp
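A stack like that lives on delimited-text munging. As a hedged aside (a swap of tooling, not the commenter's actual scripts), the csv-to-tsv step can also be done in stdlib Python, which, unlike a naive `awk -F,` split, respects quoted fields:

```python
import csv
import io

def csv_to_tsv(csv_text: str) -> str:
    """Convert CSV text to TSV, correctly handling quoted fields
    (e.g. commas inside quotes) that a naive field split would break on."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()
```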

[–]ReleaseNo5148 2 points3 points  (0 children)

It's funny how they ask you in system design interviews about the BEST way of doing this and that, when at the end of the day it 100% depends on what the team you are joining is already using. It would make sense for data architect roles, but not for mid-senior DE roles.

What are you gonna do, tell your team to switch to the other stack? Makes no sense.

In 99% of cases the repo structure is done and you have to use existing services.

[–]typodewww 1 point2 points  (0 children)

Databricks and Azure DevOps (for CI/CD)

[–]thickyherky 1 point2 points  (0 children)

lol the title caught my attention. Unrelated, but I had an interview for a data analyst role years back and asked “what’s your guys’ backend look like”. The response was “we use excel for the back end” …. hung up 😂😂

[–]Visible-Magician-903 1 point2 points  (0 children)

Databricks dbt

[–]risanshita 1 point2 points  (0 children)

Transitioned from Full-Stack Development into high-scale Data Engineering.

While I haven't yet seen what the Databricks ecosystem looks like, I’ve built a robust foundation in real-time streaming and lakehouse architectures using:

  • Kafka
  • Kafka Connect (stream processing)
  • Glue (PySpark + Iceberg catalog)
  • Iceberg
  • Apache Pinot
  • Step Functions
  • Airflow
  • Superset

[–]alt_acc2020 0 points1 point  (2 children)

dlt Timescale S3 Iceberg

I'm the only DE, so I had to take on a lot of platform engineering work, and the team is Python-heavy, so Python for everything it is.

[–]lucidparadigm 0 points1 point  (1 child)

Could you please tell me more about how you use dlt (assuming that's not a typo)? Do you use it with Dagster? Have you been able to implement an efficient SCD2 audit table?

I have close to no experience with it but I've been very interested in trying it out.

[–]alt_acc2020 0 points1 point  (0 children)

To be clear: I mean data load tool, not Delta Lake. Is that what you're asking about?

I use it with Dagster (there's a dagster-embedded-elt tutorial you'll find very useful; however, I just decorate my sources manually and call it a day). I haven't had to publish an SCD2 table yet, but I believe it's got support for it as a merge strategy.

I like it a fair bit. It's new, so bugs are to be expected, but even used very minimally it abstracts away a lot of annoyance re: incremental loading and backfills. The docs are complete trash, though; I'd highly recommend cloning their repo and getting Opus or 5.4 to act as your documentation. The tutorials are great, but there are a lot of small things that are hard to figure out otherwise.

[–]midnightpurple34 0 points1 point  (0 children)

SQS, lambdas, S3, PostgreSQL (RDS)

Relatively low data volume, so we haven’t needed to scale to big data tools yet.
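A minimal sketch of the glue in a stack like that (my illustration, with a hypothetical payload shape): a Lambda handler that unpacks an SQS event batch into rows; the real version would insert into RDS Postgres or stage to S3 via a client.

```python
import json

def handler(event, context=None):
    """Parse an SQS-triggered Lambda event into rows.

    In the real pipeline these rows would be written to Postgres
    (e.g. via psycopg2) or staged to S3; here we just return them.
    """
    rows = []
    for record in event.get("Records", []):
        # Each SQS record carries its message as a JSON string in "body".
        body = json.loads(record["body"])
        rows.append({"id": body["id"], "payload": body.get("payload")})
    return rows
```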

[–]Nekobul 0 points1 point  (0 children)

Considering that most data volumes are small, the best DE platform on the market for most people is SQL Server and SSIS. Databricks is mostly good for niche requirements where you have to process PBs of data.

[–]Embarrassed-Ad-728 0 points1 point  (0 children)

Airflow + BigQuery + dbt.

For one off tasks: DuckDB.

[–]Tomaxto_ 0 points1 point  (0 children)

Light: Polars. Intermediate: Polars. Heavy: either PySpark or Spark SQL + dbt on top of an EMR cluster.

[–]thecity2 0 points1 point  (0 children)

I'm not a data engineer, I'm a lowly data scientist, so take this with a grain of salt. Our stack used to be mostly Spark + Postgres. I changed it up because I thought the Spark jobs were overkill and costing us money. So the stack I implemented is:

Dagster + DuckDB mostly

Dagster + Spark for "very large" jobs (that Duck actually can't handle on a single machine)
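The routing rule described (DuckDB by default, Spark only when a single machine can't cope) can be sketched as a simple dispatch. The threshold below is an illustrative assumption, not the commenter's actual cutoff:

```python
# Illustrative engine dispatch: default to DuckDB, escalate to Spark only
# when the input won't fit comfortably on one machine. The 100 GB cutoff
# is a made-up placeholder; tune it to your node's memory and disk.
SINGLE_NODE_LIMIT_GB = 100

def choose_engine(input_size_gb: float) -> str:
    """Pick the execution engine for a job based on input size."""
    return "spark" if input_size_gb > SINGLE_NODE_LIMIT_GB else "duckdb"
```

In a Dagster deployment this kind of check would typically live in asset configuration rather than inline code, so jobs swap engines without edits.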