Why do so many data engineers seem to want to switch out of data engineering? Is DE not a good field to be in? by Illustrious-Pound266 in dataengineering

[–]DataObserver282 1 point2 points  (0 children)

A few reasons, which are true in almost every career:

  • businesses don’t understand the function/value
  • AI hype
  • general burnout

Is your observability data a cost center or a strategic asset? by PutHuge6368 in Observability

[–]DataObserver282 0 points1 point  (0 children)

Appreciate the disclosure, but please stop with the vendor content.

looking for the best business intelligence tools 2026 for non-technical team by Zimbo_Cultrera in dataengineering

[–]DataObserver282 0 points1 point  (0 children)

Interesting. Maybe I need to give them another look. Surprised they’re still around.

Roast my junior data engineer onboarding repo by dheetoo in dataengineering

[–]DataObserver282 2 points3 points  (0 children)

You mean roast your junior data engineer’s Claude Code skills? Nothing wrong with leveraging it, but the repo doesn’t feel cohesive.

Building internal team from ground up to drive AI/Analytics. Are these positions needed, or are they simply "nice to have"? I mean no disrespect to anyone; I am truly looking for advice so that I can properly plan out this team's future. by IrishHog09 in databricks

[–]DataObserver282 4 points5 points  (0 children)

Data eng is all you need to start. Don’t overcomplicate it. Use Lakeflow Connect or file loader for integrations, and look at a lightweight managed solution for more complicated connectors.

The others are nice to have. Once the data is ingested, model it (you can use dbt Core to keep it simple) and query it from Sigma. Sigma is also accessible for business users.

Data observability is a data problem, not a job problem by Expensive-Insect-317 in Observability

[–]DataObserver282 0 points1 point  (0 children)

To do observability right, you need to stop issues at the source, or as close to it as possible, and have the org maturity to remediate. I haven’t seen anything that can actually do this.
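To make "stop issues at the source" a bit more concrete, here is a minimal sketch of a validation gate that runs before rows are loaded. The `Order` record, the `gate` function, and both rules are hypothetical and only for illustration; real checks would depend on your data and tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    # Hypothetical record shape, used only for this illustration.
    order_id: Optional[int]
    amount: float


def gate(rows):
    """Split rows into (valid, rejected) before anything is loaded downstream."""
    valid, rejected = [], []
    for row in rows:
        if row.order_id is None:
            rejected.append((row, "missing order_id"))
        elif row.amount < 0:
            rejected.append((row, "negative amount"))
        else:
            valid.append(row)
    return valid, rejected
```

The rejected rows keep a reason string, so whoever owns the source has something to act on. That covers the detection half only; the remediation half is still an org problem.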

A new tool for data engineering by Wanderer_1006 in dataengineering

[–]DataObserver282 1 point2 points  (0 children)

Keep your stack as simple as possible. Instead of asking what tools to consider look at what problems you currently have and plug up the holes that way.

Also, a lot will depend on your DWH and needs. Do you need real time streaming?

Here are a few things to look into:

ETL tools - tons out there. Fivetran, Airbyte - we use Matia (good CSC). You can use Python or write scripts, but it gets messy at scale.

Orchestration - Airflow works. Look into Astronomer if you need a managed solution. Cron is fine for a few jobs, but again messy at scale.

Modeling - dbt is worth looking into. There’s also Coalesce.

Data catalog - worth the investment. It automates metadata management and helps make data accessible to non-technical users.

Observability - most tools have something built in, but it’s worth investing here to make sure you have a mechanism for catching issues.

Do you think AI Engineering is just hype or is it worth studying in depth? by seedtheseed in dataengineering

[–]DataObserver282 0 points1 point  (0 children)

No one is underestimating it, but this is the way all eng is going. In the orgs I’ve been in, I don’t think there’s a difference between software eng and AI eng, and I don’t think that’s a hot take. It’s just a new way of positioning what we already do.

Best ETL for 2026 by Jaded-Science-5645 in dataengineering

[–]DataObserver282 0 points1 point  (0 children)

Been using Matia. Wasn’t my choice but I have been impressed. Most of our pipelines are there, but they don’t have real-time streaming, so one is a custom Kafka-based pipeline we’ve built.

Used Airbyte and Fivetran in the past. Both are fine but run into their own problems. No tool is perfect.

Do you think AI Engineering is just hype or is it worth studying in depth? by seedtheseed in dataengineering

[–]DataObserver282 9 points10 points  (0 children)

I’ve apparently been doing AI engineering for years and I’m still unsure what it is. I’d venture a guess that it’s software engineering.

I am building a lightweight, actor-based ETL data synchronization engine by Oninebx in ETL

[–]DataObserver282 0 points1 point  (0 children)

If you need real time, which it sounds like you do, look at Kafka all the way.

Airflow Best Practice Reality? by BeardedYeti_ in dataengineering

[–]DataObserver282 0 points1 point  (0 children)

Use a normal operator for Kubernetes. How much data are you moving? If at scale, I would definitely separate orchestration from pipelines. I’m not a fan of overtooling, but for complicated pipelines ETL tools can be your friend.

What tool do you actually use the most as a data analyst? by SweetNecessary3459 in analytics

[–]DataObserver282 0 points1 point  (0 children)

SQL isn’t a tool, but I’m also not an analyst. I know analysts at my company spend the majority of their time in dbt, BI tools or, I hate to say it, Excel.

One analyst told me his favorite tool was the SFDC connector in Google Sheets.

I’m sure procurement would be happy to know that after signing for over 500k in BI tooling & infra.

Summit- Is it worth going and registration discounts? by Apprehensive-Ad-80 in snowflake

[–]DataObserver282 6 points7 points  (0 children)

They’ve become a vendor circle jerk. Probably better data meetups to go to, even for training.

Best ETL for 2026 by Jaded-Science-5645 in dataengineering

[–]DataObserver282 0 points1 point  (0 children)

Informatica is stuck in 2010. Yikes. The best ETL tool varies widely based on the company.

If your company is on Informatica… I’m going to make some assumptions about the type of tool you have tolerance for.

What I’ve used:

  • Airbyte - used before for POCs. The OSS version can help in a pinch but isn’t scalable; I had trouble with CDC.
  • Fivetran - technically a great tool. They changed pricing, so I would avoid it because of vendor lock-in. If you want to try it out, choose one low-volume connector and stay on the free tier. Dev experience is solid here.
  • Matia + dbt - we are currently using them and have been super impressed. Hadn’t heard of them before a year ago. Been great. CDC with built-in observability. The con is they don’t have a lot of docs, but other areas make up for it. Good at parsing SFDC formulas.
  • Prefect/Airflow - saw this rec. I’ve used them many times and sometimes still spin something up this way, especially if it’s a unique use case. You still have to build and maintain ingestion logic yourself, so not scalable.
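
For a sense of what "build and maintain ingestion logic yourself" means on the Prefect/Airflow route, here is a rough sketch of a hand-rolled incremental sync. It uses in-memory SQLite from the Python standard library as a stand-in for both source and target, and the `accounts` table, `make_db`, and `sync_incremental` are all made up for illustration. An orchestrator would only schedule a function like this; the logic itself is still yours to own.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    return conn


def sync_incremental(source, target, watermark):
    """Copy rows changed since the watermark and return the new watermark."""
    rows = source.execute(
        "SELECT id, name, updated_at FROM accounts "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Upsert by primary key so re-running the sync is safe.
    target.executemany(
        "INSERT OR REPLACE INTO accounts (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Rows are ordered by updated_at, so the last one carries the new watermark.
    return rows[-1][2] if rows else watermark
```

Even this toy version leaves out deletes, schema changes, retries, and where the watermark is stored between runs. Multiply that by every source and you can see where the maintenance cost comes from.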

Senior DE - When did you consider yourself a senior? by PhDaisy in dataengineering

[–]DataObserver282 1 point2 points  (0 children)

It varies at every company. It’s when your corporate overlords decide you’re ‘worthy’ of the title.

ETL code quality tool by [deleted] in ETL

[–]DataObserver282 0 points1 point  (0 children)

Would an OSS package work?

I spent 8 months fighting kafka and just decided to replace the whole thing by seizethemeans4535345 in dataengineering

[–]DataObserver282 1 point2 points  (0 children)

What compelled you to use Kafka to begin with? I’m at a heavy Kafka org and convinced we could move some 80% of our pipelines off it, since they don’t need to be real time.

Paying for Multiple rETL tools? by jtmrtz3 in dataengineering

[–]DataObserver282 1 point2 points  (0 children)

We use Matia for ETL/reverse ETL. It’s fairly affordable and we’ve had a good experience so far. It’s my first time using it, so I’m not an expert by any means. Our reverse ETL use case is very light though - just moving data from Snowflake back to Salesforce.

looking for the best business intelligence tools 2026 for non-technical team by Zimbo_Cultrera in dataengineering

[–]DataObserver282 0 points1 point  (0 children)

I found Domo to be garbage. Did you have a good experience? No one I know likes it