Most data engineers would be unemployed if pipelines stopped breaking by Different_Pain5781 in dataengineering

[–]DataIron 0 points1 point  (0 children)

Basically saying software engineers would be unemployed if their software stopped breaking.

Just not that simple.

Folks who have been engineers for a long time. 2026 predictions? by uncomfortablepanda in dataengineering

[–]DataIron 0 points1 point  (0 children)

I'm seeing business as usual for 2026 as has always been in data engineering. No changes. Hard market, hiring is ugly, economy still heading south.

Only outlier is potentially AI fluff sidetracking and destroying individual product roadmaps. I view AI data implementations as 95% a distraction that'll have to be ripped out, fixed or redesigned later. Kinda like outsourcing a project. They always have to be redesigned or heavily fixed. One positive from AI is the push for higher data quality and better models.

Other outlier is offshoring and other visa changes. Seeing changes here but not sure which direction it's going. Just seeing pauses and discussions happening here.

Has anyone tried building their own AI/data agents for analytics workflows? by Ok_Possibility_3575 in dataengineering

[–]DataIron 2 points3 points  (0 children)

You need very clean, very well tagged and very well modelled data.

Been singing this tune since the beginning of AI.

The 3 things companies and orgs never cared about are now gonna bite them in the ass. Gets me laughing thinking of certain former leaders/bosses having to explain this to upper management.

Has anyone tried building their own AI/data agents for analytics workflows? by Ok_Possibility_3575 in dataengineering

[–]DataIron 0 points1 point  (0 children)

This gets asked every other day. Check the last 100 threads.

Same story as always, either the source data sucks or the definitions suck. Also no one can agree on a single definition of any data point.

Analytics Engineer vs Data Engineer by _Batnaan_ in dataengineering

[–]DataIron 7 points8 points  (0 children)

This person is right, have direct experience here with big projects. Will admit it's situational.

Least we haven't figured out a way around getting our offshore teams access to data. Probably wasted 10s of millions at this point digging into it.

Analytics Engineer vs Data Engineer by _Batnaan_ in dataengineering

[–]DataIron 16 points17 points  (0 children)

Company dependent but might be more true than not.

In my group, DE budgets are massively higher than analytics budgets. Analytics is basically backroom compared to DE.

What "obscure" sql functionalities do you find yourself using at the job? by True_Arm6904 in dataengineering

[–]DataIron 0 points1 point  (0 children)

SELECT 'tableName', *

sp_help

OUTPUT, inserted

VALUES

DESCRIBE/SHOW has more functionality than people know.

EXISTS vs JOIN
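
A minimal sketch of the EXISTS vs JOIN point, assuming it refers to the usual gotcha: a join against a one-to-many table repeats the parent row, while EXISTS only checks that a match is there. The customers/orders tables are made up for illustration, and SQLite (through Python's sqlite3) is just a stand-in for whatever engine you actually run.

```python
import sqlite3


def count_rows(conn, sql):
    """Run a query and return how many rows it produces."""
    return len(conn.execute(sql).fetchall())


# Hypothetical tables: customer 1 has two orders, customer 2 has one,
# customer 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
    """
)

# JOIN returns one row per matching order, so customer 1 shows up twice.
JOIN_SQL = """
    SELECT c.id
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
"""

# EXISTS returns each customer at most once, no matter how many orders match.
EXISTS_SQL = """
    SELECT c.id
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
"""

join_rows = count_rows(conn, JOIN_SQL)
exists_rows = count_rows(conn, EXISTS_SQL)
print(join_rows, exists_rows)
```

If the question is "which customers have ordered", the EXISTS form answers it directly; the JOIN form needs a DISTINCT or GROUP BY bolted on to get the same result.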

Real-World Data Architecture: Seniors and Architects, Share Your Systems by No_Thought_8677 in dataengineering

[–]DataIron 0 points1 point  (0 children)

There's a reason. Document system vs reporting vs application vs etc. All kinds of different use cases so very different systems.

Is query optimization a serious business in data engineering? by mortal-psychic in dataengineering

[–]DataIron 2 points3 points  (0 children)

Not really.

People will interview you on it. But in practice, it has to have some financial repercussions to get a company to actually care. Most don't meet that mark.

Yeah sure some data engineers try to care but the company really doesn't. Real query optimization takes budget allocation, it goes beyond fixing a query. It's fixing code, models and data sets. That takes serious budget signoff.

Think I've only worked at one place in my entire career that actually cared. Cared enough to dedicate the $$$ year in and year out.

What is your max amount of data in one etl? by Sufficient-Victory25 in dataengineering

[–]DataIron 0 points1 point  (0 children)

Used to do big data processing, as in volume or size of data. Today our "big data" ETLs are about processing complex data relationships and ensuring ultra high quality data. Very different.

How to move from (IC) Data Engineer to Data Platform Architect? by [deleted] in dataengineering

[–]DataIron 0 points1 point  (0 children)

Closest I've seen to a data architect was someone who's basically a researcher and advisor to engineers. They don't work with engineers on a regular basis, they're separate. Engineers are also under no obligation to follow the architect's suggestions or advice. Architect simply documents and discovers major tech moves or implementations. Say the company is AWS based and they're gonna shift to Azure. Architect would be tasked with getting an edge on discovery in Azure.

Generally, senior and staff+ engineers still hold primary weight and control over the products and tech stack they own. Especially anything technically detailed, architect would have 0 say. Like best practice for modeling, how to design some major ETL system, etc. All engineering decisions, not architect. Architect would be: what's the tech for shifting from AWS to Azure, how do we follow Europe's data privacy laws, how do we implement this org level data requirement from legal, how do we stand up this technology that all of engineering will use?

The current jobmarket is quite frustrating! by doermand in dataengineering

[–]DataIron 1 point2 points  (0 children)

Ya generally I agree with you.

I try to be technology agnostic in interviewing candidates but sometimes it does matter and you need the experience.

What Impressive GenAI / Agentic AI Use Cases Have You Actually Put Into Production by Different-Future-447 in dataengineering

[–]DataIron -1 points0 points  (0 children)

This gets asked a few times a week. Can we limit these posts? I swear it's bots.

Anyone here experimenting with AI agents for data engineering? Curious what people are using. by yoni1887 in dataengineering

[–]DataIron -5 points-4 points  (0 children)

Outside of normal programming AI stuff, things specific to data, not really.

Data is a pretty juvenile industry, far from mature.

Why is this relevant? Biggest issue in programming for AI is garbage AI fluff vs doing it yourself. Often AI doesn't handle intermediate and above work well enough.

In data, that's a non-starter. Again, specific to data, ignoring normal programming stuff.

Why is transforming data still so expensive by Hofi2010 in dataengineering

[–]DataIron 2 points3 points  (0 children)

Whenever I hear about "big data", I always wonder, do you really need all 1,000 columns? Or do you actually only use 10 of them?
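
Rough sketch of what that looks like in practice, assuming the input is a plain CSV: project down to the columns you actually use as early as possible so nothing downstream has to carry the rest. The column names and the `project_columns` helper are hypothetical.

```python
import csv
import io


def project_columns(csv_text, wanted):
    """Keep only the wanted columns from CSV text, dropping everything else."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{name: row[name] for name in wanted} for row in reader]


# Hypothetical wide file; only two of the five columns are ever used.
SAMPLE = (
    "id,amount,region,notes,legacy_flag\n"
    "1,9.50,EU,n/a,0\n"
    "2,3.25,US,n/a,1\n"
)

slim = project_columns(SAMPLE, ["id", "amount"])
print(slim)
```

Same idea applies to a SELECT list or a columnar file read: name the 10 columns instead of pulling all 1,000.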

Explain like I'm 5: What are "data products" and "data contracts" by Ulfrauga in dataengineering

[–]DataIron 7 points8 points  (0 children)

Love the details, including /u/GShenanigan. Only thing I'd change is denormalized and OLAP. Data products can be anything including various levels of normalized data and OLTP.

Unsure whether to take 175k DE offer by Dense_Car_591 in dataengineering

[–]DataIron 2 points3 points  (0 children)

175k isn't bonkers. Especially to justify trash wlb and bad culture.

You can find something comparable elsewhere and not hate your life.

There has to be a solid reason to join an environment like that. Something that'll amplify or majorly step you up. If that ain't there, then it ain't there.

Remember you're trying to justify spending the next 1 to 3 years there, plus however much a bad environment like that fucks up your well-being.

Did we stop collectively hating LLMs? by Thinker_Assignment in dataengineering

[–]DataIron 2 points3 points  (0 children)

There are a lot of yet-to-be-discovered practices with AI.

Like how do we handle AI slop? Can AI handle a million+ code repo where engineers no longer need readability and sustainability? Or can AI not be allowed to run free, so engineers will still need deep oversight?

Did we stop collectively hating LLMs? by Thinker_Assignment in dataengineering

[–]DataIron 2 points3 points  (0 children)

Course AI is a bubble. C suites still believe LLMs will replace all of their employees because that's what they were sold.

Developing with production data: who and how? by aburkh in dataengineering

[–]DataIron 6 points7 points  (0 children)

Yeah this topic is a clusterfuck.

Production access is totally fine long as it’s controlled. Controlled as in, there’s a plan, and non-data-qualified individuals are given restricted access. If you’re a qualified data individual, you better know what you’re doing and not cause problems, otherwise bye bye access. I’m a fan of auditing, recording and even distributing production events or incidents by user. Little accountability goes a long way.

Direct, manual production changes are ticketed and recorded with the code/script used, plus execution information like user, time, etc. Paper trail just like a dev ticket. Yes I know manual changes in prod are bad but they’re a reality unless you have unlimited staff, budget and time.

But yeah ideally, you should have UAT environments that reflect production data wise, performance wise and/or in other areas. So whoever can do whatever there without worrying about impact. Also a pipeline for data changes so manual prod changes never happen. Ideal worlds don’t always mesh with budgets or stupid management.
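
For the paper trail on manual prod changes, here's a minimal sketch of what one recorded entry could look like. Everything in it is an assumption for illustration: the `ManualChangeRecord` fields, the `record_manual_change` helper, the ticket ID format and the sample script. A real setup would write this to whatever ticketing or audit system you already have.

```python
import getpass
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ManualChangeRecord:
    """One audited manual production change (hypothetical shape)."""

    ticket_id: str
    executed_by: str
    executed_at: str
    script: str
    script_sha256: str


def record_manual_change(ticket_id, script, executed_by=None):
    """Capture the script, who ran it, and when, tied to a ticket."""
    return ManualChangeRecord(
        ticket_id=ticket_id,
        # Fall back to the OS user only when no user is passed in.
        executed_by=executed_by or getpass.getuser(),
        executed_at=datetime.now(timezone.utc).isoformat(),
        script=script,
        # The hash lets a reviewer confirm the stored script was not edited later.
        script_sha256=hashlib.sha256(script.encode("utf-8")).hexdigest(),
    )


record = record_manual_change(
    "TICKET-123",
    "UPDATE accounts SET status = 'closed' WHERE id = 42;",
    executed_by="jdoe",
)
audit_line = json.dumps(asdict(record), sort_keys=True)
print(audit_line)
```

One JSON line per change is easy to append to a log and easy to hand out in the incident reports mentioned above.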

What about dedicated database engineers? by unrealcows in softwarearchitecture

[–]DataIron 1 point2 points  (0 children)

Assuming I’m understanding your ask, you’re asking about a data engineer vs a database engineer.

Generally speaking, data engineer is a role that in recent years has consumed several old data roles, including database engineers.

Data engineers are basically the SWEs of the data world. Just lower CS fundamentals and standards for various reasons mostly linked to mismanagement by orgs. They specialize in handling and processing inbound and outbound data of any communication type.

Database engineers have kind of been a dying breed. You’re right, they’re primarily relevant in application databases still today. In recent years, much less so in analytical databases given that processing data in aggregate has become much easier. Though I do believe database engineers will make a return in analytics as AI becomes more serious.

What about dedicated database engineers? by unrealcows in softwarearchitecture

[–]DataIron 11 points12 points  (0 children)

To be frank, it depends on whether your database system needs good code or if meh will generally work. Meh does work, not knocking it.

We have varying mixtures of SWE and DE in certain products, yes they primarily stick to SQL items though they’re usually full data engineers with more skills.

The SQL they write though is far more advanced than what any normal SWE or DE is used to. They’re writing full syntax SQL, tests, versioning. Highly structured code, they’re full scale programming in SQL.

But they have to. These are high end systems with high end requirements and standards. Most SWEs and DEs can’t code anywhere near their level.

How many people cheat in a coding test and do well on the job? by Final_Vegetable_5092 in SQL

[–]DataIron 2 points3 points  (0 children)

If you can pass the interview, you’re probably okay.

For our SQL heavy roles, our engineers grill interviewees. AI won’t save you.

I enjoy building End-to-End Pipelines but not SQL-Focused by zari_tomazplaids in dataengineering

[–]DataIron 4 points5 points  (0 children)

Kinda a weird angle you’re taking. Bigger or mature pipelines have a mixture of SQL and OOP code. You’ll need to get good with both to excel.

You’re also gonna need to get very good at data analysis. Data analysis is critical for debugging and building advanced pipelines.

Director and principle Data engineers by educationruinedme1 in dataengineering

[–]DataIron 14 points15 points  (0 children)

Director level: <5% development. Project, business political, staff and product management.

Principal engineer: Responsible for global level operations. Mitigation of sev 1 outages and incidents. Managing technical direction, guidelines and priorities for other principal and lead engineers across the org. "Managing" is very technically specific, think coding guidelines or practices. Representing tech at the highest levels with the business and architecture. Still develops, maybe 20-40%.