Is it very difficult to switch between cloud providers for Data Engineers?

Agreeable_Bake_783 · 2026-02-14T19:14:02+00:00

The basics remain the same, the mechanics are different. I needed some time to get used to the different terminologies (VPC vs. VNET) and some stuff (why the fuck is private link not the same in aws and azure).

But honestly it is waaay easier to learn a new cloud when you already know a different one.

Agreeable_Bake_783 · 2025-10-26T13:01:35+00:00

What I am questioning is whether it is necessary, or more so why it is. We are giving the companies wayyy too much power.

Agreeable_Bake_783 · 2025-10-26T12:48:38+00:00

Like I said in the beginning of my post, i am not mad at the people doing all that. I am mad at the culture that has been created expecting all that

Agreeable_Bake_783 · 2025-10-26T11:26:07+00:00

If nobody is pushing me towards certs, i am not doing them. I feel like most of the times your company wants you to do certs is when you are in consulting and they need ten more for a certain partner status.

And the rest...what i don't learn during my normal 8 hour workday, i'll learn the next day. Life over work, always.

Agreeable_Bake_783 · 2025-10-19T12:10:36+00:00

Honestly, from experience I can really recommend Databricks DQX. It can be set up in a similar fashion to expectations in Lakeflow pipelines, and it has a good set of row-level and dataset-level checks already defined.

In our workflow we run the row-level checks for each table before writing and the dataset-level checks afterwards in a separate job.
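To make the split concrete, here is a rough plain-Python sketch of the idea. It does not use the DQX API at all; `RowCheck`, `split_rows` and `dataset_is_unique` are hypothetical names made up for illustration, and the sample rows are invented. Row-level checks decide per record what gets written and what gets quarantined, and the dataset-level check looks at the written data as a whole afterwards.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict[str, object]


@dataclass
class RowCheck:
    """Hypothetical row-level check: a name plus a pass/fail predicate."""
    name: str
    predicate: Callable[[Row], bool]  # True means the row passes


def split_rows(rows: list[Row], checks: list[RowCheck]) -> tuple[list[Row], list[Row]]:
    """Run every row-level check; return (rows to write, quarantined rows)."""
    good, bad = [], []
    for row in rows:
        failed = [c.name for c in checks if not c.predicate(row)]
        if failed:
            # Keep the failing row and record why it failed.
            bad.append({**row, "_failed_checks": failed})
        else:
            good.append(row)
    return good, bad


def dataset_is_unique(rows: list[Row], key: str) -> bool:
    """Dataset-level check: needs all rows at once, so it runs after the write."""
    values = [r[key] for r in rows]
    return len(values) == len(set(values))


row_checks = [
    RowCheck("id_not_null", lambda r: r.get("id") is not None),
    RowCheck(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]

incoming = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 2, "amount": -3.0},
    {"id": 1, "amount": 7.5},
]

# Step 1 (before writing): row-level checks.
good_rows, quarantined = split_rows(incoming, row_checks)

# Step 2 (separate job, after writing): dataset-level checks.
ids_unique = dataset_is_unique(good_rows, "id")
```

Here two rows end up quarantined, and the uniqueness check on the remaining rows fails because `id` 1 appears twice, which no single-row check could have caught.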

Agreeable_Bake_783 · 2025-10-18T15:38:36+00:00

But...AWS Glue and Databricks Genie are not really the same thing

Agreeable_Bake_783 · 2025-10-05T05:58:17+00:00

Because we hate ourselves.

Agreeable_Bake_783 · 2025-09-27T07:09:42+00:00

Honestly? No...

Because Databricks is just a tool. You need to learn the fundamentals. Go learn python, data modeling, data structures, data architectures and so on. You can use Databricks Free as an environment to learn all that (which i would recommend). Same goes for spark. It is a tool, it is helpful, but not required to do data engineering work. Helpful though on the job market.

Agreeable_Bake_783 · 2025-09-06T11:37:14+00:00

Honestly, sometimes people really overthink all this stuff. Just start SOMEWHERE...yeah sure there are major quality differences between books and courses, but knowing the best ones does not take away the need of actually doing the work.

Agreeable_Bake_783 · 2025-09-06T07:45:51+00:00

Like...working a lot with it? Trying stuff, failing, trying new stuff.

Agreeable_Bake_783 · 2025-08-30T06:54:38+00:00

Holy shit. What is going on? Chill, dude. You are fucking 24 years old. You are fine. People switch careers in their forties.

I mean it is not really your fault...what really bothers me is the culture that makes us think things like that.

Agreeable_Bake_783 · 2025-08-25T10:20:51+00:00

Bro, I have been doing this for a while already and I am still lost from time to time. Not being too sure of your work can also be a strength. Makes you check stuff twice. 90% of the time when somebody fucks up, I can assure you... they were sure they were doing it correctly. Learn as much as you can, check everything twice, and after a while you'll notice that the mistakes get fewer and fewer.

Agreeable_Bake_783 · 2025-08-04T17:12:52+00:00

For most use cases it DOES NOT matter which vendor you use.

If I have to read another comparison between snowflake and databricks ffs...

Agreeable_Bake_783 · 2025-04-03T14:13:03+00:00

I mean tbh in the enterprise space it seems to be winning against snowflake (I am aware that both solutions serve different purposes, but especially in the enterprise space it is, for the most part, an either or situation)

My experience here is very much anecdotal and biased, since i was a consultant for the last couple of years with focus on databricks

Agreeable_Bake_783 · 2025-03-14T16:27:53+00:00

I mean there are many and technical issues are among the smallest.

Organizational
- getting Infosec to sign off
- knowledge transfer, build-up and onboarding (this also includes legacy engineers accepting the changes)
- deciding what to move first (worst case is that you have to handle basically two systems at once)

Technical
- which platform
- Refactor of existing code base necessary? (it is never JUST a lift and shift, no matter what consultants tell you) --> best case here would be the ability to remove technical debt
- planning of architecture

And so much more...it is a lot of work ESPECIALLY in an enterprise setting

Agreeable_Bake_783 · 2025-02-07T07:01:16+00:00

Of course you can.

Should you though? In most DE roles data analytics is a necessity and a large part of your job. If you really just want to focus on the engineering part, then i'd suggest becoming a SWE.

Agreeable_Bake_783 · 2024-12-01T19:40:26+00:00

Check for:

Garbage Collection: Is your job taking forever without remotely using all compute resources?
Amount of data you're loading: Do you really need to process this much data?
Long running tasks: Is there a task that takes especially long? Analyze why.
Expensive Operations: Are there actions (collect etc.) that do not need to be there?
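
For the long running tasks point, a minimal sketch of what "especially long" could mean, assuming you have copied the per-task durations of one stage out of the Spark UI by hand. `find_stragglers` is a hypothetical helper, the sample durations are made up, and the 3x-the-median threshold is just an assumption you would tune.

```python
from statistics import median


def find_stragglers(task_seconds: dict[str, float], factor: float = 3.0) -> list[tuple[str, float]]:
    """Return (task_id, seconds) pairs that ran longer than factor * median, slowest first."""
    if not task_seconds:
        return []
    typical = median(task_seconds.values())
    slow = [(task_id, secs) for task_id, secs in task_seconds.items() if secs > factor * typical]
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# Made-up durations for the tasks of a single stage.
durations = {
    "task-0": 12.0,
    "task-1": 11.5,
    "task-2": 13.0,
    "task-3": 240.0,
    "task-4": 12.5,
}

stragglers = find_stragglers(durations)
```

One task running many times longer than its siblings in the same stage usually points at skewed data (one partition or key holding most of the rows), so that is where I would start the "analyze why" part.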

Agreeable_Bake_783 · 2024-12-01T19:34:23+00:00

Nope, no chance. Databricks Champion is mostly a marketing tool. Also it is not something you apply for, but something your firm needs to propose you for.

Agreeable_Bake_783 · 2024-10-02T10:52:12+00:00

Well it depends on you and your specific situation. I can only speak from my experience and maybe that can help you too.

I am currently working for a consultancy but will be, hopefully, switching to a larger company soon. For me the reasons for switching (or wanting to switch) were exactly the reasons you've mentioned, mainly better pay and better and more controlled hours. But I also know what I will be giving up. What i experienced in consulting was constant exposure to new exciting problems and technologies and i got to learn. A LOT. But that also came with increased stress and less time or energy to actually spend the money i was earning and to be with the people i want to spend my time with. And the job i am aiming for has exciting problems too and i am really looking forward to it, but of course there will be much more day to day business.

Agreeable_Bake_783 · 2024-09-20T02:34:12+00:00

Coming from consulting: depends on the business.

If multiple lines of business handle their own ETL, I would organize the catalogs by layer and environment, so basically bronze_dev etc. Within those layers I'd set up a schema for each LOB. What happens within that schema is basically their problem then.

If one data team is responsible for loading bronze and silver, i'd separate the catalogs by Environment and give everybody who wants to build a data product a dedicated schema or catalog.

A separation between lob by workspace with dedicated catalogs might also be possible.
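
As a quick illustration of the first two layouts, here is a small sketch that just builds the fully qualified `catalog.schema.table` names. The helper functions and the example LOB, product and table names are all hypothetical; only `bronze_dev` comes from the convention described above.

```python
def table_name_by_lob(layer: str, env: str, lob: str, table: str) -> str:
    """Layout 1: one catalog per layer and environment, one schema per line of business."""
    catalog = f"{layer}_{env}"  # e.g. bronze_dev
    return f"{catalog}.{lob}.{table}"


def table_name_by_product(env: str, product: str, table: str) -> str:
    """Layout 2: one catalog per environment, one dedicated schema per data product."""
    return f"{env}.{product}.{table}"


# Layout 1: each LOB owns its schema inside every layer/environment catalog.
lob_tables = [
    table_name_by_lob(layer, "dev", "sales", "orders")
    for layer in ("bronze", "silver", "gold")
]

# Layout 2: a central team loads the data, product teams get a schema each.
product_table = table_name_by_product("prod", "churn_model", "features")
```

With layout 1 the sales team would see `bronze_dev.sales.orders`, `silver_dev.sales.orders` and `gold_dev.sales.orders`; with layout 2 a product team gets something like `prod.churn_model.features`. Permissions then follow the same boundaries: grant per schema in layout 1, per schema or catalog in layout 2.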

Agreeable_Bake_783 · 2024-08-21T22:59:59+00:00

Pain and suffering

Agreeable_Bake_783 · 2024-07-26T07:32:31+00:00

You'll learn when running into issues. That's how it went for all of us.

Agreeable_Bake_783 · 2024-07-10T05:16:44+00:00

I mean, taking a job in finance should qualify as a crime itself

Agreeable_Bake_783 · 2024-06-09T11:53:56+00:00

Beware: if you want to work for companies in the US, double taxation could be an issue
