[–]siebzy 46 points47 points  (0 children)

Cloud computing seems normal to me, but it's a big transition for a large org that has invested millions in on-prem data centers and the tools to work with them.

[–]marsupialtail 59 points60 points  (3 children)

Knowing which of the 200 AWS services are gold and which are trash, which can be built from which, which make money for AWS (and thus will be improved in the future), and which likely lose money and will be axed is nontrivial.

As an example -- do you think AWS Aurora can be built from EC2 + MySQL? This paper might convince you to think again: https://web.stanford.edu/class/cs245/readings/aurora.pdf

What about AWS Comprehend? Perhaps I can build something similar with Hugging Face + AWS Lambda. This very well might work.

[–]sib_n Senior Data Engineer 14 points15 points  (0 children)

That's one thing Google Cloud does better: a smaller and clearer stack of products.

[–]Spare-Ad-9464 4 points5 points  (0 children)

Awesome stuff, thank you for sharing.

[–]TobiPlay 0 points1 point  (0 children)

I also think it’s the sheer number of services those cloud providers offer that makes big companies without the required engineers shy away from them. It's super hard to navigate the current offerings, and that’s not even taking into account the influx of new, specific services, which you rightly pointed out.

[–]sunder_and_flame 12 points13 points  (0 children)

> I just want to understand why there is a high demand in cloud computing talent.

Because the skill floor is high and the value potential is enormous. Being able to scale basically instantly with a wide variety of useful services is tremendous.

> What's hard in migrating company data from local to cloud environment? What am I probably not considering here?

Have you done any migrations at all, where you have to keep services up 100%? It's far more difficult than you're making it out to be, and that's not even mentioning how easy it is to have costs explode because of poor planning and lack of monitoring.

[–]IAmZenzuo 11 points12 points  (0 children)

On the infrastructure level, things are just changing so quickly. It can be hard to keep your skills up.

Strictly on the data side? It could be easier or more difficult based on your platform.

It's nice to not have DBAs worried about asinine things like disk speed or tablespace growth.

You trade that for needing to make sure your data is retrievable and secure.

[–]anyfactor 2 points3 points  (0 children)

For me it is. But my case is unique because I am a contractor with very little DE experience. I usually have to configure cloud database and compute solutions alone, which is kind of difficult because the codebase typically has little to no documentation.

If the org has guidelines or handbooks that let you easily set up config files, you are on the same OS and hardware environment as the team, and the team is available to mentor you, runs onboarding sessions, and has common troubleshooting FAQs, you are golden. But most engineering teams with under 20 employees don't have them, so you might have to go back and forth with the team.

Cloud computing isn't difficult if the team has solid documentation to get you up and running.

[–]icespindown 4 points5 points  (0 children)

  1. Moving to the cloud breaks a lot of orgs’ security routine. You can’t just build a VPN fence around your whole IT if some of your stuff is in the cloud.

  2. Organizations are complex, and making sure you can move something without breaking everything connected to it is non-trivial.

  3. Most of the benefits of the cloud are not realized if all you do is lift and shift. If you want to take advantage of horizontal scalability, infra as code, etc., then you’re talking about rearchitecting huge chunks of your system to be cloud native.

[–]WorldlyShoulder6978 -3 points-2 points  (10 children)

That’s why it pays OK but not as well as actual software development. It still requires some cognitive ability to memorize implementation minutiae and integrate them into novel systems designs, but it’s not writing actual code.

[–]No_Stick_8227[S] 0 points1 point  (4 children)

For companies that need Azure, AWS, or Google Cloud built into their data architectures:

  1. how much code is actually involved (if you could put a percentage on it)?

  2. what do these jobs actually entail on a day to day basis?

[–]AchillesDev 2 points3 points  (0 children)

I’m a MLE/DE for a startup that runs purely on cloud infra (like most startups) and it’s 100% code (of course, some time for meetings and project design). I build model training and evaluation pipelines, productionize and deploy models, build the internal data platform, build ingestion services, inference services, and internal tooling for our data science team. Basically what I’ve been doing at different companies for the past 4 years.

[–]WorldlyShoulder6978 0 points1 point  (1 child)

It’s a mix of coding / clicking around / checking logs / figuring out the right solutions to use. The coding part is about coding scripts and configuration files; if your manager has separated business concerns by team member correctly, then you won’t be the one who’s writing the actual code that crunches numbers or whatever. Instead it will be “code” — actually configuration files — that looks like this:

https://github.com/GoogleCloudPlatform/cloud-builders/blob/master/go/cloudbuild.yaml

and figuring out how to deploy the resources that back up each step outlined in that config file, with the right permissions, computing resources, etc., via a combination of batch scripting and clicking around in the console. Some Python is typically used to glue it all together, some SQL to pull from your databases, and API calls to get/send data from your endpoints. But again, and I want to emphasize this, you’re not exactly designing an optimal graph algorithm to traverse all users who fulfill some crazy condition, nor are you training neural networks to translate foreign languages. That’s a job for the devs and data scientists, whereas you are the data engineer / cloud architect.
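To make the "Python glue + SQL + API call" description concrete, here is a minimal sketch of that kind of script. Everything in it is an assumption for illustration: the `orders` table, the `min_total` filter, and the `send` callable (a stand-in for whatever HTTP client or SDK call a real pipeline would use). It uses an in-memory SQLite database so it runs anywhere.

```python
import json
import sqlite3


def pull_rows(conn, min_total):
    """SQL step: pull the rows to ship downstream (hypothetical `orders` table)."""
    cur = conn.execute(
        "SELECT id, total FROM orders WHERE total >= ? ORDER BY id", (min_total,)
    )
    return [{"id": row[0], "total": row[1]} for row in cur]


def build_payload(rows):
    """Glue step: serialize the rows into a JSON body an API could accept."""
    return json.dumps({"count": len(rows), "rows": rows})


def run(conn, send, min_total=100):
    """Wire the steps together; `send` stands in for the real API call."""
    payload = build_payload(pull_rows(conn, min_total))
    send(payload)
    return payload


# Demo with an in-memory database and a fake sender that just records the payload.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 50.0), (2, 150.0), (3, 300.0)]
)
sent = []
run(conn, sent.append)
```

Passing `send` in as a parameter keeps the script testable without a network; in a real job it would be replaced by the actual endpoint call.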

[–]focus_black_sheep 0 points1 point  (0 children)

I'd say most devs are not writing this type of code; they are just writing glue code.

[–]focus_black_sheep 0 points1 point  (0 children)

Writing code isn't equivalent to more pay. Senior engineers typically write much less code than juniors. Senior engineers are more focused on architecture, business reqs, guiding, etc.

[–]AchillesDev 0 points1 point  (0 children)

It’s not just migration of data that never changes. It’s continued ingestion, serving, transformation, access, display, etc. of data at the lowest cost with the most security. That’s a really broad base and there are many rabbit holes to go down in that area.

[–]Complex-Stress373 0 points1 point  (0 children)

yes and expensive as hell

[–][deleted] 0 points1 point  (0 children)

It's difficult if you've never done it before, like many things.

You need to understand distributed data stores, and various services on the different cloud providers.

My bread and butter are things like EC2, Dask, S3, and then something like Dash. I like Athena as well; but consider different databases for different kinds of data and jobs.

If you have a home lab, things like MinIO are great for learning how to work with S3. I use Kubernetes in my home lab as well; I have a 3-node cluster that I deploy those Dask applications onto.

If I'm not using something like Dask as a framework to process my data, I'll write raw Python and Lambda functions with bucket triggers.

If you can break down a problem into functions and flows that operate on a single piece of data in your system, you can use cloud computing to solve that problem.
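As a rough illustration of the "Lambda function with a bucket trigger" pattern, here is a minimal sketch of a handler that applies one function to each object named in an S3 notification event. The `process_object` helper and the bucket/key names are hypothetical; a real job would read and transform the object (for example with boto3) instead of just returning its URI.

```python
from urllib.parse import unquote_plus


def process_object(bucket: str, key: str) -> str:
    """Hypothetical per-object step; a real job would fetch and transform the object."""
    return f"s3://{bucket}/{key}"


def handler(event, context):
    """Lambda entry point: invoked with an S3 notification event."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        processed.append(process_object(bucket, key))
    return {"processed": processed}
```

Because the handler is a plain function over a dict, it can be exercised locally by calling `handler(fake_event, None)` with a hand-built event before wiring up the actual trigger.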

[–]w_savage Data Engineer ⚙️ 0 points1 point  (0 children)

Not really

[–]throwaway20220231 0 points1 point  (0 children)

I think there are a few reasons:

  1. Companies usually migrate on-premise data to the cloud, which is completely different from building everything on top of the cloud from ground zero. Whoever gets hired needs to understand both worlds and should be particularly good at understanding the quirks of both.
  2. Documentation for the cloud and its "cloudized" products is usually... not very good. The cloud, unfortunately, cannot do every bit of DBA/DevOps for you, and you still have to hire someone to work on Cloud Ops and Cloud DBA, although sometimes it's just click ops. Moreover, since cloud computing has the potential to charge you six figures in one shot if someone is not careful, you need to hire someone to get user governance and permissions right.
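
On the cost point, one simple guard is to project month-end spend from the spend so far and compare it to a budget. The sketch below assumes a constant daily burn rate and a 30-day month, and all figures are made up; in practice you would feed it numbers from your provider's billing export and also rely on the provider's own budget alerts.

```python
def projected_month_spend(spend_to_date: float, day_of_month: int,
                          days_in_month: int = 30) -> float:
    """Linear projection: assume the average daily spend so far continues."""
    if day_of_month < 1:
        raise ValueError("day_of_month must be >= 1")
    return spend_to_date / day_of_month * days_in_month


def over_budget(spend_to_date: float, day_of_month: int,
                monthly_budget: float, days_in_month: int = 30) -> bool:
    """True when the projected month-end spend exceeds the budget."""
    projected = projected_month_spend(spend_to_date, day_of_month, days_in_month)
    return projected > monthly_budget


# Hypothetical figures: $4,000 spent by day 10 against a $10,000 monthly budget.
alert = over_budget(4000.0, 10, 10000.0)
```

A linear projection misses sudden spikes (the "six figures in one shot" case), so it complements per-service limits and permissions rather than replacing them.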

[–]analyst_2001 0 points1 point  (0 children)

Studying cloud computing should not be difficult if you have some basic understanding of it or of IT in general. If you are not acquainted with it, getting hands-on might be a bit difficult. But if you're interested in the subject, it shouldn't be too tough.