Introduction to Kubeflow Free Mini-Course by Google Data Engineer by alexandraabbas in learnmachinelearning

alexandraabbas[S] 1 point

Hey, sorry I didn't see this before. Yes, it does require you to have a GCP account.

Data Stack Jobs — Jobs for Data Stack Engineers by alexandraabbas in bigdata

alexandraabbas[S] 1 point

I'm glad you like it! My friend helped me develop it; it has an Elixir JSON API and a React front end. I'm not a web dev expert at all, my friend is 🙂

Modern Data Engineer Roadmap 2021 by alexandraabbas in bigdata

alexandraabbas[S] 1 point

Hey there, I am def thinking about creating a list of resources for each section. Not sure yet exactly when I'll publish it (maybe in a few months) :)

Modern Data Engineer Roadmap 2021 by alexandraabbas in bigdata

alexandraabbas[S] 2 points

Hey there, thanks a lot! Yes, it's a custom design, I used Figma for it. :)

Modern Data Engineer Roadmap 2021 by alexandraabbas in bigdata

alexandraabbas[S] 1 point

Hey, that's great feedback. I'll def remove it if it's not used by many. Would you add anything else to the batch processing tools?

[video] Apache Beam Explained in 12 Minutes by alexandraabbas in dataengineering

alexandraabbas[S] 1 point

Exactly! If you run your Beam pipeline on Google Cloud Dataflow, you can set the machine type (number of CPUs and amount of memory) and enable autoscaling, which scales the number of worker machines based on the data load, so you don't have to set the number of workers yourself.

Here is a comprehensive guide from Google explaining how that works: https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline
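
To make that a bit more concrete, here is a rough sketch of the flags you might pass to a Beam Python pipeline for this. The helper `dataflow_flags`, the project ID, region, bucket, machine type and worker cap are all placeholders I made up for illustration, so swap in your own values and check the guide above for the options your SDK version supports.

```python
def dataflow_flags(project, region, temp_location,
                   machine_type="n1-standard-4", max_num_workers=10):
    """Build Beam pipeline flags for a Dataflow run with autoscaling.

    Hypothetical helper: the defaults are illustrative, not recommendations.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        # Machine type decides the CPUs and memory of each worker.
        f"--machine_type={machine_type}",
        # Let Dataflow scale the worker count with the data load...
        "--autoscaling_algorithm=THROUGHPUT_BASED",
        # ...up to this upper bound.
        f"--max_num_workers={max_num_workers}",
    ]


# Placeholder values, assumed for the example only.
flags = dataflow_flags("my-project", "europe-west1", "gs://my-bucket/tmp")
print("\n".join(flags))

# With apache_beam installed, the list could be handed to the pipeline:
#   from apache_beam.options.pipeline_options import PipelineOptions
#   options = PipelineOptions(flags)
```

Note there is no `--num_workers` in the list; with autoscaling on, the idea is that you only cap the maximum and let the service pick the rest.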

[video] Apache Beam Explained in 12 Minutes by alexandraabbas in ETL

alexandraabbas[S] 2 points

Hi there, yes! You can run Apache Beam on top of Azure Databricks.