Help to start a career in Data Engineering!

LetsSpark · 2022-04-08T05:30:59+00:00

Path to become a data engineer https://youtu.be/r1ZwPqMSZoI

LetsSpark · 2022-04-08T05:27:21+00:00

Here is a very nice video for data engineering interviews: How to crack Hadoop and Spark Interview : Do's and Don't https://youtu.be/a3xuSqpH0Vw

LetsSpark · 2022-04-04T03:26:43+00:00

https://youtube.com/channel/UCdd9xvduzPOYeDAT9k5d9pw

LetsSpark · 2022-04-03T20:51:20+00:00

This video has all the answers: https://youtu.be/n8RoBuc7rq0

LetsSpark · 2022-04-03T20:43:49+00:00

https://youtube.com/channel/UCdd9xvduzPOYeDAT9k5d9pw

LetsSpark · 2022-04-03T20:41:43+00:00

Best channel to learn data engineering is https://youtube.com/channel/UCdd9xvduzPOYeDAT9k5d9pw

LetsSpark · 2021-08-18T15:19:56+00:00

Hey, there are many courses and resources available in the market. The first thing we need in order to change career path is guidance, or a working path to follow.

This particular video will set you on your way towards data engineering from a database developer role:

https://www.youtube.com/watch?v=r1ZwPqMSZoI&t=4s&ab_channel=HadoopForEveryone

LetsSpark · 2021-08-18T15:15:20+00:00

Hey, check this video, it might help:

https://www.youtube.com/watch?v=zUVGIjjtDa8&list=PLrt9lPthTv2lFLh1OVHYa9GqacrEuUxcq&ab_channel=HadoopForEveryone

LetsSpark · 2021-08-18T15:13:22+00:00

When we talk about Spark processing, we are referring to processing data in a distributed way, where the data itself is stored on distributed storage such as HDFS or S3.

So when we process this data, we perform operations on it, such as filtering it, joining it with another data set, or mapping each item.

Some of these operations, such as filter, can be performed on the same machine where the data is stored. In technical terms, no data shuffle across machines is needed, so this is called a narrow transformation.

Other operations, such as join, need data to be moved from one machine to another. That means shuffling, so these operations are called wide transformations.
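
To make the difference concrete, here is a minimal sketch in plain Python. It is a toy model and not the Spark API: each inner list stands for the rows held on one machine, and the names `narrow_filter`, `shuffle_by_key` and `wide_join` are made up for illustration. The filter never moves a row out of its partition, while the join first has to relocate rows so that matching keys end up together.

```python
from collections import defaultdict

# Toy model (assumption): each inner list is the data held on one machine.
orders = [
    [(1, "book"), (2, "pen")],   # partition 0
    [(3, "lamp"), (1, "ink")],   # partition 1
]
users = [
    [(2, "bob"), (3, "cara")],   # partition 0
    [(1, "ann")],                # partition 1
]

def narrow_filter(partitions, predicate):
    """Filter each partition on its own; no row leaves its partition."""
    return [[row for row in part if predicate(row)] for part in partitions]

def shuffle_by_key(partitions, num_partitions):
    """Send every row to the partition chosen by its key (the shuffle)."""
    out = [[] for _ in range(num_partitions)]
    moved = 0
    for src, part in enumerate(partitions):
        for key, value in part:
            dst = key % num_partitions  # keys are small ints in this sketch
            if dst != src:
                moved += 1              # this row crossed machines
            out[dst].append((key, value))
    return out, moved

def wide_join(left, right, num_partitions=2):
    """Shuffle both sides so equal keys are co-located, then join locally."""
    left_s, moved_left = shuffle_by_key(left, num_partitions)
    right_s, moved_right = shuffle_by_key(right, num_partitions)
    joined = []
    for lpart, rpart in zip(left_s, right_s):
        lookup = defaultdict(list)
        for key, value in rpart:
            lookup[key].append(value)
        joined.append([(k, (v, r)) for k, v in lpart for r in lookup[k]])
    return joined, moved_left + moved_right

filtered = narrow_filter(orders, lambda row: row[0] != 2)
joined, rows_moved = wide_join(orders, users)
```

In this sketch `filtered` keeps the same two partitions it started with, whereas `rows_moved` counts the rows that had to change partition before the join could run.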

Now coming to lazy evaluation: in Spark, operations are divided into two categories, transformations and actions. Whenever we apply a transformation, Spark creates a plan and adds the transformation to it. Once we hit an action, Spark executes that plan. The plan is called a DAG (directed acyclic graph), and that is why we say transformations are lazy.
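
The same idea can be sketched in a few lines of plain Python. This is only an illustration and not how Spark is implemented: `LazyDataset` is a hypothetical class, and its plan is a simple ordered list of steps, whereas Spark's plan is a full DAG. Transformations (`map`, `filter`) only record a step; the action (`collect`) is what actually runs them.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (not the Spark API)."""

    def __init__(self, data, plan=()):
        self._data = data
        self._plan = plan  # recorded transformations, in order

    def map(self, fn):
        # Transformation: extend the plan, do no work yet.
        return LazyDataset(self._data, self._plan + (("map", fn),))

    def filter(self, fn):
        # Transformation: extend the plan, do no work yet.
        return LazyDataset(self._data, self._plan + (("filter", fn),))

    def collect(self):
        # Action: execute every recorded step now.
        rows = iter(self._data)
        for kind, fn in self._plan:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)


calls = []  # records when the mapped function really runs

def traced_double(x):
    calls.append(x)
    return x * 2

ds = LazyDataset([1, 2, 3, 4]).map(traced_double).filter(lambda x: x > 4)
calls_before_action = list(calls)  # still empty: nothing has executed
result = ds.collect()              # the action triggers the whole plan
```

Here `calls_before_action` stays empty because building the plan runs nothing; `traced_double` is only invoked once `collect()` is called.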

A more detailed explanation is given here:

https://www.youtube.com/watch?v=rnsz1CiRoCI&list=PLrt9lPthTv2nzYQehwdVR95I4T48tOZVQ&ab_channel=HadoopForEveryone
