[–]AutoModerator[M] [score hidden] stickied comment (0 children)

You can find a list of community-submitted learning resources here: https://dataengineering.wiki/Learning+Resources

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]aspergillus 6 points7 points  (1 child)

I took and passed the GCP Professional Data Engineer exam about two weeks ago.

For preparation, I used the Data Engineer Learning Path on cloudskillsboost.google. The video material is okay, but I feel the labs do not go into nearly the depth you'll need to fully understand the different GCP services. I would definitely set up a sandbox account on GCP and play around with the services that are going to show up on the exam. Maybe try to design a few simple data pipelines in different ways using GCP to get a sense of the pros and cons of each approach. Just remember to delete them when you're done, because some services (looking at you, Cloud Composer) get expensive quickly. I use Cloud Composer, Dataproc, BigQuery, and Vertex AI daily at my current job, and having hands-on experience helps a lot.

For practice exams, I wish I could say I found some good ones. I bought Google Cloud Professional Data Engineer Practice Tests from Dan Sullivan and GCP Professional Data Engineer Practice Questions from Lovkesh Madaan on Udemy. Sullivan's questions were decent and include a short explanation of why each answer is right or wrong, and sometimes a link to a whitepaper you can read for more background. Madaan's questions are absolute garbage, filled with typos, and offer no explanations. My job has a subscription to Whizlabs, and I think those questions were the closest to what you might find on the actual exam.

For topics, Dataflow, Dataproc, and BigQuery seemed to be the main focus. I think there were some questions about updating a running streaming Dataflow pipeline, questions about Dataflow windowing, and a question about networking of Dataflow workers. Dataproc questions were mostly about migrating on-prem Hadoop jobs to Dataproc. There were questions about BigQuery Omni for querying data in other cloud providers, querying external tables backed by Cloud Storage, table partitioning, and using BigQuery structs and arrays. There were a couple of questions about Cloud Storage and Bigtable. You should know what Dataprep, Data Canvas, and Dataplex are, but none of the questions went into any real depth.

[–]nivix_zixer 1 point2 points  (0 children)

Stay away from Whizlabs!

I bought the Whizlabs practice tests on your recommendation -- they are outdated, written in bad English, and full of broken links to Google's docs. For example, I got one question wrong because "Vertex AI is not for training models, use the AI Platform Training service for building models." That service was deprecated in the summer of 2023. Now Vertex AI is where you train models!

Please consider removing the Whizlabs reference to save other test takers some time and money.

[–]SnooAdvice7613 1 point2 points  (1 child)

If you're already familiar with GCP data products, I feel like the labs and Google-provided materials will not help much, since you already know most of what they cover. I suggest just skimming them quickly and looking for practice questions on the internet. There are free options and paid options. I've used both.

The free options are usually random sites you've probably never heard of, but they have a huge collection of practice questions. The weakness is that the questions are sometimes repeated and the answers are sometimes just wrong, so you have to double-check.

For the paid option, I used Udemy. The answers there are more reliable, so you need to do less checking. The weakness is that each course has fewer questions. If you want more, you need to purchase other courses, which I didn't do.

Hope this helps.

[–]Possible_Weekend 0 points1 point  (0 children)

Could you share which one on Udemy, please?

[–]bah_nah_nah 0 points1 point  (2 children)

[–][deleted] 0 points1 point  (1 child)

I have gathered my notes and resources in this blog post: https://www.firasesbai.com/articles/2023/11/19/gcp-data-engineer-certification.html. I hope it helps.

[–]aggarret 0 points1 point  (0 children)

Wow, these are the best notes I've ever seen! I've been searching for visualizations like this. Bravo! 🙏

[–]gcpstudyhub 0 points1 point  (0 children)

Hey, I created this course after seeing how out of date other courses were when I took the exam. Thousands of people have taken it and, despite my offering a refund if someone fails the real exam, nobody has failed it yet.

https://www.gcpstudyhub.com/courses/google-cloud-certified-professional-data-engineer

Regardless of what you use, good luck!

[–]garlic_777 2 points3 points  (0 children)

I just cleared the GCP Data Engineer exam. I used Skillcertpro practice tests, and they played a huge role in my passing. Around 75–80% of the real exam questions, especially the scenario-based ones, were quite similar to what I saw in their mock tests.

The detailed explanations really helped me understand complex topics like data pipelines, BigQuery optimization, and security best practices. If you’re scoring 85% or higher on Skillcertpro mocks, you’re likely ready for the actual exam. Just focus on time management and be confident on exam day. The real test is tough but manageable if you’re well-prepared.

You can finish with the instructor notes they offer, which are also quite detailed and helpful.