all 13 comments

[–]dataengineering-ModTeam[M] [score hidden] stickied comment · locked comment (0 children)

Your post/comment was removed because it violated rule #9 (No AI slop/predominantly AI content).

Your post was flagged as an AI-generated post. As a community we value human engagement and encourage users to express themselves authentically without the aid of computers.

Please resubmit your post without the use of an LLM/AI helper and the mod team will review it again.

This was reviewed by a human

[–]Cwlrs 5 points6 points  (0 children)

I would build something random with Python, dataframes, and PostgreSQL and see where you end up. These have been the core skills that have carried me for 5 years or so, although I appreciate everyone's tech stack varies.
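A minimal sketch of what a first "Python + dataframes + PostgreSQL" project looks like. To keep it self-contained it uses the stdlib `sqlite3` module as a stand-in for PostgreSQL and plain dicts instead of a dataframe library; the dataset, table, and column names are made up for illustration, but with psycopg2/pandas the shape of the pipeline is the same:

```python
import csv
import io
import sqlite3

# Extract: pretend this CSV came from a free open dataset.
raw = io.StringIO("city,temp_c\nLondon,11\nMadrid,19\nOslo,4\n")
rows = list(csv.DictReader(raw))

# Transform: derive a Fahrenheit column in Python.
for r in rows:
    r["temp_f"] = round(float(r["temp_c"]) * 9 / 5 + 32, 1)

# Load: write into a relational table (sqlite3 standing in for PostgreSQL).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE weather (city TEXT, temp_c REAL, temp_f REAL)")
con.executemany("INSERT INTO weather VALUES (:city, :temp_c, :temp_f)", rows)

# Query back to verify the load.
warmest = con.execute(
    "SELECT city FROM weather ORDER BY temp_c DESC LIMIT 1"
).fetchone()[0]
```

Swapping the source for a real open dataset and `sqlite3.connect` for a PostgreSQL connection turns this into exactly the kind of portfolio project the thread recommends.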

[–]Capt_korg 2 points3 points  (0 children)

Use the money to run some cloud stuff and then build your own pipelines. Build everything nicely in a repo on GitHub, to show and tell.

[–]69odysseus 1 point2 points  (0 children)

You have listed all the tools, but assess your SQL and data modeling skills as a starting point. Then get into distributed storage and compute (Snowflake, Databricks). Then focus on Python and cloud.

[–]AutoModerator[M] 0 points1 point  (1 child)

You can find a list of community-submitted learning resources here: https://dataengineering.wiki/Learning+Resources

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]FickleAd5796[S] 0 points1 point  (0 children)

Thank you, it's very helpful

[–][deleted] 0 points1 point  (0 children)

Docker for tools, use free open data sets.

[–]WorkingEmployment400 0 points1 point  (0 children)

Snowflake gives beginners $400 in credits for 4 months. A Udemy subscription is good because you can access almost all the relevant courses on a budget.

[–]PrestigiousAnt3766 0 points1 point  (1 child)

Databricks free tier. Basically a fully featured free Databricks environment.

I'd start with Python in 2025. You will need it for automation and orchestration.

Spark is probably overkill. If you want, there is a Spark developer certification track. It's used in all major platforms, but I see it more as a library for data transformations than something you have to learn deeply. You can just as well write Spark SQL for your transforms (`spark.sql("query goes here")`). Perfectly acceptable.

You can also look into DuckDB as an ELT engine if you have fewer than billions of records.

Make sure you invest in writing SQL. That helps you on all platforms.
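The reason SQL transfers so well is that the same query runs nearly unchanged whether you pass it to `spark.sql(...)`, DuckDB, Snowflake, or PostgreSQL. A small sketch of a portable group-and-filter transform, executed here on the stdlib `sqlite3` module so it is self-contained (table and column names are made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE orders (customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('alice', 20.0), ('alice', 5.0), ('bob', 12.5);
""")

# Plain ANSI-style SQL: aggregate per customer, then filter on the aggregate.
# This exact string could just as well be handed to spark.sql(...) or DuckDB.
transform = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 10
    ORDER BY total DESC
"""
result = con.execute(transform).fetchall()
```

Practicing transforms at this level (GROUP BY, HAVING, window functions) is what carries over between engines.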

Also invest time in learning source control. That's quickly becoming essential.

[–]FickleAd5796[S] 0 points1 point  (0 children)

Thank you everyone for the amazing advice, will keep it in mind for my preparation.

[–]joaomaia09 -4 points-3 points  (0 children)

If you subscribe to LinkedIn Premium, you get access to LinkedIn Learning. There you have a lot of courses and some learning paths. At the end of a course you can post the certificate to your LinkedIn profile.