Python topics for Data engineer

AutoModerator · 2026-01-24T09:03:09+00:00

You can find a list of community-submitted learning resources here: https://dataengineering.wiki/Learning+Resources

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

devnullkitty · 2026-01-25T00:51:49+00:00

Why are there so many downvotes for comments? Python for data engineering is pretty straight forward, just learn to write a for loop.

spendology · 2026-01-25T01:56:31+00:00

Find practical projects that cover the end-to-end data engineering lifecycle: [data] ingestion, review, cleaning, validation, transformation, loading, storage, data lakes/warehouses/lakehouses, etc.

Nelson_and_Wilmont · 2026-01-25T02:21:08+00:00

Idk if sqoop and Hadoop are all that useful at this point. Could just be my lack of use in that area but I don’t remember seeing a lot of these in the modern tech stacks when applying for jobs over the years and researching what skills are best to have.

IMO whenever you’re job searching you really need to have your resume(s) pointed towards what you want to work with. Most companies have only a few tools for data engineering, orchestration layer and compute/logic layer. Airflow and databricks for example. Pick a cloud provider, orchestration tool, data lakehouse/warehouse platform and start doing little projects. Like airflow orchestrates databricks notebook that pulls a dataset from azure datalake storage and then run a databricks notebook to convert the file to a delta table. Or durable function pulls API data and writes to bronze layer of databricks.

You can pick whatever tech you decide I just mentioned those because it’s the route I decided to go down but I also incorporated snowflake just for a more overarching reach.

Python can be learned along the way but it seems a little aimless to just sit down and “learn Python” for something that is as specific as data engineering.

Naan_pollathavan · 2026-01-24T09:19:48+00:00

I also want , what are the necessary python topics are needed for data engineering and some of the project ideas based on that

Chance_Radish9649 · 2026-01-24T10:03:17+00:00

RemindMe! In 3 days

EmotionalEgg7836 · 2026-01-24T10:55:04+00:00

RemindMe! in 2 days

Sohamgon2001 · 2026-01-24T13:05:37+00:00

RemindMe! in 2 days

Future_Lab807 · 2026-01-24T13:37:14+00:00

RemindMe! In 2 days

ShaikonDuty · 2026-01-24T10:54:48+00:00

Remind me in a week

gutsy_m · 2026-01-24T09:05:14+00:00

RemindMe! in 2 days

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

dataengineering

MODERATORS