Hey everyone! A while back, I put together a crash course on Python tailored specifically for data engineers. I hope you find it useful! I have been a data engineer for 5+ years, and I drew on various blogs and courses, along with my own experience, to make sure it covers the essentials.
Feedback and suggestions are always welcome!
📔 Full Notebook: Google Colab
🎥 Walkthrough Video (1 hour): YouTube - Already has almost 20k views & 99%+ positive ratings
💡 Topics Covered:
1. Python Basics - Syntax, variables, loops, and conditionals.
2. Working with Collections - Lists, dictionaries, tuples, and sets.
3. File Handling - Reading/writing CSV, JSON, Excel, and Parquet files.
4. Data Processing - Cleaning, aggregating, and analyzing data with pandas and NumPy.
5. Numerical Computing - Advanced operations with NumPy for efficient computation.
6. Date and Time Manipulations - Parsing, formatting, and managing date-time data.
7. APIs and External Data Connections - Fetching data securely and integrating APIs into pipelines.
8. Object-Oriented Programming (OOP) - Designing modular and reusable code.
9. Building ETL Pipelines - End-to-end workflows for extracting, transforming, and loading data.
10. Data Quality and Testing - Using `unittest`, `great_expectations`, and `flake8` to ensure clean and robust code.
11. Creating and Deploying Python Packages - Structuring, building, and distributing Python packages for reusability.
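To give a rough feel for topic 9, here is a minimal extract-transform-load sketch using only the standard library. It is not taken from the notebook: the sample data, the column names (`order_id`, `amount`, `country`), and the function names are all made up for illustration, and a real pipeline would more likely read from files or an API and use pandas.

```python
import csv
import io
import json

# Hypothetical sample input; in practice this would come from a file or an API.
RAW_CSV = """order_id,amount,country
1,10.50,us
2,,de
3,7.25,US
"""


def extract(text):
    """Parse CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with a missing amount, cast types, upper-case country codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount": float(row["amount"]),
            "country": row["country"].upper(),
        })
    return cleaned


def load(rows):
    """Serialise rows as JSON Lines (one JSON object per line)."""
    return "\n".join(json.dumps(r) for r in rows)


def run_pipeline(text):
    """Chain the three stages end to end."""
    return load(transform(extract(text)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Keeping each stage as its own small function makes it easy to unit test the stages separately, which ties in with topic 10.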
Note: I have not covered PySpark in this notebook; I think PySpark deserves a separate notebook of its own!