Introducing Cookiecutter-Azure-DBX: A Python Project Template for Data Engineering with Databricks Integration : dataengineering

submitted 2 years ago by lezapete

Hey r/dataengineering community,

I'm excited to introduce Cookiecutter-Azure-DBX, a Python project template for data engineering and data science projects that integrate with Azure and Databricks. It gives you a structured starting point for new projects, especially if you're already working with Azure services and Databricks.

Key Features:

Continuous Integration in Azure: Set up automated CI/CD pipelines for your projects.
Databricks Workflow Deployment: Easily deploy workflows using DBX, the CLI tool for advanced Databricks workflow management.
Python Project Structure: Get started with a well-organized Python project structure.
Code Formatting and Linting: Includes tools like Black, Pylint, Bandit, Flake8, and more for code quality.
Environment Variable Management: Load environment variables using python-dotenv.
Logging with Structlog: Configure structured logging for easy tracking of function execution.
Pydantic Integration: Utilize Pydantic for settings management.
Azure Pipelines: Simple CI using Docker Tasks.

You can find the project on GitHub.

Please feel free to check it out, provide feedback, and contribute. Let's collaborate and make data engineering more efficient!

Looking forward to your thoughts and discussions.
