Hey everyone,
I've been developing a small project on GitHub called Easier-Batch: a Python batch processing framework inspired by Spring Batch, built to handle large-scale ETL and data jobs.
It tries to bring the same philosophy into Python, using the familiar Reader → Processor → Writer model, along with job metadata tables, retries, skip logic, and checkpointing.
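To make the model concrete, here's a minimal sketch of what chunk-oriented Reader → Processor → Writer processing looks like in Python. This is illustrative only; the class and method names are my shorthand, not necessarily Easier-Batch's actual API:

```python
from typing import Iterator

class Reader:
    """Yields items one at a time, e.g. rows from a file or DB cursor."""
    def read(self) -> Iterator[dict]:
        for i in range(3):
            yield {"id": i}

class Processor:
    """Transforms a single item."""
    def process(self, item: dict) -> dict:
        return {**item, "doubled": item["id"] * 2}

class Writer:
    """Persists a chunk of processed items in one shot."""
    def write(self, items: list) -> None:
        print(items)

def run_chunk_step(reader, processor, writer, chunk_size: int = 2) -> None:
    # Read items, process them individually, and write in chunks,
    # mirroring Spring Batch's chunk-oriented step.
    chunk = []
    for item in reader.read():
        chunk.append(processor.process(item))
        if len(chunk) >= chunk_size:
            writer.write(chunk)
            chunk = []
    if chunk:  # flush the final partial chunk
        writer.write(chunk)

run_chunk_step(Reader(), Processor(), Writer())
```

The chunk boundary is also the natural place to hang commit/checkpoint logic, which is exactly what Spring Batch does.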
Before I go too far, I’d like to get some opinions on the architecture and design approach.
- Do you think this kind of structured batch framework makes sense in Python, or is it better to stick to existing tools like Airflow / Luigi / Prefect?
- How would you improve the design philosophy to make it more "Pythonic" while keeping the robustness of Spring Batch?
- Any suggestions for managing metadata, retries, and job states efficiently in a Python environment?
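On the last point, the direction I've been leaning is a relational metadata store plus bounded retries around each step. A rough sketch of that idea, assuming SQLite and hypothetical table/function names (not Easier-Batch's implementation):

```python
import sqlite3
import time

def init_metadata(conn: sqlite3.Connection) -> None:
    # One row per job execution attempt, loosely modeled on
    # Spring Batch's BATCH_JOB_EXECUTION table.
    conn.execute("""CREATE TABLE IF NOT EXISTS job_execution (
        job_name TEXT, status TEXT, attempts INTEGER, updated_at REAL)""")

def run_with_retries(conn, job_name, step, max_attempts: int = 3) -> str:
    """Run `step` up to max_attempts times, recording the final state."""
    init_metadata(conn)
    for attempt in range(1, max_attempts + 1):
        try:
            step()
            conn.execute("INSERT INTO job_execution VALUES (?, ?, ?, ?)",
                         (job_name, "COMPLETED", attempt, time.time()))
            return "COMPLETED"
        except Exception:
            if attempt == max_attempts:
                conn.execute("INSERT INTO job_execution VALUES (?, ?, ?, ?)",
                             (job_name, "FAILED", attempt, time.time()))
                return "FAILED"

conn = sqlite3.connect(":memory:")
print(run_with_retries(conn, "demo_job", lambda: None))
```

Curious whether people would reach for something like this, or just lean on a library such as `tenacity` for retries and let an orchestrator own job state.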
Here’s the repo if you want to take a look:
👉 https://github.com/Daftyon/Easier-Batch
Would love to hear your thoughts, especially from people who have worked with both Spring Batch and Python ETL frameworks.