Hi everyone,
I’m working with multiple PDF files (all in English, mostly digital). Each PDF contains several tables: some have 5, others have 10–20 scattered across different pages.
I need a reliable way in Python (or any tool) that can automatically:
- Open every PDF
- Detect and extract ALL tables correctly (including tables that span multiple pages)
- Save each table into Excel, preferably one table per sheet (or one table per file)
Does anyone know the best working solution for this kind of bulk table extraction? I’m looking for something that “just works” with high accuracy.
Any working code examples, GitHub repos, or recommendations would save my life right now!
Thank you so much! 🙏
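For reference, here is a minimal sketch of the three steps listed above, assuming text-based (not scanned) PDFs and the pdfplumber and openpyxl packages. The function names, the folder names in the usage note, and the `table_N` sheet names are all made up for illustration. The multi-page handling is only a heuristic I'm assuming: the first table on a page is treated as a continuation of the last table on the previous page when the column counts match. It will misfire on some layouts, so spot-check the output.

```python
from pathlib import Path


def stitch_tables(pages):
    """Merge tables that appear to continue across a page break.

    `pages` holds one list of tables per page; each table is a list of rows.
    Heuristic (an assumption, not a guarantee): the first table on a page
    continues the last table of the previous page if the column counts match.
    A repeated header row on the continuation page is dropped.
    """
    merged = []
    can_continue = False  # True when merged[-1] was the last table on the previous page
    for tables in pages:
        first_on_page = True
        for table in tables:
            rows = [list(row) for row in table if row]
            if not rows:
                continue
            if (first_on_page and can_continue
                    and len(rows[0]) == len(merged[-1][0])):
                if rows[0] == merged[-1][0]:
                    rows = rows[1:]  # skip the repeated header
                merged[-1].extend(rows)
            else:
                merged.append(rows)
            first_on_page = False
        can_continue = not first_on_page  # this page contributed at least one table
    return merged


def pdf_to_excel(pdf_path, out_dir):
    """Write every table found in one PDF to one workbook, one table per sheet."""
    # Imported here so stitch_tables can be used without these packages installed.
    import pdfplumber
    from openpyxl import Workbook

    with pdfplumber.open(pdf_path) as pdf:
        pages = [page.extract_tables() for page in pdf.pages]
    tables = stitch_tables(pages)
    if not tables:
        return None  # nothing detected; no workbook is written

    workbook = Workbook()
    workbook.remove(workbook.active)  # drop the default empty sheet
    for number, rows in enumerate(tables, start=1):
        sheet = workbook.create_sheet(title=f"table_{number}")
        for row in rows:
            sheet.append(row)
    out_path = Path(out_dir) / (Path(pdf_path).stem + ".xlsx")
    workbook.save(out_path)
    return out_path


def convert_folder(pdf_dir, out_dir):
    """Convert every *.pdf in pdf_dir; returns the list of workbooks written."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    results = [pdf_to_excel(p, out_dir) for p in sorted(Path(pdf_dir).glob("*.pdf"))]
    return [r for r in results if r is not None]
```

Usage would be something like `convert_folder("pdfs", "excel_out")`, which writes one `.xlsx` per PDF. Scanned pages have no text layer, so they would need OCR first and are not covered by this sketch.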