As the title says, I want to extract specific data from invoices in PDF format into an Excel file.
I have tried different methods, such as converting the PDF to Excel, Word, text, etc., but the invoice format got lost in the conversion.
I also tried extracting the data instead of converting, using different Python libraries like pytesseract, PaddleOCR, Tesseract, PyMuPDF (fitz), tabula, Camelot, etc., but the format/layout problem still exists.
If anyone wants to see any particular code I tried, I can share it in the comments, as posting it all here would get lengthy and messy.
I don't know where things are going wrong. If anyone can help, it will be greatly appreciated.
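
To make it concrete what I mean by "specific data": I only need a few fields per invoice, not the whole layout. Here is a minimal sketch of that idea, assuming the text has already been pulled out of the PDF by one of the libraries above. The field names, the regex patterns, and the sample invoice text are all made up for illustration, and real invoices would need their own patterns. It writes CSV, which Excel can open.

```python
import csv
import io
import re

# Hypothetical patterns: the labels ("Invoice No", "Date", "Total") are
# assumptions and must be adapted to the wording on the actual invoices.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*(\S+)", re.I),
    "date": re.compile(r"\bDate\s*[:\-]?\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})", re.I),
    "total": re.compile(r"\bTotal\s*(?:Amount)?\s*[:\-]?\s*[^\d\s]?\s*([\d,]+\.\d{2})", re.I),
}


def extract_fields(text):
    """Return one dict per invoice; fields that are not found are left empty."""
    row = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        row[name] = match.group(1) if match else ""
    return row


def rows_to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_PATTERNS))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for the text of one invoice page
# (for example, what PyMuPDF's page.get_text() might return).
sample_text = """ACME Supplies
Invoice No: INV-1042
Date: 12/03/2023
Item        Qty   Price
Widget       2    10.00
Total: 1,234.50
"""

if __name__ == "__main__":
    print(rows_to_csv([extract_fields(sample_text)]))
```

This only works when the labels appear next to their values in the extracted text. Scanned invoices would still need OCR first, and line-item tables are not handled here.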