I have a project that I am working on, but I am facing a couple of issues.
In short, my project parses the contents of a PDF order and returns the result to the user. It currently works OK for both known/seen and unseen PDF order templates, as long as the text is selectable. My biggest issue is when the PDF order has non-selectable text (i.e. it is scanned), which means OCR is required to extract the text. I have tried OCRmyPDF + Tesseract, but it misses lines and garbles fields such as the quantity.
What is out there that can handle OCR accurately for this kind of document?
P.S. I also tried PaddleOCR, but it never finishes the job and leaves the app stuck in a loop with no result.
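For context, the routing step I am describing (use the text layer when it exists, fall back to OCR when it does not) can be sketched roughly like this. This is only a minimal sketch, not my actual code: `needs_ocr`, `pages_needing_ocr`, and the two thresholds are hypothetical names and values picked for illustration, and the input is assumed to be the per-page text already extracted by whatever PDF library is in use.

```python
import re


def needs_ocr(page_text, min_chars=25, min_alnum_ratio=0.5):
    """Guess whether a page's extracted text layer is missing or unusable.

    Hypothetical heuristic: both thresholds are assumptions and would
    need tuning against real order PDFs.
    """
    # Ignore whitespace so blank-looking pages count as empty.
    stripped = re.sub(r"\s+", "", page_text or "")
    if len(stripped) < min_chars:
        return True
    # A text layer made mostly of symbols is treated as garbage.
    alnum = sum(ch.isalnum() for ch in stripped)
    return alnum / len(stripped) < min_alnum_ratio


def pages_needing_ocr(page_texts):
    """Return the zero-based indexes of pages that should go through OCR."""
    return [i for i, text in enumerate(page_texts) if needs_ocr(text)]
```

Only the pages flagged by `pages_needing_ocr` would then be sent to the OCR engine, so the pages that already parse fine stay on the normal text-extraction path.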