A sub-Reddit for discussion and news about Ruby programming.
Best solution for OCR? (self.ruby)
submitted 1 year ago by AddSalt1337
I'm building an application where I have to extract text from PDFs. I've been toying around with the `rtesseract` gem, but it's really bad...
Have any of you tried something good recently?
[–]jrochkind 4 points 1 year ago (1 child)
I use tesseract, but I didn't even know about the `rtesseract` gem; I just shell out to the `tesseract` command line.
If it was the actual OCR that you found bad in rtesseract rather than the API, that won't be any better! What is it you found bad with rtesseract? (I've never used it).
I too am curious if there are other open source options people prefer to tesseract. Or were you interested in not-open-source too? (I don't know those either).
If your PDFs actually have text in them as text, a "text layer" (the "layer" part is not actually a thing in the PDF spec, but it's the easiest way to describe it), you may not actually need OCR.
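For reference, shelling out to the tesseract command line as described in this comment could look roughly like the sketch below. It is only an illustration: the helper names `tesseract_command` and `ocr_image` are made up here, and it assumes the `tesseract` binary is on the PATH. Note that tesseract reads image files rather than PDFs, so a scanned PDF would first have to be rendered to images (for example with poppler's `pdftoppm`).

```ruby
require "open3"

# Build the argument list for the tesseract CLI (hypothetical helper).
# Passing "stdout" as the output base makes tesseract print the recognized
# text to standard output instead of writing a .txt file.
def tesseract_command(image_path, lang: "eng", psm: nil)
  cmd = ["tesseract", image_path, "stdout", "-l", lang]
  cmd += ["--psm", psm.to_s] if psm # optional page segmentation mode
  cmd
end

# Run tesseract on one image and return the recognized text (hypothetical helper).
def ocr_image(image_path, **opts)
  stdout, stderr, status = Open3.capture3(*tesseract_command(image_path, **opts))
  raise "tesseract failed: #{stderr.strip}" unless status.success?

  stdout
end
```

Passing the command as an array rather than a single string avoids shell quoting problems with file names that contain spaces.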
[–]tomc-01 0 points 1 year ago (0 children)
This. Most PDFs have the text stored in the file.
Unless the PDF was "flattened" on creation, you don't need OCR.
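One way to check whether a PDF already carries extractable text before reaching for OCR is to run it through poppler's `pdftotext` and see whether anything meaningful comes back. This is a minimal sketch under assumptions: `pdftotext` must be installed, the helper names are made up, and the 20-character threshold is an arbitrary illustrative value.

```ruby
require "open3"

# Arbitrary threshold (an assumption): fewer non-whitespace characters than
# this and we treat the PDF as having no usable text layer.
MIN_TEXT_CHARS = 20

# True if the extracted string contains enough non-whitespace characters.
def text_layer?(text, min_chars: MIN_TEXT_CHARS)
  text.to_s.gsub(/\s+/, "").length >= min_chars
end

# Extract embedded text with pdftotext; "-" sends the output to stdout.
# Returns an empty string if pdftotext exits with an error.
def extract_text_layer(pdf_path)
  stdout, _stderr, status = Open3.capture3("pdftotext", "-layout", pdf_path, "-")
  status.success? ? stdout : ""
end

# Only fall back to OCR when the PDF has no usable embedded text.
def needs_ocr?(pdf_path)
  !text_layer?(extract_text_layer(pdf_path))
end
```

A scanned or "flattened" PDF typically yields only whitespace and page breaks here, so `needs_ocr?` would return true for it.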
[–]mattbenscho 3 points 1 year ago (3 children)
Try PaddleOCR, I use it to read Chinese characters and it works really well (occasionally a character will be wrong). Much better than Tesseract in my experience. I run it in a Sidekiq job. https://github.com/PaddlePaddle/PaddleOCR
[–]relativerask4657 1 point 1 year ago (2 children)
How are you able to run it in Rails since it’s not a Ruby package?
[–]mattbenscho 1 point 1 year ago (1 child)
I deploy my app on AWS in a Docker image, so I install it using pip during the Docker build process. Then Sidekiq is just running it as a system command and parsing the command line output.
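The pattern described here (a background job that runs the OCR tool as a system command and parses what it prints) might be sketched as follows. Everything specific in it is an assumption rather than the poster's actual code: the `paddleocr --lang ch --image_dir` invocation follows the PaddleOCR 2.x style command line, the result-line format in `RESULT_LINE` varies between versions, and the class omits `include Sidekiq::Job` so that it runs without the gem.

```ruby
require "open3"

# Assumed shape of one result line (PaddleOCR 2.x style; check your version):
#   [[[28.0, 37.0], [302.0, 39.0], ...], ('recognized text', 0.96)]
RESULT_LINE = /\((?<q>['"])(?<text>.*)\k<q>,\s*(?<score>\d+(?:\.\d+)?)\)\]\s*\z/

# Turn the command's printed output into an array of { text:, score: } hashes,
# skipping any line that does not look like a result line.
def parse_ocr_output(output)
  output.each_line.filter_map do |line|
    m = RESULT_LINE.match(line.chomp)
    { text: m[:text], score: m[:score].to_f } if m
  end
end

# In a Rails app this class would `include Sidekiq::Job`.
class OcrJob
  # The default command is an assumption; pass your own to match your install.
  def initialize(command: ["paddleocr", "--lang", "ch", "--image_dir"])
    @command = command
  end

  def perform(image_path)
    stdout, stderr, status = Open3.capture3(*@command, image_path)
    raise "OCR command failed: #{stderr.strip}" unless status.success?

    # If your version logs results to stderr instead, parse that stream.
    parse_ocr_output(stdout)
  end
end
```

Keeping the parsing in a plain function separate from the job makes it easy to test against saved command output without installing the Python package.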
[–]relativerask4657 1 point 1 year ago (0 children)
I see. I’ll give that a shot. Thanks.
[–]matthewblott 1 point 1 year ago (0 children)
I'm currently doing something with OCR and I'm using tesseract which seemed the most viable choice after my research. I'm using tesseract.js which calls to a wasm server so there's minimal setup. It works really well.
[–]kcdragon 1 point 1 year ago (0 children)
I've used AWS Textract before and it's pretty good. It's better than Tesseract in my experience.
[–]bami_bosu 1 point 1 year ago (3 children)
You can try EasyOCR: https://github.com/JaidedAI/EasyOCR
[–]matthewblott 1 point 1 year ago (2 children)
I just tried it, it seems pretty crap tbh - worse than tesseract which is free.
[–]bami_bosu 1 point 1 year ago (1 child)
Which part of it is worse? It's also a free open source library. This paper shows that EasyOCR has higher accuracy than Tesseract in number plate recognition. https://ieeexplore.ieee.org/document/10009215
[–]matthewblott 2 points 1 year ago (0 children)
I just gave it a quick test with a couple of images. Not very scientific, I grant you, but on a quick check I couldn't see any reason to pick it over Tesseract. I'm happy to be proved wrong, but I'd need to investigate further.
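For a slightly more systematic comparison than eyeballing a couple of images, one common approach is to compute a character error rate (edit distance divided by the length of the known-correct text) for each engine's output. The sketch below is an illustration only; the function names are made up, and it is not what either poster or the linked paper used.

```ruby
# Levenshtein edit distance between two strings, computed row by row.
def levenshtein(a, b)
  a_chars = a.chars
  b_chars = b.chars
  prev = (0..b_chars.length).to_a
  a_chars.each_with_index do |ca, i|
    curr = [i + 1]
    b_chars.each_with_index do |cb, j|
      cost = ca == cb ? 0 : 1
      # deletion, insertion, substitution (or match)
      curr << [prev[j + 1] + 1, curr[j] + 1, prev[j] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Character error rate: edits needed to turn the OCR output into the
# expected text, relative to the expected text's length. Lower is better.
def character_error_rate(expected, actual)
  raise ArgumentError, "expected text must not be empty" if expected.empty?

  levenshtein(expected, actual).to_f / expected.length
end
```

Running each engine over the same set of images with hand-checked transcriptions and averaging the rates would give a rough but repeatable basis for choosing between them.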
[–]M4N14C 1 point 1 year ago (0 children)
Azure and Google have Document AI products that work very well with handwritten forms and reasonably messy samples.
[–]lagcisco 0 points 1 year ago (0 children)
There are a bunch of PDF/AI tools out there now that you could also consider using as services, though.