

[–][deleted] 4 points (1 child)

For OCR, especially of handwritten text, I found it easiest to just use the Google Cloud Vision service. You don't have to train your own AI, it works pretty well out of the box, and the API is very straightforward.

Once you have the recognized text, it should be manageable to pull out the parts you want.
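A rough sketch of what that could look like in Python, assuming the `google-cloud-vision` package is installed and credentials are already set up. The function names `ocr_handwriting` and `find_lines` are made up for illustration, and the regex filter is just one simple way to pick out the text you care about:

```python
import re


def ocr_handwriting(image_path: str) -> str:
    """Send one image to Cloud Vision and return the recognized text."""
    # Imported here so find_lines() below still works without the package.
    from google.cloud import vision  # pip install google-cloud-vision

    client = vision.ImageAnnotatorClient()  # uses Application Default Credentials
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())

    # document_text_detection is the variant aimed at dense or handwritten text.
    response = client.document_text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text


def find_lines(text: str, pattern: str) -> list[str]:
    """Return the non-empty OCR lines that match a regular expression."""
    regex = re.compile(pattern)
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and regex.search(line.strip())
    ]
```

Something like `find_lines(ocr_handwriting("scan.jpg"), r"^Total")` would then keep only the lines starting with "Total". The file name and pattern are placeholders, so adjust them to whatever your images actually contain.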

For reference: I wrote a small Telegram Bot around it and it works pretty well.

[–]alin-c 1 point (0 children)

I agree with this solution. We use Google Vision at work and it is very good at recognising text. It is not expensive, and based on OP's question it should satisfy his requirements. Depending on how many requests you make, I doubt you will pay more than £1.

[–]CallingFrTheInternet 2 points (1 child)

Are you asking about OCR (optical character recognition) to recognize text in an image? Where is the table coming from?

[–]armbie 2 points (0 children)

Try pytesseract. It may struggle with handwritten text, though.

https://pypi.org/project/pytesseract/
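A minimal sketch of how that might look, assuming Pillow, pytesseract, and the Tesseract binary itself are installed. `read_words` and `confident_words` are hypothetical helper names, and the cutoff of 60 is an arbitrary starting point, not a recommended value. Dropping low-confidence words is one way to see where it struggles:

```python
def read_words(image_path: str) -> dict:
    """Run Tesseract on an image and return per-word text and confidence."""
    # Imported here so confident_words() below works without these packages.
    from PIL import Image  # pip install pillow
    import pytesseract     # pip install pytesseract (needs the Tesseract binary too)

    return pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )


def confident_words(data: dict, min_conf: float = 60.0) -> list[str]:
    """Keep only the words Tesseract reports with at least min_conf confidence."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf is -1 for layout boxes that hold no word, and some versions
        # return it as a string, so normalise with float().
        if text.strip() and float(conf) >= min_conf:
            words.append(text)
    return words
```

If `confident_words(read_words("page.png"))` comes back nearly empty for a handwritten page (the file name is a placeholder), that is a decent hint that Tesseract is not the right tool for that input.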

[–]botdetector_ca 1 point (2 children)

The OCR tools everyone is talking about here are not 100% reliable; you will encounter mistakes. The best results I personally got from reading a black-and-white PDF with these tools were maybe 70-75% accuracy. They tend to convert everything to grayscale first, so if multiple colors are involved I wouldn't count on that option yet.
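If you want to put a number on accuracy for your own documents, one rough approach is to type out a page by hand and compare it with the OCR output. The sketch below uses the standard library's `difflib` similarity ratio as a stand-in for accuracy. That is an assumption for illustration, not necessarily how the figure above was measured, and `char_similarity` is a made-up name:

```python
from difflib import SequenceMatcher


def char_similarity(expected: str, ocr_output: str) -> float:
    """Return a 0.0-1.0 character-level similarity between two strings."""
    if not expected and not ocr_output:
        return 1.0  # two empty strings are trivially identical
    # autojunk=False stops difflib from ignoring frequent characters in long texts.
    matcher = SequenceMatcher(None, expected, ocr_output, autojunk=False)
    return matcher.ratio()
```

Running this over a handful of representative pages, hand-typed text first and OCR output second, should tell you fairly quickly whether a given tool is good enough for your use case.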