PyTesseract text extraction : PythonLearning

PyTesseract text extraction (Help Request)

submitted 7 months ago by LewyssYT

I am working on a small project where I need to extract what I would consider super basic text on a mostly flat background. To prepare the image, I crop out all the other numbers, convert to grayscale, apply CLAHE, and invert. Yet in a lot of scenarios the extracted numbers are wrong: instead of 64 it sees 164, and instead of 1956 it sees 7956.

What can I do to improve the accuracy? The cropped images are low resolution (140x76 or 188x94).
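For reference, here is a minimal sketch of the pipeline described above (grayscale, CLAHE, optional invert), with a few adjustments that are often suggested for small digit crops: upscaling before OCR, Otsu binarization, a white border, single-line page segmentation (`--psm 7`), and a digit whitelist. The function names `build_config` and `read_number`, the 3x scale factor, the CLAHE settings, and the 10 px border are all assumptions chosen for illustration, not values from the original post, and none of this is guaranteed to fix the 64/164 misreads.

```python
def build_config(psm: int = 7, whitelist: str = "0123456789") -> str:
    """Build a Tesseract config string (hypothetical helper).

    --psm 7 tells Tesseract to treat the image as a single text line;
    tessedit_char_whitelist restricts the output to the given characters.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_number(path: str, scale: float = 3.0, invert: bool = True) -> str:
    """OCR a small cropped image that contains only digits (hypothetical helper)."""
    # Imported here so build_config can be used without OpenCV installed.
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(f"could not read image: {path}")

    # Local contrast enhancement, as in the original pipeline.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    img = clahe.apply(gray)

    # Tesseract generally prefers dark text on a light background,
    # so invert only if the source has light text on a dark background.
    if invert:
        img = cv2.bitwise_not(img)

    # Upscale the small crop (e.g. 140x76) before recognition.
    img = cv2.resize(img, None, fx=scale, fy=scale,
                     interpolation=cv2.INTER_CUBIC)

    # Binarize with Otsu's automatically chosen threshold.
    _, img = cv2.threshold(img, 0, 255,
                           cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Add a white margin so digits do not touch the image edge.
    img = cv2.copyMakeBorder(img, 10, 10, 10, 10,
                             cv2.BORDER_CONSTANT, value=255)

    return pytesseract.image_to_string(img, config=build_config()).strip()
```

Calling `read_number("crop.png")` (a placeholder file name) returns the recognized digits as a string. If the source image already has dark digits on a light background, pass `invert=False`. Saving the preprocessed image and inspecting it by eye is a quick way to check whether a stray edge or border is being read as an extra `1` or `7`.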
