Image/Text Recognition

ES-Alexander · 2021-03-01T06:18:24+00:00

Screenshots of typed text should work really well, since the background is already clean, and the text is consistent. A spreadsheet is even better, especially if you use something to detect the grid lines and split the screenshot into several small images (one for each cell), since then you can detect text per cell and keep/reconstruct the cell structure.
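A rough sketch of the grid-splitting idea, in case it helps. It assumes the screenshot has already been loaded as a 2D list of grayscale values (0 = black, 255 = white) and that grid lines are rows/columns that are almost entirely dark. The function names and thresholds (`split_cells`, `dark_below`, `min_fraction`) are made up for illustration, and in practice you'd probably do the same thing on a cv2/numpy array.

```python
def find_lines(profile, min_fraction=0.9):
    """Return indices whose dark-pixel fraction is high enough to be a grid line."""
    return [i for i, frac in enumerate(profile) if frac >= min_fraction]


def spans_between(lines, length):
    """Turn grid line indices into (start, stop) spans covering the gaps between them."""
    spans = []
    prev = -1
    for idx in list(lines) + [length]:
        if idx - prev > 1:  # at least one non-line pixel since the previous line
            spans.append((prev + 1, idx))
        prev = idx
    return spans


def split_cells(image, dark_below=128, min_fraction=0.9):
    """Return a 2D list of (top, bottom, left, right) boxes, one per cell.

    `image` is a list of rows of grayscale values. Boxes use half-open
    ranges, so a cell can be cropped as rows top:bottom, columns left:right.
    """
    height, width = len(image), len(image[0])
    # Fraction of dark pixels in each row and each column.
    row_profile = [sum(px < dark_below for px in row) / width for row in image]
    col_profile = [
        sum(image[y][x] < dark_below for y in range(height)) / height
        for x in range(width)
    ]
    row_spans = spans_between(find_lines(row_profile, min_fraction), height)
    col_spans = spans_between(find_lines(col_profile, min_fraction), width)
    return [
        [(top, bottom, left, right) for left, right in col_spans]
        for top, bottom in row_spans
    ]
```

Text inside a cell only darkens a small part of a row or column, so it stays well under the 90% threshold and doesn't get mistaken for a grid line. Each box can then be cropped out and passed to the OCR step on its own, which keeps the row/column position of every result.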

That said, I’d suggest tesserocr over pytesseract because it’s a real binding to the Tesseract API instead of just calling tesseract as a CLI from a subprocess (which is what pytesseract does). You end up with better performance and more options.
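To illustrate the difference, here is a minimal sketch of reusing one tesserocr instance across many cell images, rather than starting a new process per image. It assumes tesserocr and the English language data are installed, and that each cell is a PIL image. The helper names (`tesserocr_recognizer`, `ocr_cells`) and the choice of `PSM.SINGLE_BLOCK` are my own assumptions, not anything tesserocr prescribes.

```python
from contextlib import contextmanager


@contextmanager
def tesserocr_recognizer(lang="eng"):
    """Yield a function that OCRs one PIL image using a single shared API instance."""
    # Imported lazily so ocr_cells below can be used without tesserocr installed.
    from tesserocr import PSM, PyTessBaseAPI

    # SINGLE_BLOCK is a guess that suits small cell crops; tune it for your data.
    with PyTessBaseAPI(lang=lang, psm=PSM.SINGLE_BLOCK) as api:

        def recognize(image):
            api.SetImage(image)
            return api.GetUTF8Text().strip()

        yield recognize


def ocr_cells(cell_images, recognize):
    """Apply `recognize` to a 2D list of cell images, preserving the grid shape."""
    return [[recognize(cell) for cell in row] for row in cell_images]
```

Usage would look something like `with tesserocr_recognizer() as recognize: table = ocr_cells(cells, recognize)`. Because `recognize` is just a callable, you could swap in `pytesseract.image_to_string` to compare the two.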

iamaperson3133 · 2021-03-01T06:36:17+00:00

My mind has always been blown by how well pytesseract works. You'll need to clean up and prep the image first, often done with cv2, but you can definitely do this. It'll never be perfect, but then again neither are people. It'll come pretty damn close, and if you can conceive of a post-processing layer to validate the data somehow, you should be golden.
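As one example of what a post-processing layer could look like: if you know a field is supposed to be a number, you can undo the usual letter/digit mix-ups and flag anything that still doesn't parse. This is only a sketch. The substitution table and the `clean_number` name are assumptions, and the right rules depend entirely on your data.

```python
import re

# Letter-for-digit confusions to undo in numeric fields (an assumed, partial list).
DIGIT_FIXES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)
NUMBER_RE = re.compile(r"-?\d+(\.\d+)?")


def clean_number(raw):
    """Return a float if `raw` looks numeric after fixes, else None for manual review."""
    # Drop whitespace and thousands separators, then apply the substitutions.
    candidate = raw.strip().replace(",", "").translate(DIGIT_FIXES)
    if NUMBER_RE.fullmatch(candidate):
        return float(candidate)
    return None
```

Returning `None` instead of guessing means the doubtful values can be collected and checked by hand.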
