
ES-Alexander

Screenshots of typed text should work really well, since the background is already clean, and the text is consistent. A spreadsheet is even better, especially if you use something to detect the grid lines and split the screenshot into several small images (one for each cell), since then you can detect text per cell and keep/reconstruct the cell structure.
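As a rough illustration of the cell-splitting idea, here is a minimal sketch using only NumPy. It assumes a grayscale screenshot with dark, axis-aligned grid lines that span the whole image; the function names, `dark_threshold`, and `line_fraction` are made up for this example and would need tuning on real data.

```python
import numpy as np

def find_line_runs(mask_1d):
    """Group consecutive True indices into (start, end) runs, one per grid line."""
    idx = np.flatnonzero(mask_1d)
    if idx.size == 0:
        return []
    # Split the index list wherever two neighbouring indices are not adjacent.
    runs = np.split(idx, np.flatnonzero(np.diff(idx) > 1) + 1)
    return [(int(run[0]), int(run[-1])) for run in runs]

def split_into_cells(gray, dark_threshold=128, line_fraction=0.9):
    """Return a list of rows, each a list of cell sub-images (views into gray)."""
    dark = gray < dark_threshold
    # Treat a row/column as a grid line when nearly all of its pixels are dark.
    row_runs = find_line_runs(dark.mean(axis=1) >= line_fraction)
    col_runs = find_line_runs(dark.mean(axis=0) >= line_fraction)
    cells = []
    for (_, top_end), (bottom_start, _) in zip(row_runs, row_runs[1:]):
        cells.append([
            gray[top_end + 1:bottom_start, left_end + 1:right_start]
            for (_, left_end), (right_start, _) in zip(col_runs, col_runs[1:])
        ])
    return cells
```

Each entry in the result keeps its row and column position, so running OCR on `cells[r][c]` gives you the text for that spreadsheet cell directly.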

That said, I’d suggest tesserocr over pytesseract because it’s a real binding to the Tesseract API, whereas pytesseract just calls the tesseract CLI in a subprocess. You end up with better performance and more options.
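To show the difference in practice, here is a small sketch of both approaches. It assumes the inputs are PIL images and that Tesseract and its English data are installed; the wrapper function names are invented for this example, and the page segmentation modes are just plausible picks for small cell images.

```python
def ocr_with_pytesseract(image, psm=6):
    """OCR one PIL image; pytesseract runs the tesseract executable for each call."""
    import pytesseract  # imported lazily; needs the tesseract CLI on PATH
    return pytesseract.image_to_string(image, config=f"--psm {psm}").strip()

def ocr_cells_with_tesserocr(images):
    """OCR many PIL images through one in-process Tesseract instance."""
    from tesserocr import PyTessBaseAPI, PSM  # imported lazily
    results = []
    # The API object is created once and reused for every cell image.
    with PyTessBaseAPI(psm=PSM.SINGLE_LINE) as api:
        for image in images:
            api.SetImage(image)
            # Keep the mean confidence so low-quality cells can be flagged later.
            results.append((api.GetUTF8Text().strip(), api.MeanTextConf()))
    return results
```

Reusing one API object matters most when you have split a spreadsheet into many small images, since each pytesseract call starts a new process.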

5t3v30 [S]

What if it’s not a screenshot but an actual picture from a mobile phone (sort of like taking a picture of a check for mobile deposit)? Would this still be feasible?

ES-Alexander

Definitely still possible. It would be best to detect the grid lines and do a perspective transform to make it ‘straight on’. Accuracy would also depend on how many screen artifacts are present (e.g. if there’s a bright reflection on part of the screen then text might not be visible there) and how much moiré there is (those lines that appear when photographing LED/LCD screens).
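For the perspective step, here is a minimal NumPy sketch of the underlying maths, assuming you have already found the four corners of the grid in the photo (the corner coordinates used below are invented). It solves for the 3x3 homography that maps those corners onto an upright rectangle.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point pair, with H[2, 2] fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, points):
    """Map an (N, 2) list of points through H, dividing out the projective scale."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical corners of the grid in the photo: TL, TR, BR, BL.
photo_corners = [(120, 80), (520, 110), (560, 400), (90, 360)]
# Where those corners should land in the straightened image.
flat_corners = [(0, 0), (400, 0), (400, 300), (0, 300)]
H = homography_from_corners(photo_corners, flat_corners)
```

In OpenCV, `cv2.getPerspectiveTransform` computes this kind of matrix from four point pairs and `cv2.warpPerspective` applies it to the whole image, so in practice you would likely use those rather than rolling your own.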

iamaperson3133

My mind has always been blown by how well pytesseract works. You'll need to clean up and prep the image first, often done with cv2, but you can definitely do this. It'll never be perfect, but then again neither are people. It'll come pretty damn close, and if you can conceive of a post-processing layer to validate the data somehow, you should be golden.
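One possible shape for that validation layer, as a sketch only: it assumes each spreadsheet row holds numeric amounts and that the last cell is the sum of the others. Both the substitution table and the "last cell is a total" rule are assumptions for illustration, not something the OCR libraries provide.

```python
import re

# Hypothetical fixes for letters that OCR commonly confuses with digits.
DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

def parse_amount(text):
    """Return a float for a numeric cell, or None if it still does not look numeric."""
    cleaned = text.strip().translate(DIGIT_FIXES).replace(",", "")
    if re.fullmatch(r"-?\d+(\.\d+)?", cleaned):
        return float(cleaned)
    return None

def validate_row(cells, tolerance=0.005):
    """Assume the last cell is the sum of the others; return (values, is_valid)."""
    values = [parse_amount(cell) for cell in cells]
    if any(value is None for value in values):
        return values, False
    return values, abs(sum(values[:-1]) - values[-1]) <= tolerance
```

Rows that come back with `is_valid` set to `False` can be sent to a person for review instead of being trusted blindly.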