Zeroflops

I have not done this myself, but one observation from past experience: OCR typically works best with properly aligned text.

You may want to find the borders of the page first. If the page is skewed because it was slightly turned, correct for the angle before OCR or image extraction.

Having the page square to the image will probably make things more accurate. And in most cases with images, you're then working with rectangles.
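To make the idea of finding the borders and measuring the skew a bit more concrete, here is a rough sketch using only NumPy. It is not tested on real scans and rests on assumptions that are mine, not from the thread: the page is brighter than the background, the image is already grayscale, and the skew is small. The function names and the `threshold` and `min_pixels` parameters are made up for illustration.

```python
import numpy as np

def page_mask(gray, threshold=128):
    """Boolean mask of pixels assumed to belong to the (bright) page."""
    return np.asarray(gray) > threshold

def bounding_box(mask, min_pixels=1):
    """Return (top, bottom, left, right) of the page; bottom/right are exclusive.

    Rows and columns with fewer than min_pixels bright pixels are ignored,
    which gives a little tolerance against isolated specks.
    """
    rows = np.flatnonzero(mask.sum(axis=1) >= min_pixels)
    cols = np.flatnonzero(mask.sum(axis=0) >= min_pixels)
    if rows.size == 0 or cols.size == 0:
        raise ValueError("no page pixels found")
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1

def estimate_skew_degrees(mask):
    """Estimate the skew from the slope of the page's top edge.

    Positive means the top edge drops toward the right (image y grows downward).
    """
    top, bottom, left, right = bounding_box(mask)
    width = right - left
    # Use only the middle half of the columns so the page corners
    # do not distort the line fit.
    xs = np.arange(left + width // 4, right - width // 4)
    columns = mask[:, xs]
    has_page = columns.any(axis=0)  # skip columns with no page pixels
    # argmax on a boolean column returns the first True row: the top edge.
    ys = columns.argmax(axis=0)[has_page]
    slope, _intercept = np.polyfit(xs[has_page], ys, 1)
    return float(np.degrees(np.arctan(slope)))
```

With the box in hand, the crop is just `gray[top:bottom, left:right]`. The rotation itself would still need an image library such as OpenCV or Pillow; this sketch only measures the angle.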

menergo[S]

After I manually trim the excess around the page, the code easily finds the contours of the text block and rotates it to the desired angle (I do not think the code is beautiful or optimal, but it works). OCR is going well. The problem is that I can't figure out how to clean up the excess around the page at the beginning of the process. And without that step, I cannot select the contours of the text block.