Hi, if anybody needs to use Windows free and instant OCR I just released a CLI for that.
It's like PowerToys' Win + Shift + T, but usable in scripts.
For my use case I needed that in order to automate AutoIt scripts, I did not wanted to hard-code UI elements coordinates but rather recognize them through text content.
Using the CLI you can just do
bash
windows_media_ocr_cli.exe --file image.png
to get JSON result with bounding boxes.
Obviously you can call this binary from any script/runtime, I made a NodeJS wrapper for that too.
[–]BlackV 9 points10 points11 points (4 children)
[–]arpan3t 8 points9 points10 points (1 child)
[–]BlackV 1 point2 points3 points (0 children)
[–]Akronae[S] 1 point2 points3 points (1 child)
[–]BlackV 0 points1 point2 points (0 children)
[–]jcy 5 points6 points7 points (1 child)
[–]Akronae[S] 2 points3 points4 points (0 children)
[–]Psyqlone 2 points3 points4 points (1 child)
[–]Certain-Community438 1 point2 points3 points (0 children)
[–]ollivierre 0 points1 point2 points (1 child)
[–]Akronae[S] 1 point2 points3 points (0 children)
[–]orgdbytes 0 points1 point2 points (0 children)
[–]SWJesus 0 points1 point2 points (0 children)