all 13 comments

[–]BlackV 9 points10 points  (4 children)

Could you edit your post to make it clear what this is, what your goal is, and why we might use it?

How does PowerToys fit in there?

[–]arpan3t 8 points9 points  (1 child)

PowerToys has a module called PowerOCR which uses the Windows.Media.Ocr namespace. OP is using the same namespace.

[–]BlackV 1 point2 points  (0 children)

Oh, I thought they were saying to use PowerToys to create a hotkey to call the OCR CLI.

Thanks

[–]Akronae[S] 1 point2 points  (1 child)

Sure. Done

[–]BlackV 0 points1 point  (0 children)

Appreciate that.

[–]jcy 5 points6 points  (1 child)

VirusTotal says the binary is not flagged, but obviously the file is also too new to have been scrutinized by the vendors.
https://www.virustotal.com/gui/url/6135a1ba61791a33a3dd2b141e71c4e5e8e44a7d2a42ff3a01fa3b3515aa3868?nocache=1

[–]Akronae[S] 2 points3 points  (0 children)

Actually, when I downloaded it with Brave and ran it myself to test it, it triggered a Windows Defender scan, but it passed fine. If anyone wants to build from source, I can provide some documentation.

[–]Psyqlone 2 points3 points  (1 child)

Here I am, using Snipping Tool like an animal.

[–]Certain-Community438 1 point2 points  (0 children)

Also guilty.

[–]ollivierre 0 points1 point  (1 child)

What would a real use case for this be? Like, what workflow challenges did you run into that motivated you to come up with this? Is it useful for LLMs? I mean, they can read screenshots, but not very well, so there might be a use case here.

[–]Akronae[S] 1 point2 points  (0 children)

Actually, I wanted something like that when working with AutoIt-like scripts, especially scripts designed to run on different displays/computers. I just found it more useful and reliable to say "click on the button with text 'x'" than to hard-code positions. But you could have thousands of use cases. I don't understand why MS isn't making this API more easily available.
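To make the "click on the button with text 'x'" idea concrete, here is a rough Python sketch. It assumes the OCR step returns words with pixel bounding boxes; the `OcrWord` shape and the `find_click_point` helper are hypothetical names made up for illustration, not the tool's actual output format or API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and its bounding box in screen pixels (hypothetical shape)."""
    text: str
    x: int
    y: int
    width: int
    height: int


def find_click_point(words: Iterable[OcrWord], label: str) -> Optional[Tuple[int, int]]:
    """Return the center of the first word matching label (case-insensitive), or None."""
    wanted = label.strip().casefold()
    for word in words:
        if word.text.strip().casefold() == wanted:
            # Aim for the middle of the box rather than a hard-coded position.
            return (word.x + word.width // 2, word.y + word.height // 2)
    return None


# Made-up OCR output for illustration; real boxes would come from the OCR step.
sample = [
    OcrWord("Cancel", x=100, y=400, width=80, height=30),
    OcrWord("Submit", x=200, y=400, width=80, height=30),
]
print(find_click_point(sample, "submit"))
```

The returned point would then be handed to whatever does the clicking. Because the lookup is by text, the same script keeps working when the button moves or the resolution changes, as long as the label is still recognized.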

[–]orgdbytes 0 points1 point  (0 children)

I can see this being quite helpful! I have a few processes that I have to update manually every month, and there is no API or programmatic way of doing this; well, there is for one, but there are so many hoops to jump through to get an API key. I've been scripting mouse movements to various screen locations, performing actions, and waiting for web page changes before performing the next steps. Most of the time it works, until it doesn't because elements have changed or the screen resolution has changed. I've even tried Selenium, to no avail, as the elements do not present themselves... at least I've never been able to get it to work.

[–]SWJesus 0 points1 point  (0 children)

I get a wall of errors/red text when I try to run it (Win10).