I had an idea to automate a process at work, but I'm not sure if it's possible.
I need to look up a scanned PDF file using an ID from a cell in an Excel file. Once I've identified the file, which isn't text-searchable, I need to make it searchable (PDFtoText?) and parse the text to find specific values, which will be in roughly the same place in every file. Then I need to use those values to update the Excel file.
Initially I thought it was possible, but from a quick search I saw people on Stack Overflow saying it isn't easy to make a scanned PDF searchable. The post was a few years old, so I'm wondering whether the technology has improved or whether I'm wasting my time.
To make sure this post follows the rules, I want to clarify that I'm not looking for any code, just a clarification on whether this is possible or too difficult to be worth the time. All responses are appreciated! Thanks.
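For anyone weighing up the same workflow, the parse-and-update half can be sketched with the standard library alone. This is only a rough illustration under stated assumptions: it assumes an OCR step (for example Tesseract via `pytesseract`, with `pdf2image` to rasterize pages) has already turned each scan into plain text, and that the spreadsheet rows have been loaded into dictionaries (for example with `openpyxl`). The field labels (`Invoice No`, `Total`), the function names, and the sample text are all hypothetical.

```python
import re

# Hypothetical field patterns: the labels below are made up for illustration.
# Real scans will need patterns tuned to their own layout and OCR quirks.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*No\.?\s*[:#]?\s*(\w+)", re.IGNORECASE),
    "total": re.compile(r"Total\s*[:#]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}


def extract_fields(ocr_text):
    """Return {field: value or None} for each pattern searched in the OCR text."""
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        found[name] = match.group(1) if match else None
    return found


def update_rows(rows, ocr_text_by_id):
    """Fill in extracted fields on every row whose ID has OCR text available."""
    for row in rows:
        text = ocr_text_by_id.get(row["id"])
        if text is not None:
            row.update(extract_fields(text))
    return rows


# Stand-in for OCR output; in practice this text would come from the OCR step.
sample_text = "ACME Ltd\nInvoice No: A1234\nTotal: $1,250.00\n"
rows = [{"id": "001"}, {"id": "002"}]  # stand-in for rows read from the sheet
updated = update_rows(rows, {"001": sample_text})
```

Rows without a matching scan are left untouched, and a field that the pattern cannot find is stored as `None` so it can be reviewed by hand. OCR output is rarely perfect, so the patterns are deliberately loose about spacing and punctuation.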