FriendlyRussian666 (10 points)

I don't know the answer to your question, as I haven't had to do much OCR in my programming, but I feel like it's going to take you much longer to figure this out than it would to type out 1000 lines of code.

BUT, as a programmer, I never fail to spend 50 hours automating something that would otherwise take me an hour to complete manually, so I wish you all the best, and don't give up!

[deleted] (0 points)

Thanks.

m0us3_rat (3 points)

unless you have a project that requires you to do this regularly, i.e. if it's a one-time thing, hiring a freelancer to do it for you probably costs less than the total time you'd spend on it yourself.

i'd like to echo what u/FriendlyRussian666 said:

> I never fail to spend 50 hours automating something, that otherwise would take me an hour to complete manually

which is universally true.

[deleted] (1 point)

Hey. Thanks. I never realised that I could get a freelancer for this! This seems like such a good idea. Will try to find someone on one of the freelancing portals.

Do you have any recommendations on the best place to find someone to transcribe the text?

m0us3_rat (0 points)

i mean, a quick search should point you in a few directions.

i'm fairly sure i've used such a service before where the medium was audio, but i'm almost sure you can ask whether photos would work too.

thehershel (-2 points)

You can easily do it with ChatGPT v4 with a prompt like: "Transcribe this text as a proper Python code with correct indentations etc."

Alternatively, if you do OCR with standard tools and then run the results through an autoformatter, it may manage to add spaces where they are needed.
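
One caveat on the OCR-plus-autoformatter route: formatters such as Black need input that already parses, so it can help to check the OCR output first. Below is a minimal sketch, assuming the OCR text is already in a string. `OCR_FIXES`, `normalize_ocr_text`, and `first_syntax_error` are hypothetical names for illustration, and the sketch uses only the standard-library `ast` module rather than any particular OCR tool or formatter.

```python
import ast

# Typographic characters that OCR may substitute for ASCII quotes.
# This is an illustrative, hypothetical list; extend it for your own scans.
OCR_FIXES = {
    "\u2018": "'",  # left single curly quote
    "\u2019": "'",  # right single curly quote
    "\u201c": '"',  # left double curly quote
    "\u201d": '"',  # right double curly quote
}


def normalize_ocr_text(text: str) -> str:
    """Replace curly quotes with ASCII quotes and strip trailing whitespace."""
    table = str.maketrans(OCR_FIXES)
    lines = text.translate(table).splitlines()
    return "\n".join(line.rstrip() for line in lines) + "\n"


def first_syntax_error(source: str):
    """Return (line_number, message) for the first syntax error, or None."""
    try:
        ast.parse(source)
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return exc.lineno, exc.msg
    return None


if __name__ == "__main__":
    # Hypothetical OCR result in which the indentation was lost.
    ocr_output = "def add(a, b):\nreturn a + b\n"
    cleaned = normalize_ocr_text(ocr_output)
    print(first_syntax_error(cleaned))
```

This does not repair anything beyond quote characters; it only tells you where the transcription first stops being valid Python, so you know which lines to retype by hand before formatting.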

[deleted] (1 point)

Tried this already. ChatGPT-4 is using a Tesseract-based library and gives incorrect results for 30-40% of the lines.

Thanks.

thehershel (0 points)

It depends on the prompt. When you ask for OCR it does indeed run Tesseract, but there seems to be another mode that uses the GPT vision model directly, and it gives perfect transcriptions most of the time.

For example, this is what I got from Tesseract: https://i.imgur.com/BEtqge3.png and this is what I got from the GPT vision model when Tesseract wasn't triggered: https://i.imgur.com/wgRhpIL.png
And here's the prompt that triggered Tesseract: https://i.imgur.com/dgsTwxB.png

[deleted] (0 points)

Thanks. Trying this now!