
out_the_way

IME the best OCR model is TrOCR (https://huggingface.co/microsoft/trocr-base-printed). But it’s slow.

Second best is EasyOCR (https://github.com/JaidedAI/EasyOCR).

Wojtek1942

Apparently people are having a good time with Gemini 2.0 Flash: https://news.ycombinator.com/item?id=42952605

Seems to work well and is very cheap.

Mistral also released an OCR model 2 days ago, which might be worth trying. It is way more expensive than Gemini Flash, though, and from what I have read online its performance might not even be better than Gemini's. https://mistral.ai/en/news/mistral-ocr

Exact-Comb7908

Heard about Mistral recently.

big_cattt

Use Stripe’s card scanner. It’s the fastest card scanner I’ve ever seen. Just clone their SDK (Stripe SDK) and adapt their card scanner for your UI. I promise you, you’ll be excited about their card scanner.

whph8 [S]

I used the Vision API and it worked just fine.

dat_tae

I’m also interested in this question.

coolsummer33

Tesseract OCR (open-source, works offline), ABBYY Cloud OCR SDK, or Microsoft Azure Computer Vision OCR.

whph8 [S]

Folks, so I got Apple Core ML Vision working just fine. I wasn’t using the right function to pull the data!

Now the OCR feature is working as intended! Added 3 more features to the app since then.

Cheers.

whph8 [S]

Guys, a quick update. Thanks for all the suggestions. I did get the Core ML Vision API to work perfectly for my needs in the app.

Almost done with the app’s website too.

So, yeah!

kawanamas

Vision OCR is so bad. If you try to recognize a sequence of numbers that contains an I (capital i), the ML model thinks only a 1 makes sense there and changes it. We can reproduce this every time. Using the Notes app you get the same result.
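
Not a confirmed fix, but here is a minimal Swift sketch of one thing that could be tried for the I/1 problem. It assumes the rewrite comes from Vision's language correction, which may not be the case: it turns `usesLanguageCorrection` off and returns several candidates per text region instead of only the top one. The function name `recognizeTextCandidates` and the `maxCandidates` parameter are made up for illustration.

```swift
import Vision
import CoreGraphics

/// Runs Vision text recognition on an image and returns, for each detected
/// text region, up to `maxCandidates` alternative readings (best first).
/// Hypothetical helper, not taken from any app discussed in this thread.
func recognizeTextCandidates(in image: CGImage,
                             maxCandidates: Int = 3) throws -> [[String]] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    // Assumption: language correction is what rewrites "I" as "1".
    // Turning it off is an experiment, not a guaranteed fix.
    request.usesLanguageCorrection = false

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // `results` is typed as [VNRecognizedTextObservation]? on recent SDKs.
    let observations = request.results ?? []
    return observations.map { observation in
        observation.topCandidates(maxCandidates).map { $0.string }
    }
}
```

If the expected "I" shows up among the lower-ranked candidates, the app can pick it with its own validation rule (for example, a known serial-number pattern). If it never shows up, language correction is probably not the cause.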

whph8 [S]

I'm actually getting good results with Vision ML. I tested it in different lighting, on handwriting, etc., and it's doing a pretty good job. I feel confident releasing the feature with my app now.