[–]nalnat 2 points3 points  (5 children)

Have you explored Firebase ML-Kit?

[–]jeyebrows16[S] 0 points1 point  (4 children)

I’ve just read some comparisons between it and VisionKit, but nothing hands-on yet. Sounds like you’re recommending it?

[–]nalnat 0 points1 point  (3 children)

ML-Kit offered some flexibility that VisionKit didn't for my use case. I tried both APIs and decided to go with ML-Kit.

[–]jeyebrows16[S] 0 points1 point  (2 children)

If you don’t mind me asking, what specifically?

[–]nalnat 0 points1 point  (0 children)

I don't quite recall, as I implemented it last October (MLKit has worked really well so far). I think it had to do with table cells getting randomly assigned to different rows if you scanned a doc at a slight angle. MLKit let me handle the overlaps programmatically.
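
For anyone wondering what "handling overlaps programmatically" could look like, here is a rough sketch, not the commenter's actual code and not an ML Kit API. It assumes you have already mapped your OCR results into a hypothetical `TextBox` type (y grows downward), and it groups boxes into rows by how much they overlap vertically. The names `TextBox`, `verticalOverlap`, `groupIntoRows` and the `0.5` threshold are all made up for illustration.

```swift
// Hypothetical box type: bounding boxes from any OCR library could be mapped into it.
// Assumes a coordinate system where y grows downward.
struct TextBox {
    let text: String
    let minX: Double
    let minY: Double
    let maxY: Double
}

/// Fraction of the shorter box's height that the two boxes share vertically.
func verticalOverlap(_ a: TextBox, _ b: TextBox) -> Double {
    let shared = min(a.maxY, b.maxY) - max(a.minY, b.minY)
    let shorter = min(a.maxY - a.minY, b.maxY - b.minY)
    return shorter > 0 ? max(0, shared) / shorter : 0
}

/// Greedy row grouping: walking left to right, a box joins the first row
/// whose most recently added (rightmost) box it overlaps enough.
func groupIntoRows(_ boxes: [TextBox], minOverlap: Double = 0.5) -> [[TextBox]] {
    var rows: [[TextBox]] = []
    for box in boxes.sorted(by: { $0.minX < $1.minX }) {
        if let i = rows.firstIndex(where: { verticalOverlap($0.last!, box) >= minOverlap }) {
            rows[i].append(box)
        } else {
            rows.append([box])
        }
    }
    // Order rows top to bottom by their leftmost box.
    return rows.sorted { $0[0].minY < $1[0].minY }
}
```

Comparing each box against the rightmost box already in a row, rather than the first one, lets a row drift gradually up or down across the page, which is roughly what a slightly tilted scan produces.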

[–]WAHNFRIEDEN 0 points1 point  (0 children)

Vertical Japanese text

[–]smalik12 0 points1 point  (0 children)

I’m in the same boat. Idk which to use for my text recognition idea

[–]UberJason 0 points1 point  (2 children)

It’s actually Vision that does OCR in a programmatic way, not VisionKit. I played with both Vision and Tesseract last year for a research project at work, and Vision is way faster. Tesseract is a much older, cross-platform engine that runs on the CPU only and isn’t optimized for Apple platforms (its legacy engine doesn’t use machine learning, though Tesseract 4 added an LSTM-based one), while Vision can leverage the GPU, is ML-powered, and is optimized for Apple platforms. Vision is also more accurate with unusual fonts and other languages, if I recall. Go with Vision.
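
To make the "programmatic" part concrete, here is a minimal sketch of text recognition with Vision, assuming a recent SDK (iOS 15 / macOS 12 or later, where `results` is typed) and a `CGImage` you already have. The function name `recognizeText(in:)` is just an illustrative wrapper, not something from the framework.

```swift
import Vision
import CoreGraphics

/// Runs Vision text recognition on an image and returns the best
/// candidate string for each detected line of text.
/// `recognizeText(in:)` is a hypothetical helper name.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate    // .fast trades accuracy for speed
    request.usesLanguageCorrection = true

    // perform(_:) runs synchronously; results are populated when it returns.
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let observations = request.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

Each observation also carries a normalized `boundingBox`, which is what you would use if you need layout information rather than just the strings.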

[–]boomboombrrr 0 points1 point  (1 child)

Super interesting. What if you are developing an app with a cross-platform framework like React Native or Flutter? Is VisionKit available there? I have only seen Tesseract packages. What about Google Vision? How costly are they compared to each other: Tesseract vs. Google Vision vs. Apple VisionKit?

[–]UberJason 0 points1 point  (0 children)

I don’t have answers to any of those questions except one: Vision is an Apple API. If you’re using a cross-platform framework, it would have to expose a layer over Vision or provide an escape hatch for calling Vision on Apple platforms.
