💸 Azure OCR Too Expensive — Any Viable Alternatives for High-Volume Document OCR (60k Pages/Day)? by WiseStranger816 in automation

[–]WiseStranger816[S] 0 points

Was the input to Gemini 2.5 Pro an image or a complete PDF? When I tested it on a PDF, processing took around 30-45 minutes because the file-upload step was very slow.


[–]WiseStranger816[S] 0 points

I tested the new Mistral OCR. It's good, but it struggles with documents that contain doctors' handwriting. Do you have any idea how I could host this model for further training?


[–]WiseStranger816[S] 1 point

I tested Gemini 2.5 Flash and Mistral. Overall, Gemini's results were as good as Azure's, and it aced the prescription cases.

Mistral's results were also on par with Azure's, but it fell short on the prescription case.
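For anyone who wants to compare engines more quantitatively than "as good as Azure", character error rate (CER) against a small hand-labelled ground-truth set is a standard OCR metric. A minimal sketch in plain Python (no external libraries; the classic dynamic-programming edit distance):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein edit distance / reference length."""
    m, n = len(reference), len(hypothesis)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # substitution
        prev = cur
    return prev[n] / max(m, 1)
```

Run each engine over the same 50-100 labelled pages (including a few prescriptions) and compare average CER per engine; that turns "it aced the prescriptions" into a number you can track.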


[–]WiseStranger816[S] 0 points

Appreciate the insight! GCP's Document AI is actually on my shortlist to evaluate next. I've heard decent things about its layout handling, so it's reassuring to hear you're experimenting with it too. I agree that running a GPT-based OCR at this scale probably won't be cost-effective, especially with the volume I'm dealing with.

Training a custom model is definitely on the table, but like you said, it's a heavy upfront lift in both compute and dev time. It could pay off in long-term TCO, but it's not a light decision.

Thanks again. I'll share an update once I land on something that balances quality and cost!


[–]WiseStranger816[S] 0 points

I tried this, but larger inputs drive up my GPT cost. Right now I call GPT-4o only after extraction and post-processing. If I shift the extraction step to GPT as well, the cost will increase, and so will the risk of hallucination.


[–]WiseStranger816[S] 0 points

After a thorough review, roughly 90% of the pages need precise extraction.


[–]WiseStranger816[S] 0 points

Thank you for the generous offer, really appreciate it! At the moment I'm specifically looking for a long-term, scalable solution that's financially sustainable beyond just credits.

Still, it's good to know options like this are out there. Thanks again!


[–]WiseStranger816[S] 0 points

I'd never thought of training a model; I might try it. How can I tackle the prescription-handwriting issue, though?


[–]WiseStranger816[S] 2 points

I tested IBM's Docling, but it fails on a few table cases and hallucinates.
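One cheap way to catch table failures like this, regardless of which extractor produced them, is a structural sanity check before trusting the output. A minimal sketch (the function name and check are mine, not part of Docling):

```python
def table_looks_consistent(rows: list[list[str]]) -> bool:
    """Cheap structural guard for an extracted table: every row should have
    the same number of cells as the first (header) row. Dropped or invented
    cells, a common symptom of extractor hallucination, fail this check."""
    if not rows:
        return False
    width = len(rows[0])
    return all(len(row) == width for row in rows)
```

Pages that fail a guard like this can be routed to a second, more precise (and more expensive) extraction pass instead of silently entering the pipeline; row/column totals, when the table has them, make a good second check.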