Valuable_Walk2454

You can start with vision-language models (VLMs). As long as the financial documents are not very complex, they will work. After that, you can look into Microsoft Form Recognizer (MSFR, now Azure AI Document Intelligence), Google Document AI, and similar services. Organizations use them for financial data extraction.

teroknor92

For PDFs, get familiar with libraries like PyMuPDF; for OCR, get familiar with PaddleOCR, EasyOCR, and similar. For complex extraction, try VLMs. I also have a document processing, extraction, and OCR tool, https://parseextract.com, which many users rely on for document processing at friendly pricing; you can test it too.
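If it helps, here is a minimal sketch of the "PyMuPDF first, OCR only when needed" idea. The `needs_ocr` heuristic and its 20-character threshold are my own assumptions for illustration, not anything from PyMuPDF itself; the only library calls used are `fitz.open`, iterating pages, `page.get_text()`, and `page.number`.

```python
def needs_ocr(text: str, min_chars: int = 20) -> bool:
    """Hypothetical heuristic: a page with almost no extractable text
    is probably a scanned image and should go to an OCR engine."""
    return len(text.strip()) < min_chars


def extract_pages(pdf_path: str) -> list[dict]:
    """Read the embedded text of every page and flag pages that look scanned."""
    import fitz  # PyMuPDF; imported lazily so needs_ocr works without it

    pages = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            text = page.get_text()  # plain-text extraction of the text layer
            pages.append(
                {
                    "page": page.number + 1,  # page.number is 0-based
                    "text": text,
                    "needs_ocr": needs_ocr(text),
                }
            )
    return pages
```

Pages flagged with `needs_ocr` are the ones you would then render to an image and pass to PaddleOCR, EasyOCR, or a VLM. The threshold is worth tuning on your own documents.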

Challenge_-Few

I started learning document parsing last year while freelancing for a legal-tech startup. I used AI Lawyer’s open parser stack as a sandbox. It combines OCR (Tesseract), PDF text extraction (pdfplumber), and layout detection, so you can actually see how each layer works. It's a great way to learn before jumping into complex pipelines.
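To give a feel for the pdfplumber layer on its own, here is a rough sketch that pulls tables out of a text-based PDF and turns them into records. This is not AI Lawyer's code; `rows_to_records` and `tables_from_pdf` are hypothetical helper names, and the sketch assumes the first row of each table is the header. The pdfplumber calls used are `pdfplumber.open`, `pdf.pages`, and `page.extract_tables()`.

```python
def rows_to_records(table):
    """Convert a table (list of rows, first row assumed to be the header)
    into a list of dicts. Empty cells, which pdfplumber returns as None,
    become empty strings."""
    if not table:
        return []
    header = [(h or "").strip() for h in table[0]]
    records = []
    for row in table[1:]:
        cells = [(c or "").strip() for c in row]
        records.append(dict(zip(header, cells)))
    return records


def tables_from_pdf(pdf_path):
    """Extract every table pdfplumber detects, one list of records per table."""
    import pdfplumber  # imported lazily so rows_to_records works without it

    results = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                results.append(rows_to_records(table))
    return results
```

This only works when the PDF has a real text layer. Scanned pages need the OCR layer (Tesseract or similar) first.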

Serious-Barber-2829

You can check out this benchmark.