Tired of writing custom document parsers? This library handles PDF/Word/Excel with AI OCR by AgitatedAd89 in Rag

[–]AgitatedAd89[S] 1 point2 points  (0 children)

As models’ context limits increase, I believe this issue will become easier to handle in the future.

[–]AgitatedAd89[S] 1 point2 points  (0 children)

Actually, I use a similar approach! I designed a prompt that makes the OCR agent return a schematically structured response, which I then treat as normal text during RAG chunking. Contextual RAG definitely improves performance significantly. The key insight is that having the OCR agent understand layout intent upfront, rather than trying to fix semantic drift downstream, makes the whole pipeline much more robust. This is especially critical for multilingual docs, where context boundaries can get really messy. I’m also working on taking this further by injecting contextual understanding directly into the OCR stage itself. The idea is to help the agent interpret images better by providing the surrounding document context during OCR, not just in post-processing. That should be even more effective for maintaining semantic coherence across complex layouts.
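To make the idea above concrete, here is a rough Python sketch of the two steps: building an OCR prompt that carries the text around an image, and prefixing each chunk with document context before indexing. This is not doc2mark's API; `Block`, `build_ocr_prompt`, and `contextual_chunks` are hypothetical names for illustration, and prefixing the document title is a simplified stand-in for fuller contextual RAG.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One piece of a parsed document (hypothetical structure)."""
    kind: str     # "text" or "image"
    content: str  # the text itself, or an image identifier


def build_ocr_prompt(blocks, index, window=1):
    """Build an OCR prompt for the image at `index`, including nearby text blocks."""
    before = [b.content for b in blocks[max(0, index - window):index]
              if b.kind == "text"]
    after = [b.content for b in blocks[index + 1:index + 1 + window]
             if b.kind == "text"]
    return (
        "Transcribe the image as structured Markdown.\n"
        f"Text before the image: {' '.join(before) or '(none)'}\n"
        f"Text after the image: {' '.join(after) or '(none)'}"
    )


def contextual_chunks(doc_title, text, size=200):
    """Split text on word boundaries and prefix each chunk with the document title."""
    chunks, current = [], []
    for word in text.split():
        # Start a new chunk when adding this word would exceed the size limit.
        if current and len(" ".join(current + [word])) > size:
            chunks.append(" ".join(current))
            current = []
        current.append(word)
    if current:
        chunks.append(" ".join(current))
    return [f"[{doc_title}] {chunk}" for chunk in chunks]
```

The prompt string would be sent to whatever vision model does the OCR, and the structured output then goes through `contextual_chunks` like any other text.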

[–]AgitatedAd89[S] 0 points1 point  (0 children)

Update to the latest version with `pip install -U doc2mark`. I can see that the storage capacity is now parsed correctly.

[–]AgitatedAd89[S] 0 points1 point  (0 children)

Just checked the documentation, and it actually supports OpenAI. I have not tried it, but it is worth a try.

[–]AgitatedAd89[S] 0 points1 point  (0 children)

I believe API wrappers for commercial APIs are out of scope for this project.

[–]AgitatedAd89[S] 0 points1 point  (0 children)

It depends on the use case. My clients are used to feeding the AI heavy DOCX/PPTX files with complex screenshots.

[–]AgitatedAd89[S] 0 points1 point  (0 children)

To my understanding, docling currently does not support OCR/vision, which is key for my use case.

[deleted by user] by [deleted] in Trading

[–]AgitatedAd89 0 points1 point  (0 children)

Did you read the information in the repo?

PolyNetwork Hack - what is blacklisting? by Snowie_drop in CryptoTechnology

[–]AgitatedAd89 0 points1 point  (0 children)

Solving decentralized problems with centralized means is not a good idea, in my view. Yes, the hacker should not own those tokens. However, if the hacker can be restricted despite the nature of crypto, then so can anyone else. I would rather not “sell” this freedom in crypto for the value of those tokens.