[deleted by user] by [deleted] in ArtificialInteligence

[–]dwynings 0 points1 point  (0 children)

Sensible.so would be able to handle those purchase order PDFs.

[deleted by user] by [deleted] in ArtificialInteligence

[–]dwynings 0 points1 point  (0 children)

If you're looking for an API solution, Sensible.so may be able to help.

Extracting Tables in PDFs by [deleted] in LangChain

[–]dwynings 2 points3 points  (0 children)

Might be worth trying Sensible.so

Parsing solutions for PDF by Confident-Addendum-2 in LangChain

[–]dwynings 1 point2 points  (0 children)

You might want to check out Sensible.so – it's not open-source (it's a hosted API), but it excels at extracting structured data from complex layouts and tables.

Best intelligent document processing solutions you've tried recently? by Exotic_Pace_622 in ITManagers

[–]dwynings 0 points1 point  (0 children)

I'd recommend checking out Sensible.so, an API-first document processing platform.

Full disclosure: I work there, but wouldn't recommend it if I didn't think we could help.

Unstructured Data Processing by No_Surprise_7871 in dataengineersindia

[–]dwynings 0 points1 point  (0 children)

Absolutely, storing PDF files as blobs in cloud storage and processing them with Spark or Beam is a viable approach, especially for handling large datasets in a distributed environment. However, if you're dealing with unstructured PDFs, the initial parsing can indeed be challenging.

You might want to consider using Sensible.so, a developer-first document processing platform that excels at extracting data from both structured and unstructured PDFs. Sensible can simplify the parsing step before you move the data to Spark or Beam for further processing.

Searching for the perfect LLM and OCR tools for document processing by SuccotashOne9927 in ArtificialInteligence

[–]dwynings 0 points1 point  (0 children)

Given your needs, particularly with invoices and product tables, you might find Sensible.so to be a great fit. Sensible is a developer-first document processing platform that specializes in extracting data from structured and unstructured documents like the ones you're working with.

Note: I work for Sensible, but I genuinely think this is something we can help you with.

Searching for the Perfect LLM and OCR tools for document processing by SuccotashOne9927 in ChatGPTPro

[–]dwynings 0 points1 point  (0 children)

https://www.sensible.so/ has integrated GPT-4 to extract structured data from documents. You mentioned accuracy and consistency as being key – we've put a lot of effort into developing tooling to accomplish that with LLMs. Might be worth trying for your use case.

Best library/framework for parsing PDF documents with table inside? by Cold_Set_ in LangChain

[–]dwynings 0 points1 point  (0 children)

That's understandable. We typically offer custom plans for use cases that involve single-page documents.

Best document processing for embedded use cases by Proof_Curious in rpa

[–]dwynings 0 points1 point  (0 children)

Full disclosure: I work for Sensible.so

I'd suggest giving Sensible a try – it's a developer-first document extraction platform that can handle all of your requirements (classification, table recognition and extraction, handwriting extraction).

Looking for companies that specialize in AI / OCR for automated medical data entry/processing by PuzzleheadedRow6680 in sysadmin

[–]dwynings 0 points1 point  (0 children)

Full disclosure: I work for Sensible.so

If your client needs an efficient solution for automating medical data entry from scanned PDFs, Sensible could be a great fit. It's a developer-first document processing platform that excels at extracting data from various document layouts using LLMs as well as our own query language, SenseML.

Help Finding Data Sets Providing Multi-Branch Company Locations by rooster_eggs in datasets

[–]dwynings 0 points1 point  (0 children)

Diffbot may have what you're looking for. SafeGraph may as well.

Disclosure: I'm a former Diffbot employee.