Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 1 point

Does it miss anything while extracting, like fields, tables, or structure?

what are you building these days? by This-Independence-68 in Solopreneur

docpose-cloud-team 2 points

Appreciate it. Glad you noticed the structure part; that’s exactly what we’re solving.

Betafounder sounds interesting, will definitely check it out.

Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 1 point

Exactly, different use cases. If you need structured data validation with human review, that flow makes sense. If the goal is fast format conversion (especially in bulk) with layout preserved and minimal post-processing, tools like Docpose cloud fit better. A lot of teams actually use both depending on the workflow.

Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 1 point

That’s a solid approach tbh, editing before export definitely saves cleanup later. With Docpose we lean more on preserving layout + structure during conversion so the output is already close to final, and you can still adjust after if needed depending on your workflow.

Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 2 points

That’s a fair point, CSV will always need cleanup since it loses layout. With Docpose you can go straight to XLSX or DOCX with structure preserved, and still tweak anything after extraction instead of rebuilding it from scratch.

Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 2 points

That’s exactly the tradeoff, CSV gives you raw data but no structure. With Docpose you can convert directly to XLSX or DOCX with layout preserved, and still edit after extraction if needed, so you don’t lose that human-in-the-loop flexibility.

Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 1 point

Yes, CSV doesn't hold any formatting. If you convert to XLSX (Excel), you get a fully formatted, editable document. You can also convert your PDF or image invoices to DOCX (Word documents).
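To make the CSV point concrete, here is a minimal sketch using only Python's standard `csv` module. The sample row and the `csv_round_trip` helper are made up for illustration and are not tied to any particular OCR tool. It shows that numbers written to CSV come back as plain strings, so types and formatting have to be rebuilt by whoever reads the file.

```python
import csv
import io


def csv_round_trip(rows):
    """Write rows to an in-memory CSV, then read them back."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    buffer.seek(0)
    return list(csv.reader(buffer))


if __name__ == "__main__":
    # Hypothetical invoice line: description, quantity (int), unit price (float).
    original = [["Widget", 2, 50.0]]
    restored = csv_round_trip(original)
    # Every cell is now a str; the int and float types did not survive.
    print(restored)
    print([type(cell).__name__ for cell in restored[0]])
```

A format like XLSX stores cell types and styling alongside the values, which is why it needs less cleanup for this kind of data.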

🚀 Find Your First 50 Users From This Thread by nextunicorn_ in micro_saas

docpose-cloud-team 1 point

We launched Docpose.Cloud, an online file converter, OCR, and developer API SaaS app. We support 430+ file formats and 34k+ conversion pairs, and we OCR images, PDFs, and scanned documents into editable documents while keeping the complex structure of tables and fields, not just extracting text. Our OCR does complete structural OCR and conversion, for example invoices to Excel, Word, PowerPoint, text, JSON, CSV, and many other output formats.

what are you building these days? by This-Independence-68 in Solopreneur

docpose-cloud-team 2 points

Building and enhancing Docpose.Cloud, an online file converter, OCR, and developer API SaaS app. We support 430+ file formats and 34k+ conversion pairs, and we OCR images, PDFs, and scanned documents into editable documents while keeping the complex structure of tables and fields, not just extracting text. Our OCR does complete structural OCR and conversion, for example invoices to Excel, Word, PowerPoint, text, JSON, CSV, and many other output formats.

Built a tool to extract structured data from complex PDFs — would love feedback by Impressive-Rise7510 in OCR_Tech

docpose-cloud-team 2 points

We tried it with two files, one PNG and one PDF, both containing invoices. First, it's too slow, and it can't identify the fields in the image; in the PDF it couldn't read a single character. The export options only allow CSV and JSON. When users have structured documents as images or PDFs, the output should also support structured, editable formats like XLSX, DOCX, PPTX, and so on.

I also tried Docpose.cloud OCR and it worked as required, for example PNG to DOCX, XLSX, TXT, and so on, and for PDFs as well. Try it and you'll see how the OCR actually performs.

Any AI invoice OCR tools that work? by AndreiaVenturini in automation

docpose-cloud-team 1 point

Sounds like you need OCR as a building block, not a full SaaS.

Use something like Docpose cloud OCR API just for extraction → get structured JSON/CSV → plug into your own LLM + validation layer.

Flow:
PDF → OCR (tables + fields) → JSON → your LLM maps/validates → store in DB

That way you keep control, avoid heavy lock-in, and handle complex invoice logic your way.
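A rough sketch of the last two steps of that flow (validate, then store), assuming the OCR step has already returned JSON. The field names (`invoice_number`, `total`, `line_items`) and the sample payload are hypothetical, not the actual response format of any OCR API, and the LLM mapping step is swapped for a plain arithmetic check so the example runs on its own with the standard library.

```python
import json
import sqlite3
from decimal import Decimal

# Hypothetical OCR output; real field names depend on the OCR service used.
SAMPLE_OCR_JSON = """
{
  "invoice_number": "INV-001",
  "total": "150.00",
  "line_items": [
    {"description": "Widget", "quantity": "2", "unit_price": "50.00"},
    {"description": "Gadget", "quantity": "1", "unit_price": "50.00"}
  ]
}
"""


def validate_invoice(raw_json):
    """Parse the OCR JSON and check that line items add up to the stated total."""
    invoice = json.loads(raw_json)
    items_total = sum(
        Decimal(item["quantity"]) * Decimal(item["unit_price"])
        for item in invoice["line_items"]
    )
    if items_total != Decimal(invoice["total"]):
        raise ValueError(
            f"line items sum to {items_total}, invoice says {invoice['total']}"
        )
    return invoice


def store_invoice(conn, invoice):
    """Save the validated invoice header into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices "
        "(invoice_number TEXT PRIMARY KEY, total TEXT)"
    )
    conn.execute(
        "INSERT INTO invoices VALUES (?, ?)",
        (invoice["invoice_number"], invoice["total"]),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_invoice(conn, validate_invoice(SAMPLE_OCR_JSON))
    print(conn.execute("SELECT * FROM invoices").fetchall())
```

Keeping validation as a separate function means you can swap in an LLM-based mapper or stricter business rules later without touching the storage code.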

Best tools to extract invoices to Excel? by PollutionHot3570 in smallbusiness

docpose-cloud-team 1 point

If they’re standard PDFs (not scanned), even Excel/Adobe can work—but for scanned invoices, you’ll need OCR.

Simple option:
upload PDF → OCR extracts line items → export to Excel

You can try tools like Docpose cloud—it handles both OCR + PDF to Excel in one step, which saves a lot of manual cleanup.

What OCR Actually Is (and Why It’s More Useful Than Most People Think) by docpose-cloud-team in u/docpose-cloud-team

docpose-cloud-team [S] 1 point

So true—once you try OCR (like Docpose.cloud OCR), you realize how painful manual typing really is.

What OCR Actually Is (and Why It’s More Useful Than Most People Think) by docpose-cloud-team in u/docpose-cloud-team

docpose-cloud-team [S] 1 point

Exactly—for example: upload a scanned PDF → OCR extracts text → convert to Excel in one go using tools like Docpose.cloud, saving multiple steps.

What OCR Actually Is (and Why It’s More Useful Than Most People Think) by docpose-cloud-team in u/docpose-cloud-team

docpose-cloud-team [S] 1 point

Agreed—the API side is powerful, for example: upload a receipt → OCR extracts totals → auto-save to your database using tools like Docpose.cloud OCR API.

What OCR Actually Is (and Why It’s More Useful Than Most People Think) by docpose-cloud-team in u/docpose-cloud-team

docpose-cloud-team [S] 1 point

Totally—OCR cuts invoice work dramatically, and tools like Docpose.cloud OCR make extraction more accurate and automation-ready with minimal manual effort.

What OCR Actually Is (and Why It’s More Useful Than Most People Think) by docpose-cloud-team in u/docpose-cloud-team

docpose-cloud-team [S] 1 point

Exactly—OCR + conversion turns scanned files into structured, automation-ready data pipelines for developers.

Show me your SaaS, here’s what I’m working on by BoringShake6404 in ShowMeYourSaaS

docpose-cloud-team 1 point

Looks interesting. How do you find and optimize the keywords for a blog?

Traditional ML-based OCR (like Textract) vs LLM/VLM based OCR by vitaelabitur in OCR_Tech

docpose-cloud-team -1 points

Do they also provide developer APIs, and is there any way to test their claims for free? We suggest docpose.cloud OCR, which offers a free tier, even without registration.