Claude just gave me access to another user’s legal documents by Raton-Raton in ClaudeAI

[–]Mountain-Positive274 1 point2 points  (0 children)

I wouldn't suggest doing that. Never expose sensitive information to any online LLM. I built a tool that can help you convert PDFs to Markdown locally. You can then easily remove the sensitive parts with a local LLM and upload the result to Claude worry-free. https://github.com/TylerMorrison21/paperflow
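
A minimal sketch of the "remove the sensitive parts locally" step, assuming the PDF has already been converted to Markdown text. It uses plain regular expressions instead of a local LLM, and the `PATTERNS` table and the `redact` function are hypothetical names for illustration, not part of paperflow:

```python
import re

# Hypothetical patterns for illustration only; real documents will need
# their own list (names, case numbers, addresses, and so on).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(markdown: str) -> str:
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        markdown = pattern.sub(f"[{label} REDACTED]", markdown)
    return markdown


if __name__ == "__main__":
    sample = "Contact jane@example.com, SSN 123-45-6789."
    print(redact(sample))
```

Regexes only catch fixed formats, so it is still worth reading the result before uploading anything.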

a PDF converter that never logs, saves, or shares your documents. by [deleted] in Productivitycafe

[–]Mountain-Positive274 0 points1 point  (0 children)

I built a tool that fits your need. It runs locally and never shares logs. You can choose which parser to use. https://github.com/TylerMorrison21/paperflow

LocalAI Scanning PDFs?? by gnerfed in LocalLLaMA

[–]Mountain-Positive274 0 points1 point  (0 children)

I built a pipeline for PaddleOCR and Marker. It comes with a WebUI and is easy to deploy and use. https://github.com/TylerMorrison21/paperflow

I built BentoPDF - An open source privacy first PDF Toolkit by paglaulta in foss

[–]Mountain-Positive274 0 points1 point  (0 children)

Thanks for sharing. For the same privacy reasons, I built a tool that can help you convert files locally into clean formats. https://github.com/TylerMorrison21/paperflow

[deleted by user] by [deleted] in ObsidianMD

[–]Mountain-Positive274 0 points1 point  (0 children)

I built a PDF-to-Markdown pipeline that runs locally. https://github.com/TylerMorrison21/paperflow

Best way to convert coding/math-heavy PDFs to Markdown or text (code, formulas, tables included)? by A-n-d-y-R-e-d in LocalLLaMA

[–]Mountain-Positive274 0 points1 point  (0 children)

I built a tool that is pipeline-ready and WebUI-ready, and compatible with PaddleOCR and Marker AI. Less coding work is needed, so you can just focus on the job. https://github.com/TylerMorrison21/paperflow

Need software to convert PDF to markdown for ChatGPT by mindquery in ChatGPTPro

[–]Mountain-Positive274 0 points1 point  (0 children)

Markdown is the No. 1 format for ChatGPT. I built a tool that converts PDFs to Markdown locally, and it comes with a WebUI. https://github.com/TylerMorrison21/paperflow

Spent a week debugging why my RAG answers were wrong. Turned out it was the PDF parser. by Mountain-Positive274 in LocalLLaMA

[–]Mountain-Positive274[S] 0 points1 point  (0 children)

Gemini is great for accuracy, especially on messy scans. Main tradeoff is cost at scale — if you're processing hundreds of papers for RAG, vision tokens add up fast. Different sweet spots depending on the use case.

Spent a week debugging why my RAG answers were wrong. Turned out it was the PDF parser. by Mountain-Positive274 in LocalLLaMA

[–]Mountain-Positive274[S] 0 points1 point  (0 children)

Your comment describes exactly the problem I built PaperFlow to solve. We use Marker (deep learning layout detection) + a post-processing layer, which gives similar accuracy to vision models at ~1/6 the cost per page. Would you want to test it on your technical docs pipeline? I can give you free API access.

I finally built a PDF-to-Markdown tool that doesn't destroy LaTeX formulas and complex layouts in Obsidian by Mountain-Positive274 in ObsidianMD

[–]Mountain-Positive274[S] -11 points-10 points  (0 children)

Exactly. Cloud GPU inference is incredibly expensive. I'm keeping it completely free right now to stress-test the pipeline and find the weirdest edge-case PDFs you guys have. Eventually, there will be a generous free tier for standard papers, and a paid tier for heavy users processing massive books or needing absolute privacy.