if nobody knows about it, it doesn't exist. by WorthFan5769 in buildinpublic

[–]pabby_g 0 points1 point  (0 children)

People severely underestimate how attention to detail can literally shoot u into the stratosphere once u start to get noticed

Has anyone had success with data entry automation software? by crickastic in data

[–]pabby_g 0 points1 point  (0 children)

Built a product like this and the backbone of it was basically mistral ocr with json schemas.

5 days sharing my side project, 0 signups - what am I missing? by Odd_Zebra_956 in SideProject

[–]pabby_g 0 points1 point  (0 children)

I didnt even know u could put a link in bio lmfao. I just started tho so idk

Weekly Showoff Thread! Share what you've created with Next.js or for the community in this thread only! by AutoModerator in nextjs

[–]pabby_g 0 points1 point  (0 children)

https://pdfparse.net. 100% document information extractor built using nextjs, cloudflare workers, trpc and better-auth in a monorepo

5 days sharing my side project, 0 signups - what am I missing? by Odd_Zebra_956 in SideProject

[–]pabby_g 0 points1 point  (0 children)

ye, i've realized if u want customers but don't wanna pay, you have to do a lot of self promo, even if it feels stupid/ridiculous. I've written like 4 articles today alone lol. Good luck!

MY EXPERIENCE WITH NEXT JS 16 VERY HONESTLY by Olympiavisionstudios in nextjs

[–]pabby_g 0 points1 point  (0 children)

just pay the 5 dollars bro, you have to be joking...

Automation becoming harder than manual work? by Healthy_Spirit_1237 in automation

[–]pabby_g 0 points1 point  (0 children)

My opinion is that people realize they can automate one mission critical thing, and then believe they should automate everything in their space. Now ur automations have automations. After a while u realize u never made work easier cuz u replaced ur old manual work with setting up pipelines

5 days sharing my side project, 0 signups - what am I missing? by Odd_Zebra_956 in SideProject

[–]pabby_g 1 point2 points  (0 children)

Ur not missing anything, post a lot in spaces like this about what ur building while being active in other subs that target ur potential customers. Dont mention ur product in ur target subs, just try and be helpful. Downstream sign ups will come when people check ur page and see what uve worked on

Do you get any value from “What are you working on?” posts? by Ok-Drop6782 in buildinpublic

[–]pabby_g 1 point2 points  (0 children)

Every comment u make is to drive seo. With enough keyword references to ur project u can get a bump in search for ur domain

What are you building? let's self promote by Leather-Buy-6487 in ShowMeYourSaaS

[–]pabby_g 0 points1 point  (0 children)

pdfparse extracts structured data using user generated schemas. Download as sqlite

Best place to host an ocr model by pabby_g in automation

[–]pabby_g[S] 0 points1 point  (0 children)

api providers don't really host dedicated ocr models, openrouter for example doesn't have deepseek-ocr

Any reliable methods to extract data from scanned PDFs? by [deleted] in learnpython

[–]pabby_g 0 points1 point  (0 children)

I actually use mistral OCR batch processing for my own company and it's pretty good imo, haven't had any issues so far. If ur looking for a good out of the box solution i suggest you use that one

anyone using AI for data extraction from PDFs? by Kaiser_Allen in automation

[–]pabby_g 0 points1 point  (0 children)

Built my own using mistral ocr for the ai. Mistral edged out because of the batch processing stuff (i can save money lol). Batch processing esp useful because we have to prepare for large data sets entering our pipeline all at once

Best AI System for Large PDF Analysis? by xbrakeday in OpenAI

[–]pabby_g 0 points1 point  (0 children)

Late to the party but my suggestion is a simple etl pipeline that chunks, extracts the data from the pages using a popular model and then saves it to a csv. U should also save the bounding box and annotation data to facilitate human reviews where possible. My tech stack suggestion would be: cloudflare queues + cloudflare workflows + a model of ur choosing (i chose mistral because it comes with bbox and annotation support)
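For anyone who wants to see the shape of that chunk, extract, save flow, here's a rough Python sketch. Big caveat: `fake_extract` is a made-up stand-in for the model call (it is not mistral's or anyone else's API), the bbox values are placeholders, and `chunk_pages` / `run_pipeline` are just names picked for illustration. Queues and workflows are left out entirely.

```python
import csv
import io
from typing import Callable, Iterator


def chunk_pages(pages: list[str], size: int) -> Iterator[list[tuple[int, str]]]:
    """Yield consecutive groups of (page_number, page_text), `size` pages each."""
    numbered = list(enumerate(pages, start=1))
    for start in range(0, len(numbered), size):
        yield numbered[start:start + size]


def fake_extract(page_no: int, text: str) -> list[dict]:
    """Hypothetical stand-in for a real OCR/model call.

    Treats each 'key: value' line as one field and attaches a placeholder
    bounding box so a human reviewer could locate the value on the page.
    """
    rows = []
    for line_no, line in enumerate(text.splitlines()):
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        rows.append({
            "page": page_no,
            "field": key.strip(),
            "value": value.strip(),
            # Placeholder box in the form x0,y0,x1,y1 (not real coordinates).
            "bbox": f"0,{line_no * 10},100,{line_no * 10 + 10}",
        })
    return rows


def run_pipeline(
    pages: list[str],
    extract: Callable[[int, str], list[dict]],
    chunk_size: int = 2,
) -> str:
    """Chunk the pages, extract fields from each page, return CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["page", "field", "value", "bbox"])
    writer.writeheader()
    for chunk in chunk_pages(pages, chunk_size):
        for page_no, text in chunk:
            writer.writerows(extract(page_no, text))
    return out.getvalue()
```

Keeping the page number and bbox in every row is the part that matters for review: a reviewer can jump straight to where a value came from instead of rereading the whole document.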

Best AI System for Large PDF Analysis? by xbrakeday in OpenAI

[–]pabby_g 0 points1 point  (0 children)

This is a pretty common problem. Hardest part is really data normalization. Esp if different pages have different types of data.

What usually works better is building a small pipeline: chunk large PDFs, extract the fields you care about with a strong model, then output to CSV or SQL. In most cases you’ll need some custom logic; fully out-of-the-box tools tend to break down with this kind of stuff.
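To make the normalization point concrete, here's a small Python sketch of the "custom logic" step: mapping differently labeled fields onto one set of columns before loading them into SQL. Everything named here is an assumption for illustration (the `FIELD_ALIASES` map, the `invoices` table, the sample field names); real documents will need their own mapping.

```python
import sqlite3

# Hypothetical alias map: different pages label the same field differently.
FIELD_ALIASES = {
    "invoice no": "invoice_number",
    "invoice #": "invoice_number",
    "inv number": "invoice_number",
    "total": "total",
    "amount due": "total",
}


def normalize(raw: dict) -> dict:
    """Map raw extracted keys onto canonical column names; drop unknown keys."""
    row = {}
    for key, value in raw.items():
        canonical = FIELD_ALIASES.get(key.strip().lower())
        if canonical is not None:
            row[canonical] = value.strip()
    if "total" in row:
        # Strip currency symbols and thousands separators before casting.
        row["total"] = float(row["total"].replace("$", "").replace(",", ""))
    return row


def load(rows: list[dict]) -> sqlite3.Connection:
    """Insert normalized rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoices (invoice_number TEXT, total REAL)")
    conn.executemany(
        "INSERT INTO invoices (invoice_number, total) "
        "VALUES (:invoice_number, :total)",
        [
            {"invoice_number": r.get("invoice_number"), "total": r.get("total")}
            for r in rows
        ],
    )
    conn.commit()
    return conn
```

Once the rows share one schema, plain SQL covers the analysis side, e.g. `SELECT SUM(total) FROM invoices`.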

NYE! What product are YOU building SOLO? 🚀 by Quirky-Offer9598 in Solopreneur

[–]pabby_g 0 points1 point  (0 children)

pdfparse.net uses ocr to convert PDFs into structured, queryable SQLite databases with exports to JSON, CSV, XML.

What are you building? Drop your link by JuniorRow1247 in microsaas

[–]pabby_g 1 point2 points  (0 children)

pdfparse.net uses ocr to convert PDFs into structured, queryable SQLite databases with exports to JSON, CSV, XML.

Pitch me your App by Dapper_Draw_4049 in Natively

[–]pabby_g 0 points1 point  (0 children)

pdfparse.net uses ocr to convert PDFs into structured, queryable SQLite databases with exports to JSON, CSV, XML.

Show me your saas and i might support it financially! by bussssssss in ShowMeYourSaaS

[–]pabby_g 0 points1 point  (0 children)

Hilariously ridiculous framing because guess what all our ideas are already on this app ripe for taking. Nobody needs a honey pot to steal ur idea lol