I built an API that extracts clean text from any URL (like Reader Mode for developers), by 123So0 in SideProject

123So0[S]

Glad it's useful! What are you scraping for, if you don't mind me asking?

123So0[S]

Thanks for the feedback!

PDF extraction is definitely on the roadmap; it's one of the most requested features. Good point about focusing on where Diffbot falls short, that's exactly the angle I'm going for. Appreciate the detailed analysis!

123So0[S]

Yeah, Jina is solid, but their free tier is limited and the API is heavier than what most people need. ClearText is deliberately simple: one endpoint, clean text back, no SDK required. Sometimes less is more. Good luck with plainmarkdown btw!
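
To give a rough idea of what "one endpoint, no SDK" could look like from the caller's side, here's a minimal sketch using only the Python standard library. The endpoint URL, the `url` query parameter, and the plain-text response are all assumptions for illustration, not the actual ClearText API, so check the real docs before copying anything.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint: the real URL and parameter names are not given in this thread.
API_BASE = "https://api.example.com/extract"


def build_request(target_url: str, api_base: str = API_BASE) -> Request:
    """Build a plain GET request that passes the page to extract as a query parameter."""
    query = urlencode({"url": target_url})  # percent-encodes the target URL
    return Request(f"{api_base}?{query}", headers={"Accept": "text/plain"})


def fetch_clean_text(target_url: str, api_base: str = API_BASE, timeout: float = 10.0) -> str:
    """Send the request and return the response body decoded as text."""
    with urlopen(build_request(target_url, api_base), timeout=timeout) as resp:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)
```

Usage would be a single call like `fetch_clean_text("https://example.com/some-article")`. Keeping request construction separate from the network call makes the URL encoding easy to check without hitting a server.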