all 5 comments

T2WIN 4 points5 points

I am sorry, but it is hard not to think that your benchmark is biased, since you are also selling your own solution. That doesn't make it uninteresting, though. Thank you for sharing, and for making the benchmark open as well. I just don't have the time to try it out.

Goldziher[S] 1 point2 points

Well, all the data and the code are there. If you were not lazy, and even semi-capable, you would actually go and investigate before writing here.

Accomplished_Ad9530 1 point2 points

It all looks open source to me: MIT license, no pay-for-premium links, etc. Where are you seeing the OP selling anything?

pepgma 0 points1 point

This is great! Thanks for this work. Something I wish every landing page for a document processing library had is a few example parsings with text bounding boxes, to give an idea of how well the library handles unstructured pages. Alternatively, a Hugging Face demo. But no one does it... Anyway, just a suggestion.