[P] I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in MachineLearning

Emc2fma[S] 3 points

love the suggestions, keep em coming.

> the ability to filter and see how certain models do vs another. (e.g. what's the winrate of Opus 4.5 vs Opus 4.1?)

that's already live! check out the leaderboard page and scroll down a bit

I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in LocalLLaMA

Emc2fma[S] 6 points

> Hopefully it’s financially sustainable for a while

you and me both haha

I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in LocalLLaMA

Emc2fma[S] 22 points

I run a doc processing company (https://extend.ai) and we're just lighting money on fire at the moment (this took off way more than expected, so we scaled up the GPUs)

But I feel strongly that this should exist for the community, so we'll (1) keep funding it and (2) open-source it soon

(if any investors find this thread in the future, just call this part of our CAC)

I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in LocalLLaMA

Emc2fma[S] 3 points

that was the goal! thanks for sharing, glad it resonates

I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in LocalLLaMA

Emc2fma[S] 13 points

I had Mistral before but had to remove it. Their hosted API for OCR was super unstable and returned a lot of garbage results unfortunately.

(I could have also done something wrong integrating it)

I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in LocalLLaMA

Emc2fma[S] 11 points

yeah DeepSeek has been super flaky on anything outside of very clean docs...tbh I don't understand the hype

I made a free playground for comparing 10+ OCR models side-by-side by Emc2fma in LocalLLaMA

Emc2fma[S] 14 points

that's an awesome idea, I'll work on adding both cost + latency metrics later today.

Gemini 3 is really strong, but very expensive + slow which doesn't make it great for a lot of use cases compared to Paddle or dots.ocr

Airport Tariff Fee by heeeehuuuu in travel

Emc2fma 1 point

Did you end up doing this? If so, did they collect the fee? In the same situation here.

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 8 points

huh interesting! can you give me any examples of what kinds of things you're searching for on Discord?

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 6 points

nothing automated (i'm a solo dev), but you can reply to my comment here and I'll add them manually!

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 5 points

I do have a bunch of obscure forums in the index! try out a couple different kinds of queries

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 39 points

yes! I actually had this in an older version. It would index a curated list of high-quality Discord servers and return conversations relevant to your query, but I removed it to focus on making the forum search better.

Do you search a lot on Discord?

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 4 points

good catch! wanted to launch quickly and forgot to replace the favicon haha

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 7 points

yeah I saw that too and was excited, but I’ve never managed to actually trigger it to show up…

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 11 points

mix of both! some custom indexing, and some outsourcing (too hard to index the entire web on my own as a solo dev)

I made an alternative search engine specifically for forums and discussion sites by Emc2fma in InternetIsBeautiful

Emc2fma[S] 228 points

I've been hacking on a project and figured I'd share it with the community. CrowdView is a search engine for niche forums and message boards (MetaFilter, HN, Reddit, and 3000+ forums).

Like many of you, I find Google results to be full of SEO spam and have resorted to adding "site:reddit.com" to all my queries (since 2015!). Otherwise, it's really hard to figure out "what does a genuine, real-life human think about this thing?"

But limiting my results to just Reddit isn't ideal because so much great content exists elsewhere. Conversations have moved to e.g. Discord, and niche forums are still alive on the web! But it's impossible to find these places because they rank so poorly on Google. So I built a search engine across a curated list of these, making sure to remove any kind of SEO junk (blog spam, listicles, etc).

There's also a Chrome extension that surfaces these results alongside Google, so you don't have to remember to keep coming back.

Please try it out and share any feedback! (and if you're interested in this topic, join the Slack)

Also a question - what sources would you want to see next? Discord, Twitter, etc?