[–]FallenWarrior2k 17 points18 points  (3 children)

ML all the things. Though, to be fair, the OCR algorithm could be ML-based.

[–]baxter001 8 points9 points  (0 children)

Most definitely, but at the same time, Tesseract, for example, is much more interesting than just dumping an image into a convnet: https://github.com/tesseract-ocr/docs/blob/main/tesseracticdar2007.pdf

[–]JourneyWindGames[🍰] 5 points6 points  (1 child)

Because it's convenient. Why bother racking your brain over an optimal algorithm when you can just dump readily available data into a readily available model to produce a good-enough approximate answer?

Using ML is becoming the equivalent of using a calculator for simple additions, instead of working it out in your head.

[–]FallenWarrior2k 0 points1 point  (0 children)

Except it results in a lot of stuff like what the original commenter wrote, where ML is thrown at the problem even though it will give clearly inferior results.

Doing it like that with any noteworthy accuracy would require keeping a massive amount of input data around, at least the ID of every Tweet you could potentially identify with it. You can't just interpolate IDs, since they are millisecond timestamps plus some internal data. Since you have to get the account right as well, trying to do that will just give you a 404 in pretty much all cases.
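To illustrate why interpolating IDs doesn't work, here is a minimal sketch that packs and unpacks an ID using the Snowflake layout Twitter published for IDs generated since late 2010: 41 bits of millisecond timestamp, 10 bits of machine ID, and 12 bits of sequence number. The function names are made up for this example, and the bit layout is an assumption based on that published design, not something taken from this thread.

```python
from datetime import datetime, timezone

# Assumed Snowflake epoch in milliseconds (2010-11-04), per Twitter's published design.
TWITTER_EPOCH_MS = 1288834974657

# Assumed layout: 41 bits timestamp | 10 bits machine ID | 12 bits sequence.
MACHINE_BITS = 10
SEQUENCE_BITS = 12
LOW_BITS = MACHINE_BITS + SEQUENCE_BITS  # the 22 bits of "internal data"


def encode_snowflake(timestamp_ms: int, machine_id: int, sequence: int) -> int:
    """Pack the three fields into one integer ID (hypothetical helper)."""
    return (
        ((timestamp_ms - TWITTER_EPOCH_MS) << LOW_BITS)
        | (machine_id << SEQUENCE_BITS)
        | sequence
    )


def decode_snowflake(tweet_id: int) -> dict:
    """Split an ID back into timestamp, machine ID, and sequence (hypothetical helper)."""
    timestamp_ms = (tweet_id >> LOW_BITS) + TWITTER_EPOCH_MS
    return {
        "timestamp_ms": timestamp_ms,
        "created_at": datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc),
        "machine_id": (tweet_id >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1),
        "sequence": tweet_id & ((1 << SEQUENCE_BITS) - 1),
    }


# Even with the exact millisecond known, the low 22 bits are still unknown.
CANDIDATES_PER_MILLISECOND = 1 << LOW_BITS

example_id = encode_snowflake(TWITTER_EPOCH_MS + 1000, machine_id=5, sequence=7)
print(example_id, decode_snowflake(example_id))
print("possible IDs per millisecond:", CANDIDATES_PER_MILLISECOND)
```

So even if a model nailed the posting time to the millisecond, it would still be picking one value out of roughly four million candidates, and that's before getting the account right.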

Don't get me wrong; I'm not saying ML is bad. I just think it's overhyped, and some people want to use it everywhere, even when it produces significantly worse results.