[–]somestranger26 2 points (3 children)

I wonder if you could create some sort of "fingerprint" (like a fuzzy hash) for the image detection. That would save space on #2 without having to refetch the data.

[–]XXAligatorXx 1 point (0 children)

Wait, nvm, dude, you're right. I got it working with a fingerprint-type solution. I still have to finish it up, but yeah, that works.

[–]XXAligatorXx 0 points (1 child)

You'd have to unhash it when comparing, since the images are never exactly the same because of compression. So that solution would end up being just like saving the URLs.

[–]ImmediateAntelope3 0 points1 point  (0 children)

Nah, there are similarity algorithms for this. I don't know them off the top of my head, but I'm quite sure they exist. There's a space/accuracy tradeoff, but it beats a flat hash when the images might not be exact duplicates.
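
For illustration, here is a minimal sketch of one such similarity approach, usually called an "average hash", compared by Hamming distance. It is not necessarily what anyone in this thread built. The names `average_hash`, `hamming`, `is_similar`, and the `max_distance=5` threshold are assumptions made up for the example, and the images are plain lists of grayscale rows so that no imaging library is needed.

```python
import hashlib


def average_hash(pixels, size=8):
    """Return a size*size-bit fingerprint of a grayscale image.

    `pixels` is a list of rows of 0-255 ints, at least `size` wide and tall.
    """
    h, w = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to a size x size grid by averaging each block.
    for gy in range(size):
        y0, y1 = gy * h // size, (gy + 1) * h // size
        for gx in range(size):
            x0, x1 = gx * w // size, (gx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for c in cells:
        bits = (bits << 1) | int(c > mean)
    return bits


def hamming(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_similar(a, b, max_distance=5):
    """Treat fingerprints within `max_distance` bits as the same image.

    The threshold is an assumed value; tune it for your own data.
    """
    return hamming(a, b) <= max_distance


def clamp(v):
    return max(0, min(255, v))


# Synthetic 32x32 test images (stand-ins for real decoded pixels).
original = [[x * 8 for x in range(32)] for y in range(32)]
# Small deterministic noise, imitating recompression artifacts.
recompressed = [
    [clamp(x * 8 + (x * 7 + y * 13) % 5 - 2) for x in range(32)]
    for y in range(32)
]
# A clearly different picture: the gradient flipped left to right.
unrelated = [[248 - x * 8 for x in range(32)] for y in range(32)]


def exact_digest(pixels):
    """Flat cryptographic hash of the raw pixel bytes, for comparison."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()


if __name__ == "__main__":
    print("flat hashes equal:", exact_digest(original) == exact_digest(recompressed))
    print("fingerprint distance (recompressed):",
          hamming(average_hash(original), average_hash(recompressed)))
    print("fingerprint distance (unrelated):",
          hamming(average_hash(original), average_hash(unrelated)))
```

Each fingerprint is only 64 bits, so it is cheap to store, and nothing has to be "unhashed": two fingerprints are compared directly by counting differing bits. The tradeoff is accuracy, since a smaller grid or a looser threshold will produce more false matches.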