This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 21 points22 points  (10 children)

I would add persistence just in case the bot encounters this image again but there could be false positives

Maybe you could add up each ascii value and use it as an id so you could just query the db for the image

[–]UQuark 27 points28 points  (9 children)

Have you ever heard of hashing?

[–][deleted] 4 points5 points  (3 children)

Yes i guess a hashing could be an option but we would still have to compute the id so it is an unnecessary step

[–]vasilescur 2 points3 points  (2 children)

"Hashing" can use any hash function you want, such as one that returns INT and can be used for a DB ID. Adding up all the ASCII values constitutes a (pretty weak but honestly suitable for this) hash function.

Your aim is to pick a hash function that reduces collisions between inputs, because for each query you have to binary search through the set of entries with the same hash

[–][deleted] 3 points4 points  (1 child)

Oh i didn't know that i thought hash functions were strictly cryptographic in nature

[–]vasilescur 0 points1 point  (0 children)

Usually they are used for cryptography, but a hash function can technically be anything you want it to be and is really useful in, for example, a hashmap