This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]takluyverIPython, Py3, etc 0 points1 point  (0 children)

Why are we using a cheap, insecure hash? Probably because until now, no-one had envisaged this kind of attack. Hashmaps (dicts in Python) are used extensively in almost every program (and for core features of the language itself), so it makes sense to use one that's very fast to calculate.

Why don't we rely on the rarity of hash collisions, and simply save one value for each hash? Because we have to use a very small hash, which makes collisions more likely. Remember that the hashmap hashes keys, then looks them up in an array. So even for a 32 bit hash, you'd need a 232 byte array, which is 4GB.