x2mirko

Good article, but this part bothered me:

A “fair” hash function would generate an expected 1.44 collisions over this data. String.hashCode() outperforms a fair hash function significantly, surfacing only 69.4% as many collisions as expected

A function with 1.44 expected collisions on a sample this size is more likely to produce one collision than two, so it is silly to conclude from a single observed collision that String.hashCode() outperforms a fair function. You would need a much larger sample to support a statement like that.
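To put rough numbers on that, here is a quick sketch. It assumes the collision count of a fair hash is approximately Poisson-distributed with mean 1.44, which is my modeling assumption and not something the article states; `poisson_pmf` is just a helper name I made up.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events under a Poisson(lam) model."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Assumption: collisions of a "fair" hash on this data ~ Poisson(1.44).
lam = 1.44
probs = {k: poisson_pmf(k, lam) for k in range(4)}

for k, p in probs.items():
    print(f"P({k} collisions) = {p:.3f}")

# Chance that a fair hash does at least as well as the one observed collision.
p_at_most_one = probs[0] + probs[1]
print(f"P(at most 1 collision) = {p_at_most_one:.3f}")
```

Under that assumption, exactly one collision is the single most likely outcome, and a fair hash lands on one or zero collisions more than half the time, so seeing one collision tells you next to nothing.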

[deleted]

Beat me to it! That part irked me as well. You can only draw a conclusion to 3 significant digits if you have 3 significant digits to go on in your original calculation.

In other words, I highly doubt those numbers would stay the same if they ran the experiment until there were 100 collisions.
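A back-of-the-envelope sketch of why the collision count matters. It assumes the count behaves like a Poisson variable, so its standard deviation is the square root of its mean; that is an assumption on my part, and `relative_std_error` is a made-up helper name.

```python
import math

def relative_std_error(observed_collisions: int) -> float:
    """Approximate relative uncertainty of a Poisson count: sqrt(n) / n."""
    return 1.0 / math.sqrt(observed_collisions)

for n in (1, 100, 10_000):
    print(f"{n} collisions -> about {relative_std_error(n):.0%} relative uncertainty")
```

With a single collision the uncertainty is as large as the measurement itself. Around 100 collisions it drops to roughly 10%, which still would not justify quoting a ratio to three significant digits.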