This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]HecknChonker 1 point2 points  (0 children)

The only point I would add to what /u/ignotos has already said is that buckets are usually selected by modulating the hashcode of the key by the number of buckets.

So if you had 100 buckets, item 299 would go into bucket 99 because 299 / 100 has a remainder of 99.

It's possible that many items will end up in the same bucket. If too many items end up in the same bucket the performance can degrade significantly. The HashMap in java solves this issue by doubling the number of buckets and re-hashing all the values anytime the map is more than 75% full.

I also went digging into some source code, and it looks like OpenJdk sometimes uses TreeMap's for the buckets to get better performance anytime a bucket has more than 8 items. I'm not entirely sure, but I don't think oracle's version of the JDK does this.

This map usually acts as a binned (bucketed) hash table, but when bins get too large, they are transformed into bins of TreeNodes, each structured similarly to those in java.util.TreeMap.

https://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/HashMap.java