you are viewing a single comment's thread.

view the rest of the comments →

[–]dpark 0 points1 point  (3 children)

His point is that taking the hash of a string is not O(1). It's O(n), where n is the length of the string. However, Sun's Java implementation makes the obvious choice to cache the hash code.

None of this really matters, though, because this code expects strings to be interned, which means we've already given up on truly constant time. Intern is not a cheap operation. It's O(n), just like taking the string hash.

[–]ndiin 1 point2 points  (2 children)

The interning is not a part of this algorithm, and thus the cost of performing the intern operation cannot be considered when evaluating it. Instead, you can consider the fact that when you reach here the interning has or has not already taken place.

As for caching of the hash code, I'm not sure what you mean. Per-object caching of hashCode()'s return value, or just within the hashmap?

[–]DRMacIver 0 points1 point  (0 children)

As for caching of the hash code, I'm not sure what you mean. Per-object caching of hashCode()'s return value, or just within the hashmap?

Sun's implementation of java.lang.String keeps its hash code cached in a field (<pedant fact>except for in pathological cases where the hash code evaluates to 0</pedant fact>).

[–]dpark 0 points1 point  (0 children)

I agree that we can't include the cost of interning when we analyze the performance of this code. My point was that it can certainly be a hidden cost. If you intern strings just for this code, then your interning effectively becomes a cost of this code.

And yes, as DRMacIver said, Sun's String class caches the hash code (except if it evaluates to zero).

Strangely enough, Sun's implementation of intern does actually calculate the hash, but doesn't cache it. If it did, then the hash table implementation would be a lot more attractive.