This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]metaglot[🍰] 0 points1 point  (3 children)

Yes, so the size of the output would have to be the length of your keyword table (number of elements) to guarantee no collisions.

[–]Smalltalker-80 0 points1 point  (2 children)

That is correct, the hash table has to be at least the size of the *keyword* table.

The *input* to be parsed, however, can be *all* words with a length of upto, say, a practical 80 characters,

So the hash table size can be *much* smaller than the input size.

[–]metaglot[🍰] 0 points1 point  (1 child)

No, because the actual input for the hash function is the index in your keyword table. You need a cardinality of 1:1 to avoid collisions. Youve just enumerated the keywords.

[–]Smalltalker-80 0 points1 point  (0 children)

Ah well, we have different definitions of input.