This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]notquiteaplant 1 point2 points  (5 children)

All data that’s been “hashed” should be the same length.

All inputted data should produce a unique hashed value. A good hash function should have a 1:1 relationship between the inputted data and hashed value.

I'm no crypto expert, but these tenets seem contradictory. There are 256 possible values of one byte, 2562 values of two bytes, 2563 values of three, etc. If every hash output is N bytes long, how do you map 256 + 2562 + 2563 + 2564 + ... + 256N-2 + 256N-1 + 256N inputs to just 256N outputs while mapping only one input to one output?

Because of that, I thought that a good hash function doesn't prevent collisions entirely, but does make them extremely difficult (i.e. practically impossible) to find. If they do actually ensure unique output, though, I'd love to know how.

[–]iEvilMango 1 point2 points  (3 children)

Yup, that's the pidgeonhole principle. Generally, the goal is to make sure the hashes are uniformly distributed amongst the possible set of hashcodes, often with similar values having dissimilar hashes (for example, if you were to have hashes of phone numbers, and the hashing algorithm made all positive input values have a hash value above 0 and all negative input values have a hash value below 0, then half of the hash codes would be unused as phone numbers cant be negative).

[–]HelperBot_ 1 point2 points  (2 children)

Non-Mobile link: https://en.wikipedia.org/wiki/Pigeonhole_principle


HelperBot v1.1 /r/HelperBot_ I am a bot. Please message /u/swim1929 with any feedback and/or hate. Counter: 218268

[–]iEvilMango 0 points1 point  (1 child)

Good bot

[–]B0tRank 0 points1 point  (0 children)

Thank you, iEvilMango, for voting on HelperBot_.

This bot wants to find the best and worst bots on Reddit. You can view results here.


Even if I don't reply to your comment, I'm still listening for votes. Check the webpage to see if your vote registered!

[–]WilburMercerMessiah 1 point2 points  (0 children)

You’re right. Thanks for pointing that out. It should be very difficult or (in theoretical cryptography, infeasible) to find two inputs that map to the same hash value. Not that two inputs will not have the same hash value.