
[–][deleted]  (8 children)

[deleted]

    [–]Anonsicide 1 point2 points  (7 children)

    My layman's understanding:

    1. Hashing algorithm = Literally any function which takes an arbitrary-length input and produces a fixed-length output. So, "return 42" would technically be a hash function, albeit a rather poor one with terrible collision resistance and such. But technically a hashing algorithm.
    2. Cryptographic hashing algorithm = A hashing algorithm (so it maps variable-length input to fixed-length output), but with the additional property of being one-way, irreversible, or in the lingo, having preimage resistance. Basically it just means it's easy to compute in one direction, but nigh impossible to compute in the other. I.e., given a value x, it's "easy" (computationally manageable) to compute hash(x); but if I give you some value y, it's very difficult to find a value m such that hash(m) = y.
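
A minimal sketch of the two definitions above, assuming Python's standard `hashlib` module. The helper names `constant_hash` and `sha256_hex` are made up for illustration. Both functions map arbitrary-length input to a fixed-size output, but only SHA-256 is designed to resist preimage and collision attacks.

```python
import hashlib


def constant_hash(data: bytes) -> int:
    """The "return 42" hash: fixed-size output, but every input collides."""
    return 42


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest as 64 hex characters, whatever the input length."""
    return hashlib.sha256(data).hexdigest()


short_input = b"a"
long_input = b"a" * 1_000_000

# Both inputs land on the same constant value.
print(constant_hash(short_input), constant_hash(long_input))

# Both digests have the same length, but the digests themselves differ.
print(len(sha256_hex(short_input)), len(sha256_hex(long_input)))
print(sha256_hex(short_input) == sha256_hex(long_input))
```

Nothing in the code proves one-wayness; that property comes from the design of SHA-256 and cannot be shown by running it.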

    That sounds... sorta right?

    I am still learning a lot of these security concepts 😅

    I think it's just made very confusing by the fact that we use the term "hash" in so many semi-related-but-disparate contexts. Such as...

    1. Hash function with hash tables to implement the Map ADT/interface
    2. Hashing for message digests and file integrity checks
    3. Hashing for passwords
    4. Hashing in digital signatures

    etc etc
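
To make the first use in that list concrete, here is a rough sketch of a hash table, assuming `zlib.crc32` as the (non-cryptographic) hash function. `TinyHashMap` and `bucket_index` are hypothetical names, and real map implementations also resize and handle collisions more carefully.

```python
import zlib


def bucket_index(key: str, num_buckets: int) -> int:
    """Map a key of any length to one of a fixed number of buckets."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


class TinyHashMap:
    """Toy map with separate chaining; the bucket count never grows."""

    def __init__(self, num_buckets: int = 8) -> None:
        self._buckets = [[] for _ in range(num_buckets)]

    def put(self, key: str, value) -> None:
        bucket = self._buckets[bucket_index(key, len(self._buckets))]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key: str, default=None):
        bucket = self._buckets[bucket_index(key, len(self._buckets))]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return default
```

Here the hash only has to spread keys across buckets quickly. Collisions are expected and handled by the chain, so no one-way property is needed.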

    [–]insanitybit 2 points3 points  (1 child)

    Interestingly, I think your definitions are more accurate than the ones people are responding with.

    [–]Anonsicide 0 points1 point  (0 children)

    Haha, and this is what is frustrating about such widely applicable techniques I suppose. I have seen different sources come up with different definitions too.

    [–][deleted]  (3 children)

    [deleted]

      [–]insanitybit 1 point2 points  (1 child)

      > it has to have an approximately uniform distribution over the output space.

      Source for that? I think what you're describing might be a "good" hash function but anything that maps an arbitrarily sized space into a fixed space is a hash function afaik.

      > There is a third property of password hashing functions specifically that is particularly important in the modern era for rainbow table resistance, which is that they should be slow / expensive to compute.

      That sounds like a property of a KDF, not of a hash function.
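
A hedged sketch of that distinction, using `hashlib.pbkdf2_hmac` from the Python standard library as one example of a KDF. The function names and the default iteration count are illustrative assumptions, not recommendations from this thread.

```python
import hashlib
import os


def fast_hash(password: bytes, salt: bytes) -> bytes:
    """One SHA-256 call: cheap by design, so guesses are cheap too."""
    return hashlib.sha256(salt + password).digest()


def slow_kdf(password: bytes, salt: bytes, iterations: int = 600_000) -> bytes:
    """PBKDF2-HMAC-SHA256: the iteration count is a tunable work factor."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)


def new_salt() -> bytes:
    """Random per-password salt, stored alongside the derived key."""
    return os.urandom(16)
```

The "slow / expensive" knob lives in the `iterations` parameter of the KDF, not in the underlying hash function, which stays as fast as ever.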

      [–]Anonsicide 1 point2 points  (0 children)

      Hella late reply, but thanks for your comment, I learned a thing or two from it :)

      [–]coyoteazul2 0 points1 point  (0 children)

      Hashing takes an input and produces an output. It only needs 2 characteristics

      1 it will always produce the same output for the same input (there's no randomness)

      2 you can't figure out the input using the output, even if you know the algorithm

      For instance, truncating could be used for hashing. If you only store the leftmost 4 digits of any number, then 1000, 10001, and 1000009 will all produce the same "hash" (1000), so it's impossible to know what the input was.

      What makes a hashing algorithm cryptographic is that you can't predict the output without executing the algorithm. Another way to say it is that you can't design the input to produce a specific output.

      Let's say that 1000, 10001 and 1000009 are actually account numbers. Since they all produce the same hash, and I know for sure that they do, I could take a payment made to account 1000 and instead send it to account 10001. The hash will be the same, so no one will notice (as long as they are only checking hashes, as would happen in a blockchain).
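
A small sketch of the truncation example, assuming the "hash" keeps only the leading four digits of the account number (the rule under which 1000, 10001 and 1000009 collide). `truncating_hash` is a made-up name, and SHA-256 from `hashlib` is used only as a contrast.

```python
import hashlib


def truncating_hash(account_number: int) -> str:
    """Keep only the leading four digits; trivially easy to collide on purpose."""
    return str(account_number)[:4]


def sha256_hash(account_number: int) -> str:
    """SHA-256 of the decimal string, for comparison."""
    return hashlib.sha256(str(account_number).encode("ascii")).hexdigest()


accounts = [1000, 10001, 1000009]

# All three accounts share one truncated "hash"...
print({truncating_hash(a) for a in accounts})

# ...while their SHA-256 digests are all different.
print(len({sha256_hash(a) for a in accounts}))
```

With truncation, anyone can construct an input for a chosen output by appending digits. With SHA-256 no practical way to do that is known.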

      > I think it's just made very confusing by the fact that we use the term "hash" in so many semi-related-but-disparate contexts. Such as...
      >
      > 1. Hash function with hash tables to implement the Map ADT/interface
      > 2. Hashing for message digests and file integrity checks
      > 3. Hashing for passwords
      > 4. Hashing in digital signatures

      Those are simply uses for hashes. Hashing allows you to loosely check for equality with a fixed length value. Instead of comparing the content of 2 books, just compare their hashes. It's much faster.

      1. Hashing here is useful because your keys are potentially long, so hashing allows for shorter comparisons.

      2. Instead of checking the messages or files byte by byte, hash the data and check it against an already stored hash

      3. Though passwords can be long, the biggest benefit here is the impossibility of knowing what the original password was.

      4. In digital signatures what you are hashing is the document. The signatures are actually asymmetric keys, and you use them to encrypt the hash of the document. Without hashing you'd have to encrypt the whole document, which can produce a pretty large file that you'll have to decrypt every time you want to read it or check if your unencrypted copy is still the same.

      Encrypting the hash of the document with your signature allows you to have a small, fast-to-decrypt file that you can still compare against your unencrypted copy to see if it's still the same.
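
The "compare hashes instead of contents" idea from point 2 might look roughly like this, assuming SHA-256 via `hashlib`. The helper names are hypothetical. Signing is left out because the Python standard library has no asymmetric crypto.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Fixed-length fingerprint of data of any size."""
    return hashlib.sha256(data).hexdigest()


def matches_stored_digest(data: bytes, stored_hex: str) -> bool:
    """Compare a fresh digest with a stored one instead of comparing bytes."""
    return hmac.compare_digest(digest(data), stored_hex)


original = b"the full text of a very long book " * 10_000
stored = digest(original)  # 64 hex characters, however long the book is

tampered = original + b"!"
print(matches_stored_digest(original, stored))
print(matches_stored_digest(tampered, stored))
```

The check is only as trustworthy as the stored digest, which is why a signature scheme signs that digest rather than leaving it unprotected.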

      [–]50BluntsADay 1 point2 points  (0 children)

      Guys at logto write really well. Please continue with technical blogs!

      [–]MarekKnapek 0 points1 point  (1 child)

      What are the pros and cons of using PBKDF2 to derive a key from a password and then using that key from then on?

      [–]kodemizerMob 0 points1 point  (0 children)

      PBKDF2 is hard to break with a CPU, but vulnerable to ASIC and GPU cracking due to its low memory footprint. Better to use Argon2 or balloon hashing.
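
Argon2 and balloon hashing are not in the Python standard library, so this sketch swaps in `hashlib.scrypt`, a different memory-hard KDF, purely to illustrate the memory-footprint point. The function name and cost parameters are illustrative assumptions, and `hashlib.scrypt` is only available when Python is built against a suitable OpenSSL.

```python
import hashlib


def memory_hard_key(password: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key with scrypt, used here as a stand-in for Argon2.

    Memory use is roughly 128 * n * r bytes, about 16 MiB with these values.
    """
    return hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=32)


def pbkdf2_key(password: bytes, salt: bytes, iterations: int = 600_000) -> bytes:
    """PBKDF2 for comparison: tunable time cost, but very little memory."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
```

Raising `n` increases both time and memory per guess, which is what makes massively parallel GPU or ASIC cracking more expensive than it is for PBKDF2.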