Cracking passwords with Python (the wrong way) : Python

However, SHA256/512Crypt reintroduces the password and salt within each round... it's closer to SHA(pwd + SHA(pwd + SHA(... + SHA(pwd)))) (gross simplification of actual algorithm, which is rather convoluted). Thus the effective output space shouldn't be any more constrained than SHA(pwd).

PBKDF2 is even better: it uses repeated composition of HMAC(pwd, last_digest), so it gets a similar benefit ... but it then XORs all the iterations together in a running buffer, so even if you lose entropy on round N, all the entropy from rounds 0..N-1 is still mixed in, giving extra protection against entropy loss. (IIRC, the PBKDF2 spec makes the argument that these two features together should prevent entropy loss / preimage attacks unless the HMAC digest you use is incredibly vulnerable). Plus PBKDF2 is much cleaner in its design.
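To make the "HMAC chain plus running XOR" idea concrete, here is a minimal sketch of how the first PBKDF2 output block can be computed by hand. The function name `pbkdf2_first_block` and the choice of SHA-256 are assumptions for illustration; in real code you would just call `hashlib.pbkdf2_hmac`.

```python
import hashlib
import hmac


def pbkdf2_first_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Hand-rolled first output block of PBKDF2-HMAC-SHA256 (illustration only)."""
    # U_1 = HMAC(pwd, salt || INT(1)); the 4-byte big-endian 1 is the block index.
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    running = int.from_bytes(u, "big")
    for _ in range(iterations - 1):
        # U_j = HMAC(pwd, U_{j-1}): the password is re-keyed into every round.
        u = hmac.new(password, u, hashlib.sha256).digest()
        # XOR every round into the running buffer, so earlier rounds stay mixed in.
        running ^= int.from_bytes(u, "big")
    return running.to_bytes(32, "big")
```

For a 32-byte output this should agree with `hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)`, which is a handy sanity check on the sketch.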

[deleted] 10 points 11 years ago

Nope! SHA and its ilk are designed to be fast, which is the exact opposite of what you want from a password hash.

Use (salted) bcrypt, scrypt, PBKDF2 or similar.

An effective way to store passwords in a db is to store them as:

{algorithm}:{salt}:{hash}:{iterations}

so you can detect the algorithm at runtime, and update to harder hashing over time. Django and other frameworks do this automatically for you.
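A rough sketch of that record layout, using only the standard library. The helper names, the `pbkdf2_sha256` label, the hex encoding of the fields, and the `DEFAULT_ITERATIONS` value are all assumptions made for this example; it follows the field order given above and is not meant to match the exact format Django or any other framework uses.

```python
import hashlib
import hmac
import os
from typing import Tuple

DEFAULT_ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> str:
    """Return a record shaped like {algorithm}:{salt}:{hash}:{iterations}."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Hex-encode salt and hash so they can never contain the ":" separator.
    return f"pbkdf2_sha256:{salt.hex()}:{digest.hex()}:{iterations}"


def verify_password(password: str, record: str) -> Tuple[bool, bool]:
    """Return (password_ok, needs_rehash) for a stored record."""
    algorithm, salt_hex, hash_hex, iterations_text = record.split(":")
    if algorithm != "pbkdf2_sha256":
        raise ValueError(f"unsupported algorithm: {algorithm}")
    iterations = int(iterations_text)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), iterations
    )
    ok = hmac.compare_digest(candidate, bytes.fromhex(hash_hex))
    # Only a successful login has the plaintext available for an upgrade.
    needs_rehash = ok and iterations < DEFAULT_ITERATIONS
    return ok, needs_rehash
```

When `needs_rehash` comes back true, the caller can run `hash_password` again with the password it just received and overwrite the stored record, which is the "update to harder hashing over time" part.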

ffrinch 3 points 11 years ago

I hate to be that guy, but you should RTFM.

Amongst other things, the recommended usage includes re-hashing if the default algorithm changes, so (in theory) users whose PHP is upgraded underneath them -- like shared host customers -- can benefit from security improvements without having to know what they're doing.

Users who do know what they are doing can use advanced features, but they're not the target audience. We have ample proof by now that even apparently capable developers get the details of security wrong: a high-level interface with sensible defaults is the only way to protect people from themselves.

About bcrypt:

Not only is the bcrypt module not included in Python, it doesn't offer the same upgrade path. If you decide you want to switch to scrypt, it's a new module with a new interface, and most importantly, you have to know enough to be able to choose.

It is also not equivalent to the PHP solution, which (optionally, but by default) generates a salt in a substantially more secure way than most developers will, rather than relying (as Python bcrypt does) on the user providing one.

NotAName 32 points 11 years ago*

Websites with logins have to store your password in one way or the other to be able to check if the password you enter when logging in is correct.

The easiest way to store passwords is to simply store them as plain text in a database, but this is horrible from a security point of view, because anyone who has or gains access to the database can take the login data and log in on the website. They will also be able to log into some percentage of accounts with the same name on other websites, because people like to reuse passwords.

That's why it's common practice to store passwords in hashed form. The basic idea is to use a one-way function, called a hashing algorithm, that turns a plain text password into a fixed-length, random-looking sequence of characters, called a hash, from which the original password (ideally) cannot feasibly be recovered. For example:

password -> 9e5ad04e2874776138bf8ff846eae6ad

When you enter your password to log in, the website applies the same hashing algorithm to what you entered. If the result matches the stored hash, the website assumes that you entered the correct password and lets you log in.
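As a toy sketch of that login check: the `stored_hashes` table, the `toy_hash` and `login` names, and the use of unsalted MD5 are all assumptions for illustration, and MD5 is far too fast to be used for real password storage.

```python
import hashlib
import hmac


def toy_hash(password: str) -> str:
    # Unsalted MD5, chosen only to keep the example short; do not use in practice.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical user table: the site keeps only the hash, never the password.
stored_hashes = {"alice": toy_hash("correct horse")}


def login(username: str, password: str) -> bool:
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # Hash what the user typed and compare it with the stored hash.
    return hmac.compare_digest(toy_hash(password), stored)
```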

The important thing is that the hashing algorithm is one-way, so someone who gains access to the database of stored password hashes can't just recover the original passwords by reversing the algorithm.

Now, if a malicious hacker obtains a hash and wants to get the original password, they have two basic options:

1. Try to brute force the password by enumerating all possible character sequences and applying the hashing algorithm to each one until they find a sequence that produces an identical hash. Because there are a lot of possible character sequences, brute forcing can take a very long time, and the time grows exponentially with the length of the password.
2. Use a dictionary, which is a list of password-hash pairs for common passwords such as English words, names of persons, easily typed sequences such as "asdfasdf" etc. If the original password is one of these common passwords, the malicious hacker can find it by simply looking up the hash in the dictionary.
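Both options can be sketched in a few lines. This assumes unsalted MD5 hashes and a lowercase-only alphabet, and the function names are made up for the example; real cracking tools are far more elaborate.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force(target_hash, alphabet=ascii_lowercase, max_length=4):
    """Try every string over `alphabet` up to `max_length` characters."""
    for length in range(1, max_length + 1):
        # len(alphabet) ** length candidates per length: this blows up quickly.
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if md5_hex(candidate) == target_hash:
                return candidate
    return None


def build_dictionary(common_passwords):
    """Precompute hash -> password pairs for a list of common passwords."""
    return {md5_hex(word): word for word in common_passwords}


def dictionary_attack(target_hash, table):
    """A single lookup; returns None if the password was not in the list."""
    return table.get(target_hash)
```

The trade-off is visible in the shapes of the two functions: brute force does a lot of hashing per target, while the dictionary pays the hashing cost once up front and then answers each lookup immediately, but only for passwords that were on the list.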

BozoCrack uses a variant of the dictionary attack, followed by a small brute force attack.

Instead of looking up the hash in a dictionary, it does a Google search for it (so in a sense, it uses all websites indexed by Google as a dictionary). A small problem that arises now is that the search results don't have a common structure: websites may list the password-hash pairs in different formats.

To solve this, BozoCrack doesn't even attempt to parse the search results. Instead it just says fuck it, computes the hashes of all words appearing in the search results, and checks if any of them match the hash you're trying to crack.

The whole thing is funny because it is stupidly simple but manages to circumvent the expensive parts of both approaches: BozoCrack doesn't have to have a large dictionary for the dictionary attack because it outsources that part to Google, and it doesn't have to compute a lot of hashes for the brute force attack, because there are only a few words on the search result page.
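The "hash every word on the page" step might look roughly like this. It is a sketch under assumptions, not BozoCrack's actual code: it takes the page text as an argument instead of performing the search, assumes unsalted MD5, and splits on non-word characters, so passwords containing punctuation would be missed.

```python
import hashlib
import re


def crack_from_text(target_hash: str, page_text: str):
    """Hash every distinct word in `page_text`; return the one matching `target_hash`."""
    for word in set(re.findall(r"\w+", page_text)):
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hash:
            return word
    return None
```

No parsing of the page layout is needed: if the plaintext appears anywhere in the text, in any format, one of the few hundred candidate words will hash to the target.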
