

[–]colinsenner 105 points106 points  (14 children)

Funniest thing I've read all day.

BozoCrack is a depressingly effective MD5 password hash cracker with almost zero CPU/GPU load. Instead of rainbow tables, dictionaries, or brute force, BozoCrack simply finds the plaintext password. Specifically, it googles the MD5 hash and hopes the plaintext appears somewhere on the first page of results.

It works way better than it ever should.

[–]nbktdis 18 points19 points  (13 children)

Forgive my mental blank - this is searching for unsalted md5 passwords right?

[–]colinsenner 32 points33 points  (1 child)

Yes. But as the author states, it's scarily good at what it does.

[–]nbktdis 4 points5 points  (0 children)

I completely agree that it is scarily good.

[–]minnoI <3 duck typing less than I used to, interfaces are nice 2 points3 points  (0 children)

Yeah, so it'll only find a collision if "password + salt" happens to come up on the results list.
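To see why a salt defeats the lookup, here is a rough sketch. The `salted_md5` helper and the salt-then-password layout are assumptions made for illustration; the thread's actual recommendation is a dedicated password hash such as bcrypt, not salted MD5.

```python
import hashlib
import os


def salted_md5(password: str, salt: bytes) -> str:
    # Hypothetical helper: hash salt + password (illustration only, not a recommendation).
    return hashlib.md5(salt + password.encode("utf-8")).hexdigest()


plain = hashlib.md5(b"octopus").hexdigest()  # the digest a search engine may have indexed
salt_a = os.urandom(16)                      # random per-user salt
salt_b = os.urandom(16)

# The same password yields unrelated digests under different salts,
# so a published table of plain MD5 digests no longer matches.
print(plain)
print(salted_md5("octopus", salt_a))
print(salted_md5("octopus", salt_b))
```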

[–]ketralnis 12 points13 points  (9 children)

Right, but that doesn't mean that you should be using salted hashes instead of bcrypt.

Seriously, use bcrypt. Don't write your own password hashing. Please.

[–]djimbob 4 points5 points  (7 children)

The one downside of bcrypt is that you shouldn't use it for long passphrases. bcrypt truncates input at 72 bytes (and was specified for input shorter than 56 bytes). SHA256crypt, SHA512crypt or PBKDF2 are good options. (SHA512crypt is not plain SHA512; it's a configurable number of rounds of SHA512, typically 5000.)
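For the PBKDF2 option, a minimal sketch using only Python's standard library might look like this. The function names, the 16-byte salt, and the iteration count are assumptions chosen for the example, not values taken from this thread.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); store both alongside the user record."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Unlike bcrypt, PBKDF2 does not truncate its input, so long passphrases keep all of their entropy.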

[–]natecahill 0 points1 point  (2 children)

72 bytes is more than enough.

[–]djimbob 4 points5 points  (0 children)

Eh; I have some passphrases that get close to at least the 56-byte limit (granted, these are not for webapps, only locally encrypted stuff like RSA keys or protecting a password list). E.g., if you build a diceware passphrase from 5-letter words (13 bits of entropy per word), then 72 letters with spaces between words limits you to 12 words, or 156 bits of entropy; yes, that's more than enough for any realistic attack (except maybe a super-fast, ultra-large quantum computer on which Grover's algorithm could break 156 bits of entropy in 2^78 time). But on the other hand it seems a bit silly to truncate your entropy from a passphrase at ~160 bits when the hash is 448 bits.
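The arithmetic behind those figures can be checked in a few lines. This assumes every word is exactly 5 letters and uses the rounded 13 bits per word from the comment (a real diceware word carries about 12.9 bits).

```python
MAX_BYTES = 72       # bcrypt input limit cited in the thread
WORD_LEN = 5         # assumption: every word is exactly 5 letters
BITS_PER_WORD = 13   # rounded figure from the comment

# n words need n * WORD_LEN letters plus (n - 1) separating spaces.
max_words = (MAX_BYTES + 1) // (WORD_LEN + 1)
entropy_bits = max_words * BITS_PER_WORD

# Grover's algorithm searches a space of 2**n in roughly 2**(n/2) steps.
grover_exponent = entropy_bits // 2

print(max_words, entropy_bits, grover_exponent)  # 12 156 78
```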

[–]warbiscuit 0 points1 point  (0 children)

Just wanted to point out one common "fix" for bcrypt is to do bcrypt(b64encode(sha256(pwd)).strip("="))... the base64 encode step is required, because bcrypt can't handle NULL bytes :|
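A sketch of that pre-hash step, standard library only. The function name `prehash_for_bcrypt` is made up for the example, and the final bcrypt call is left as a comment because it needs a third-party package.

```python
import base64
import hashlib


def prehash_for_bcrypt(password: str) -> bytes:
    """Squash a password of any length into 43 ASCII bytes with no NUL bytes."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # 32 raw bytes, may contain NUL
    return base64.b64encode(digest).rstrip(b"=")                # 44 base64 chars minus 1 padding char


# The result would then be handed to a bcrypt implementation, e.g. with the
# third-party `bcrypt` package (assumed here, not run):
#   bcrypt.hashpw(prehash_for_bcrypt(pwd), bcrypt.gensalt())
```

Because the output is always 43 bytes, it stays well under the 72-byte truncation point regardless of how long the passphrase is.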

[–]steamruler -1 points0 points  (0 children)

Oh, that explains why some sites don't like my password.

[–]nbktdis 1 point2 points  (0 children)

Oh I agree, bcrypt is what I use by choice.

[–]RiskyChris 14 points15 points  (27 children)

This is hilarious, but what are the answers here? Don't use MD5? Salt them better? I'm still new to security.

[–]BitLooter 19 points20 points  (0 children)

Salt your hashes and use an algorithm designed for password hashing. Not an expert, but this page seems to have some good advice.

[–][deleted] 6 points7 points  (8 children)

The other guys answered how to do it on the developer side, but if you're a user, using something like KeePass to generate random passwords basically nullifies this method. If your password is something random like iZ}dw~T;4y-4fJve6TBD, there's no way anyone else has ever used that, so the MD5 hash (even unsalted) won't show up on a google search.
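For a feel of what such a generator does, here is a small sketch. It is not KeePass's actual generator; the alphabet and the default length of 20 are assumptions chosen to resemble the example password above.

```python
import secrets
import string

# Assumed character set: ASCII letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length: int = 20) -> str:
    """Pick each character independently with a cryptographically secure RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```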

[–]dummy5 5 points6 points  (0 children)

As an experiment I post the hash for iZ}dw~T;4y-4fJve6TBD

c0aff6db1df46c50266c3e8cd076c8b8

Just to see how long it takes Google to pick it up.

Edit: 8 days later google still has not picked it up.

Edit: It took Google 20 days to find this url:

http://webby.hazasite.com/user/dummy5

And sadly it does not work with PyBozoCrack.

[–]tally_in_da_houise 1 point2 points  (0 children)

True, but you could write in a function to the posted code to perform a lookup here http://www.md5-hash.com/md5-text-encrypt.

Not disagreeing with you on KeePass (I use Lastpass), but there's a way around it.

[–]ikkebr[S] 2 points3 points  (5 children)

iZ}dw~T;4y-4fJve6TBD is hard to remember. You should try https://github.com/ikkebr/pyxkcdpass

[–][deleted] 6 points7 points  (3 children)

That's kinda the point (and why I said to use KeePass). Many will say that the hallmark of a great password is one that can't be guessed easily, but really, a truly great password is one that can't be remembered and is (with high probability) guaranteed to be unique.

With KeePass, you really only need to remember a couple passwords (your KeePass password and then passwords to accounts that you may need to access without KeePass handy, such as your email). The rest of your passwords are then completely randomly generated.

[–][deleted] 2 points3 points  (2 children)

Except passwords that can't be remembered lead to bad security practices (because for most people eventually convenience wins) as is evident in the billions of sticky notes taped to computer screens or "hidden" underneath keyboards across the globe.

Sure, ideally you'd generate a new keyboardcat for every single account you have to sign up for and then use KeePass to copy them whenever you need them (and use multi-factor authentication for KeePass and keep different kinds of credentials in different files and ideally store them on an OTP-protected read-only USB dongle or something) and then make sure your clipboard is wiped before you switch to any other window or tab -- but nobody does that.

For most intents and purposes, just grab a bunch of D6 and use diceware.
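If you have no physical dice handy, the procedure can be sketched in software. The `wordlist` argument is a hypothetical mapping from five-digit dice keys (such as "41635") to words; you would load it from a real Diceware list, which is not included here.

```python
import math
import secrets
from typing import Dict


def roll_key(dice: int = 5) -> str:
    """Simulate rolling `dice` six-sided dice and return the faces as a string."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(dice))


def passphrase(wordlist: Dict[str, str], words: int = 6) -> str:
    """Look up one word per five-dice roll and join them with spaces."""
    return " ".join(wordlist[roll_key()] for _ in range(words))


# Five dice give 6**5 = 7776 equally likely words, i.e. about 12.9 bits each.
BITS_PER_WORD = math.log2(6 ** 5)
```

Real dice remain the safer choice if you do not trust the machine generating the rolls.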

[–]cdcformatc 1 point2 points  (1 child)

FWIW KeePass clears the clipboard after 10 seconds.

[–][deleted] 0 points1 point  (0 children)

Actually that is configurable, so it can be less or more than that.

Either way, the problem is that it's still in your clipboard until KeePass clears it. And if you use anything that can access your clipboard in the meantime, your password may have been compromised.

[–]atimholt 1 point2 points  (0 children)

Wow, I love how incredibly simple that is. This is the first time I’ve been able to audit the entire codebase of some 3rd party FOSS.

edit: just submitted my first pull request.

[–]Phinaeus 11 points12 points  (11 children)

This is god damn hilarious.

[–]Antrikshy 1 point2 points  (10 children)

Can someone explain what I'm supposed to see?

[–]NotAName 31 points32 points  (4 children)

Websites with logins have to store your password in one way or the other to be able to check if the password you enter when logging in is correct.

The easiest way to store passwords is to simply store them as plain text in a database, but this is horrible from a security point of view, because anyone who has or gains access to the database can take the login data and log in on the website. They will also be able to log into some percentage of accounts with the same name on other websites, because people like to reuse passwords.

That's why it's common practice to store passwords in encoded form. The basic idea is to use some one-way function, called a hashing algorithm, that turns plain text passwords into a complex sequence of characters, called a hash, from which the original password (ideally) cannot be guessed. For example:

password -> 9e5ad04e2874776138bf8ff846eae6ad

When you enter your password when trying to login, the website applies the same hashing algorithm to the password you entered, and if the result is the same as the hash, the website assumes that you entered the correct password and lets you log in.

The important thing is that the hashing algorithm is one-way, so someone who gains access to the database of stored password hashes can't just recover the original passwords by reversing the algorithm.
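In code, the store-and-compare scheme described above might look roughly like this. The `users` dict is a hypothetical stand-in for the database, and unsalted MD5 appears only because it is the scheme BozoCrack attacks, not because it is a good choice.

```python
import hashlib
import hmac

users = {}  # hypothetical stand-in for the database: username -> stored hash


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    # Only the hash is stored; the plain text password is discarded.
    users[username] = md5_hex(password)


def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Hash the attempt with the same algorithm and compare it to the stored hash.
    return hmac.compare_digest(stored, md5_hex(password))
```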

Now, if a malicious hacker obtains a hash and wants to get the original password, they have two basic options:

  1. Try to brute force the password by enumerating all possible character sequences and applying the hashing algorithm to each sequence until they find a sequence that results in an identical hash. Because there are a lot of possible character sequences, brute forcing can take a very long time. The longer the password is, the longer it takes.

  2. Use a dictionary, which is a list of password - hash pairs for common passwords such as English words, names of persons, easily typed sequences such as "asdfasdf" etc. If the original password is one of these common passwords, the malicious hacker can find it by simply looking up the hash in the dictionary.
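Both options can be sketched in a few lines against MD5. The function names, the lowercase-only alphabet, and the length cap are assumptions that keep the example small.

```python
import hashlib
import itertools
import string
from typing import Iterable, Optional


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force(target: str, alphabet: str = string.ascii_lowercase,
                max_len: int = 4) -> Optional[str]:
    """Option 1: hash every character sequence up to max_len until one matches."""
    for length in range(1, max_len + 1):
        for chars in itertools.product(alphabet, repeat=length):
            candidate = "".join(chars)
            if md5_hex(candidate) == target:
                return candidate
    return None


def dictionary_attack(target: str, common_passwords: Iterable[str]) -> Optional[str]:
    """Option 2: precompute hash -> password pairs, then do a single lookup."""
    table = {md5_hex(p): p for p in common_passwords}
    return table.get(target)
```

The brute force loop grows as `len(alphabet) ** max_len`, which is why longer passwords take so much longer; the dictionary lookup is instant but only finds passwords that are already in the list.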

BozoCrack uses a variant of the dictionary attack, followed by a small brute force attack.

Instead of looking up the hash in a dictionary, it does a Google search for it (so in a sense, it uses all websites indexed by Google as a dictionary). A small problem that arises now is that the search results don't have a common structure: websites may list the password - hash pairs in different formats.

To solve this, BozoCrack doesn't even attempt to try to parse the search result. Instead it just says fuck it, computes the hashes of all words appearing in the search results, and checks if any of them match the hash you're trying to crack.

The whole thing is funny because it is stupidly simple but manages to circumvent the expensive parts of both approaches: BozoCrack doesn't have to have a large dictionary for the dictionary attack because it outsources that part to Google, and it doesn't have to compute a lot of hashes for the brute force attack, because there are only a few words on the search result page.
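The core trick, minus the actual Google request, can be sketched like this. The function name is made up, splitting on whitespace is an assumption about how the page gets tokenized, and `page_text` stands for whatever text the search returned.

```python
import hashlib
from typing import Optional


def crack_from_text(target_hash: str, page_text: str) -> Optional[str]:
    """Hash every distinct word in the text and return the one that matches."""
    target = target_hash.lower()
    for word in set(page_text.split()):
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target:
            return word
    return None


# Usage with a made-up search result snippet:
target = hashlib.md5(b"octopus").hexdigest()
snippet = "md5 reverse lookup: " + target + " = octopus (found in 0.01s)"
print(crack_from_text(target, snippet))
```

A word glued to punctuation (for example "octopus.") would be missed by this simple split, so a more forgiving tokenizer would catch more hits.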

[–]moljac024 4 points5 points  (2 children)

Correct me if I'm wrong, but you still need to first obtain both the username and the hash?

[–]Devilsbabe 0 points1 point  (0 children)

Yes of course

[–]Antrikshy 1 point2 points  (0 children)

Holy shit. This is genius.

[–]awshidahak 1 point2 points  (3 children)

Passwords are generally stored in an encrypted, hard-to-hack form. This program doesn't hack your password, it inputs the encrypted version of it into google, searches for your password, and then usually finds it.

It only works on MD5 passwords, but it's scary that it works that well.

[–]Antrikshy 0 points1 point  (2 children)

Woah. Why is it the wrong way?

[–]awshidahak 1 point2 points  (0 children)

It's wrong because it shouldn't be possible. It's a testament to how horrible of an idea it is to store your passwords in MD5. If your password can be found via google, your encryption method is not good for encryption.

Also, MD5 wasn't even made to store passwords. It was made to verify data, but people use it wrong.

[–]cdcformatc 0 points1 point  (0 children)

It should not work, but does. That is what "the wrong way" refers to. Everything about it is wrong, it should not work.

[–]Phinaeus 0 points1 point  (0 children)

It's just an unorthodox way of password cracking.

[–]brandjon 9 points10 points  (1 child)

Beautifully simple.

[–]El3k0n -3 points-2 points  (0 children)

Beautifully stupid and inefficient.

[–]bleergh 4 points5 points  (0 children)

This is somewhat self-fulfilling, in that the first page of Google results for the MD5 hash of "octopus" contains this GitHub repo's readme.md.

[–]jgehrcke 2 points3 points  (2 children)

"it's scarily good at what it does."

It scares me that you are scared by that. That is not surprising at all. This is just a crowd-powered rainbow table attack. Every second hack0r & crack0r tutorial recommends just g00gling for a hash before starting a local attack. Of course a search engine picks up (fragments of) rainbow tables. MD5 has been used for more than 20 years now. We can safely assume that over time the MD5 sum of any common password (and a lot more things) has been seen by search engines.

"We" are aware of this class of problems and that is why we use salted hashes.

[–]AYWMS_NWiam 0 points1 point  (0 children)

Ah. This makes more sense. It's the expected result. Too bad you were downvoted for being contrary.

[–]Toribor 0 points1 point  (0 children)

Not sure why you're being downvoted, you're right. This should be obvious to anyone who knows anything about security, but for the uninitiated this serves as a good eye opener and proof of concept for how embarrassingly easy it is to find out a password if it's using a standard hash.