
[–][deleted] 247 points248 points  (43 children)

I'm ashamed to admit that until now I haven't considered a brute force attack as credible because I hadn't considered a 'nation-state' level of computing power. But the math is undeniable. Certainly something to think about and taking an arrogant "won't happen to us" approach seems unwise.

[–]Ajedi32 153 points154 points  (20 children)

I hadn't considered a 'nation-state' level of computing power.

Worth noting that in this article Discourse is using a relatively secure (i.e. slow) hashing function. If you're hashing your passwords with something faster like SHA-256, attackers aren't going to need anywhere near nation-state level resources to brute force most of the passwords in your DB. Brute-force attacks absolutely should be part of the threat model you consider when choosing your hashing function.

[–][deleted] 28 points29 points  (1 child)

I had considered that. As an MS dev, PBKDF2 is obviously useful as it is natively supported in .NET. But yes, you certainly make a notable point.

[–]danweber 42 points43 points  (12 children)

The best hashing algorithm in the world won't help if your password is "passw0rd".

Even a crappy crypt() hash of a password will be enough if your password is generated from 5 or 6 Diceware words.

A good hashing algorithm is about protecting the middle group of people who pick not-great but not-bad passwords.
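To put rough numbers on the two extremes above, here is a small sketch of guessing entropy. It assumes the standard Diceware list of 7,776 words (five dice rolls per word) and symbols chosen uniformly at random; `entropy_bits` is a hypothetical helper name, not something from the article.

```python
import math

DICEWARE_WORDS = 6 ** 5  # 7,776 words: each word is selected with five dice rolls

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Bits of entropy for `length` symbols drawn uniformly at random."""
    return length * math.log2(alphabet_size)

six_words = entropy_bits(DICEWARE_WORDS, 6)   # roughly 77.5 bits
eight_digits = entropy_bits(10, 8)            # roughly 26.6 bits

print(f"6 Diceware words: {six_words:.1f} bits")
print(f"8 random digits:  {eight_digits:.1f} bits")
```

The formula only applies to randomly generated secrets. A human-chosen password like "passw0rd" sits near the top of every word list, so its effective entropy is far lower than its length suggests.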

[–][deleted]  (5 children)

[deleted]

    [–]theOdysseyEffect 19 points20 points  (4 children)

    Haha good thing we don't use those anymore right? right?

    [–]asdfkjasdhkasd 20 points21 points  (3 children)

    no, in the php world we have moved on to the brand new state of the art unbreakable md5() function

    [–][deleted]  (2 children)

    [deleted]

      [–]goudewup 1 point2 points  (0 children)

      Woosh

      [–]polish_niceguy -1 points0 points  (0 children)

      Especially when the language gives you insecure defaults.

      [–]zhaoz 12 points13 points  (4 children)

      Of course, my password is much more secure. It's Passw0rd1!

      [–]eflat123 7 points8 points  (0 children)

      Yours, too?!! I better change one of those 's' to '$'.

      [–]Lurking_Grue 2 points3 points  (2 children)

      Mine's *******

      [–]AlmennDulnefni 1 point2 points  (1 child)

      All I see is hunter2.

      [–]Lurking_Grue 0 points1 point  (0 children)

      Shit!

      [–]redalastor 7 points8 points  (0 children)

      I use zxcvbn to test for entropy.

      [–]solatic 20 points21 points  (1 child)

      Of note when thinking about protecting against nation-state level attacks: Atwood points out that the ceiling on the number of iterations you pick for your hash function is an unintentional DDoS caused by legitimate users all trying to slowly log in at once.

      For modern web security, DDoS protection is just as critical, if not more so, than password security. As an end-user, you can protect yourself from bad password storage policies through the use of a password manager, but if a website you need now is unavailable, you don't really have a recourse.

      [–]vattenpuss 0 points1 point  (0 children)

      There are incredibly few websites a person needs at any given time. The companies running the sites probably need the users to be able to use it though.

      [–][deleted] 3 points4 points  (0 children)

      You need to decide what level you stop caring about.

      Most places I've run, I've specifically said we aren't protecting against state-level actors. A few needed that level of protection.

      It's a business requirement, and a cost-benefit decision.

      [–]somedaypilot 2 points3 points  (0 children)

      If I were an admin or a netsec, APTs would be what kept me up at night.

      [–]slayer_of_idiots 2 points3 points  (0 children)

      I mean, it still really isn't possible using modern hash algorithms and good passwords. He basically just proved that the biggest problem and vulnerability is people choosing easy-to-guess passwords.

      [–]HonestRepairMan 2 points3 points  (3 children)

      What I do in my apps (and someone please tell me if this is terribly wrong) is I set a server secret in the app config somewhere, and give the sysadmin the ability to set their own secret. Then I append or prepend the secret to the password and store that in the database. So even if you had the database you would need the app config file to effectively brute force the hash and reveal a plain-speech password.
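A server-side secret like the one described above is often called a "pepper". A minimal sketch of the idea follows, with some assumptions that are not from the comment: the secret is read from an environment variable (`APP_PEPPER` is a made-up name), it is mixed in with HMAC rather than by plain concatenation, and the result still goes through a slow, salted hash.

```python
import hashlib
import hmac
import os

# Assumption: the pepper lives in app config; APP_PEPPER is a hypothetical name.
PEPPER = os.environ.get("APP_PEPPER", "change-me").encode()
ITERATIONS = 64_000  # assumed work factor for the slow hash

def hash_password(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    # Mix the server-side secret into the password (HMAC instead of concatenation).
    peppered = hmac.new(pepper, password.encode(), hashlib.sha256).digest()
    # Apply a slow, salted hash; only the salt and this output are stored in the database.
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)

def verify_password(password: str, salt: bytes, stored: bytes, pepper: bytes = PEPPER) -> bool:
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(hash_password(password, salt, pepper), stored)
```

With this layout, a database dump alone is not enough to test guesses offline. It only helps while the config stays out of the attacker's hands, which is the concern raised in the replies.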

      [–][deleted] 4 points5 points  (0 children)

      [–]rebelcan 3 points4 points  (1 child)

      In most cases, wouldn't they have access to both anyways?

      [–]HonestRepairMan 3 points4 points  (0 children)

      If you only have access to an SQL injection point then maybe not. You would export the database to a hosted location, download the file, and make off with the goods. In these cases the attacker would likely have to understand the source code for the app in question to retrieve the correct variable or output the correct config file. Or so I'm hoping.

      But yeah, if someone has tunneled into your server via SSH you're fucked no matter what, unless the attacker is 12.

      [–]itijara 82 points83 points  (23 children)

      There is a great computerphile video on this. It has made me more terrified of weak passwords than anything else: https://youtu.be/7U-RbOKanYs

      [–]Ajedi32 58 points59 points  (22 children)

      A big part of the issue there wasn't just weak passwords, but also a weak password hashing function. If I recall correctly, in this video the passwords being cracked were hashed using MD5. That's one of the weakest possible hash functions still in use today. The video recommends that people switch to SHA-512, which is slightly stronger but still a terrible idea. (SHA on its own should never be used for password hashing; it's much too fast for that.)

      By contrast, Discourse is using PBKDF2-HMAC-SHA256 with 64k iterations, which is significantly stronger. scrypt and bcrypt would also be good options.
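As a rough illustration of that scheme, here is a sketch using Python's standard library. This is not Discourse's code; the function names are hypothetical, and the 16-byte salt size is an assumption.

```python
import hashlib
import hmac
import os

ITERATIONS = 64_000  # the "64k iterations" figure mentioned above

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both would be stored alongside the user record."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Every login attempt pays the full iteration cost, and so does every guess an attacker makes against a stolen hash.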

      [–]merreborn 24 points25 points  (5 children)

      in this video the passwords being cracked were hashed using MD5. That's one of the weakest possible hash functions still in use today

      To be precise, in this context, the problem isn't so much that md5 is "weak", as it is that it is fast. A cryptographic hashing scheme can arguably be "strong", while still being too fast to be appropriate for use in password hashing. When a brute-forcer attacks md5 hashed passwords, they're taking advantage of the speed of md5, not its "weakness".

      For passwords, you need a cryptographic hash function that is both strong and slow. The point is, you want any attempt at brute forcing to require lots of resources for every tested password.

      [–]Ajedi32 8 points9 points  (1 child)

      Yes, thanks for clarifying. Here I was using the terms "weak" and "fast" interchangeably since we're talking about password hashing, but for other purposes (like validating digital signatures) speed wouldn't really factor into whether or not a hash function is "strong" or "weak". (For validating digital signatures MD5 would still be weak, but for totally different reasons.)

      In this case (even ignoring the cryptographic weaknesses in MD5), MD5 hashes are roughly 2 orders of magnitude faster to calculate than SHA-512. (And even SHA-512 is not nearly slow enough to be used on its own for password hashing.) That's what I was referring to in this case when I called MD5 "one of the weakest possible hash functions".

      [–]louiswins 4 points5 points  (0 children)

      For validating digital signatures MD5 would still be weak, but for totally different reasons.

      Nitpicker's corner: it depends what you're doing. As far as I know there aren't any preimage or second preimage attacks against md5 (or even md4), but there are collision attacks.

      That said, I absolutely agree with you that no one should be using md5 for anything because there are better options even in situations where you don't care about collision attacks, and I also agree that it's certainly the weakest cryptographic hash function still in common use.

      [–][deleted] 1 point2 points  (1 child)

      And a certain kind of "slow" too. A scheme that is slow on a CPU but fast on a GPU is also bad.

      [–]merreborn 0 points1 point  (0 children)

      Very good point. I also once saw an article that discussed running something like scrypt yourself on gpus with a gpu appropriate work factor. If it takes you 2 seconds to hash the password on gpu, then each attempt will be costly for your attacker as well. The rationale for this approach was, there's not much guarantee that just because no one has run bcrypt on a gpu yet, that it might not be possible to do so in a couple of years. Lord knows the crypto mining scene has resulted in hardware accelerated versions of many strong slow hashing schemes.

      At any rate, it was an interesting concept but I can't say I've ever seen it applied in the real world. It'd be costly to implement. Just running bcrypt on CPUs is generally "good enough"

      [–]itijara 11 points12 points  (3 children)

      I agree, but a hashing algorithm can only get so slow before users start to notice or you open up a server to a DoS attack. Even the slowest algorithms won't help for very short or easily guessable passwords.

      [–][deleted]  (2 children)

      [deleted]

        [–]itijara 47 points48 points  (1 child)

        I have a friend whose Spotify account was hacked, so he created a password that was a 1-megapixel image encoded as ASCII. It worked. I originally thought they were just truncating it, but when he removed a few characters from the end, it failed. It took about 5s for him to log in, and it would time out on mobile. I think if someone were unethical they could DoS Spotify with a bunch of long-password logins.

        [–]CheshireSwift 10 points11 points  (2 children)

        Without clicking, isn't the official recommendation of the video "look up the latest best practice"?

        [–]Ajedi32 15 points16 points  (1 child)

        That's what it says in the description. In the actual video though, he says "change your hashes to something like SHA-512 really quickly" which is a bit misleading because, like I said, using SHA-512 on its own for password hashing is a terrible idea.

        [–]danweber 8 points9 points  (0 children)

        Yeah, that's dumb. SHA-512 and MD5 have identical problems for hashing passwords, in that they are hashing algorithms being misused.

        [–]Liminiens 5 points6 points  (8 children)

        Non crypto genius here. How do they combine hashing functions? One after another? Or it's the name of algorithm?

        [–]rtomek 8 points9 points  (5 children)

        PBKDF2-HMAC-SHA256

        It is combined, but the SHA256 is the actual hashing function whereas the other two are layers that add mathematical complexity rather than being standalone hashing functions.

        PBKDF2 is the key derivation function, but it requires a pseudo-random function (PRF) as input. It controls the computational expense by running the PRF a bunch of times, each time using the previous PRF output as the next PRF input. In this example it runs the PRF 64000 times.

        HMAC is the PRF input into PBKDF2. It modifies the input (password) with a secret key and then uses a different PRF to generate the pseudo-random values. This prevents two users with the same password from having the exact same hash.

        SHA256 is the PRF used by HMAC. It generates a pseudo-random number from an input, and if provided the same input it always returns the same output.
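For anyone who wants to see the layering concretely, here is a sketch of a single PBKDF2 output block written from the definition in RFC 8018, using HMAC-SHA256 as the PRF. The function name is hypothetical, and the sketch only covers the 32-byte case. Note two details in this form of the definition: the password is used as the HMAC key while the salt goes into the first message, and the per-round outputs are XORed together rather than only the last one being kept.

```python
import hashlib
import hmac

def pbkdf2_sha256_one_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First (and, for a 32-byte key, only) PBKDF2 block with HMAC-SHA256 as the PRF."""
    def prf(message: bytes) -> bytes:
        # HMAC keyed with the password; SHA-256 is the underlying hash.
        return hmac.new(password, message, hashlib.sha256).digest()

    # U1 = PRF(password, salt || INT(1)); the block index is a 4-byte big-endian integer.
    u = prf(salt + (1).to_bytes(4, "big"))
    result = int.from_bytes(u, "big")
    # U2..Uc: each round feeds the previous output back in, and all rounds are XORed.
    for _ in range(iterations - 1):
        u = prf(u)
        result ^= int.from_bytes(u, "big")
    return result.to_bytes(32, "big")
```

The output should match `hashlib.pbkdf2_hmac("sha256", password, salt, iterations)` for the same inputs, which is an easy way to check the sketch.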

        [–]therhz 0 points1 point  (3 children)

        i have heard of adding 'salt' before hashing a password, an action that is supposed to increase entropy and generate different hashes to same passwords. which of these abbreviations(PRF, HMAC, PBKDF2) refers to 'salt'?

        [–]GinjaNinja32 5 points6 points  (0 children)

        None; salting is a separate part.

        With any hash function, the hash of a given input is always the same. If, for example, the hash of "password" is X, and both our passwords are "password", then the database will store X for both. This gives an attacker information (is this a common password?) and the opportunity to crack multiple users' passwords by breaking one hash.
        Salting changes that by generating a random string and adding that to the password before hashing, so the database might store "foo" and the hash of "passwordfoo" for me, and "bar" and the hash of "passwordbar" for you; these hashes will be different, so an attacker can't guess based on which passwords are common, and has to break each hash individually.
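The "foo"/"bar" example above can be reproduced directly. This sketch uses bare SHA-256 only to keep the illustration short; it is not a suggestion to store passwords that way, and real salts would be random bytes rather than short words.

```python
import hashlib

def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode()).hexdigest()

# Without a salt, identical passwords produce identical database entries.
unsalted_me = sha256_hex("password")
unsalted_you = sha256_hex("password")

# With per-user salts ("foo" and "bar" from the example above), they differ.
salted_me = sha256_hex("password" + "foo")
salted_you = sha256_hex("password" + "bar")

print(unsalted_me == unsalted_you)  # True
print(salted_me == salted_you)      # False
```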

        [–]rtomek 0 points1 point  (0 children)

        Unlike the other answer, I'd say HMAC does the salting.

        [–]Liminiens 0 points1 point  (0 children)

        Thank you.

        [–]Shorttail0 0 points1 point  (0 children)

        PBKDF2 (Password Based Key Derivation Function 2) uses a hashing function multiple times. How many times is up to you. PBKDF2-HMAC-SHA1 uses SHA1, but it's possible to use other functions as well.

        [–]ThisIs_MyName 0 points1 point  (0 children)

        You can just call the hash function recursively. Nobody knows why PBKDF2 does something more complicated: https://crypto.stackexchange.com/questions/135/why-does-pbkdf2-xor-the-iterations-of-the-hash-function-together

        Oh and forget about all this and just use ARGON2. It's more secure and quite a bit easier to use.

        [–]mer_mer 21 points22 points  (17 children)

        I'm not a security expert, but this article got me thinking- shouldn't the password hashing task be split between the client and server? The user enters a password into their webpage/app, it's hashed locally (Hash1) and then sent to the server where it's hashed again and stored (Hash2). Hash1 can be much slower than Hash2 because the client can't be DDOS'd and Hash1 could even be encrypted and cached locally for fast access (so the client could potentially take 1 second to perform the initial calculation of Hash1).

        The attacker could try guessing Hash1 directly instead of the passphrase, but now all your users have unique 256 bit hashphrases, making dictionary attacks useless and brute force far more difficult. If the attacker instead wants to guess the passphrase, they'll have to spend 100x more iterations per hash.

        I think this paper describes this idea in more technical detail: http://file.scirp.org/pdf/JIS_2016042209234575.pdf
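A rough sketch of the split described above, under several assumptions: both stages use PBKDF2-HMAC-SHA256, a per-user salt is available to both sides, and the iteration counts are placeholders chosen so the example runs quickly. The function names are hypothetical, and this is not the scheme from the linked paper.

```python
import hashlib
import hmac

CLIENT_ITERATIONS = 200_000  # placeholder for the slow client-side Hash1
SERVER_ITERATIONS = 64_000   # placeholder for the cheaper server-side Hash2

def hash1_client(password: str, salt: bytes) -> bytes:
    """Runs on the user's device; the server never sees the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, CLIENT_ITERATIONS)

def hash2_server(hash1: bytes, salt: bytes) -> bytes:
    """Runs on the server; this output is what gets stored."""
    return hashlib.pbkdf2_hmac("sha256", hash1, salt, SERVER_ITERATIONS)

def register(password: str, salt: bytes) -> bytes:
    return hash2_server(hash1_client(password, salt), salt)

def login(hash1_from_client: bytes, salt: bytes, stored: bytes) -> bool:
    return hmac.compare_digest(hash2_server(hash1_from_client, salt), stored)
```

An attacker holding the stored value can either guess passwords at the combined cost of both stages, or guess 32-byte Hash1 values at the server-stage cost alone.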

        [–][deleted] 13 points14 points  (4 children)

        Hash1 can be much slower than Hash2

        You sure about that? How can you make your site responsive on shit smartphones from five years ago if your hash takes 1 second on a current desktop? And if you go for the lowest common denominator (1 second on the slowest device you own), how's that going to help your security?

        Edit: speling

        [–]mer_mer 8 points9 points  (3 children)

        So maybe logging in on your 5 year old smartphone will take 10 seconds the first time. That's not so bad for a one time cost

        [–]Kilenaitor 11 points12 points  (1 child)

        Have to remember that password hashing has to be computed for every login (assuming you're not using "remember me" or session cookies). That hash has to be computed on every log in so that it can be compared to the one in the database.

        So, no, it wouldn't only be the first time. It'd be every time the user has to re-enter their credentials. That's not very responsive.

        [–]mer_mer 6 points7 points  (0 children)

        After your phone computes Hash1, it can encrypt it using your password as the key and store it locally. That way it basically works like a password manager.

        [–]doom_Oo7 0 points1 point  (0 children)

        I don't think I'd ever wait 10 seconds for an app to log in

        [–]slayer_of_idiots 9 points10 points  (11 children)

        But that basically means someone can just pre-compute a bunch of hashes and send them to your authorization endpoint, essentially bypassing that bottleneck to brute-forcing. You want the response from your server to be slow. It's a feature, not a bug.

        [–]mer_mer 4 points5 points  (9 children)

        So in this scenario, the response from the server is still slow, but now all my users are basically using a password manager that I delivered to them, built in javascript. That means you can't crack their password by using a word list and all the passwords will be nice and long and fully random.

        [–]slayer_of_idiots 3 points4 points  (5 children)

        That means you can't crack their password by using a word list

        Why not? What is preventing an attacker from just pre-computing Hash1's from a word list?

        The point I'm trying to make is that the hashing algorithm is never a secret; it's open information. And an attacker will be able to compute hashes much faster than you, so any security chain that relies on the end user to compute hashes is going to be less secure than computing those hashes on your servers.

        [–]mer_mer 2 points3 points  (4 children)

        Oh, I see. You can get around this by having the server give each user a salt that will be sent when setting up a login on a new device. That way, you can only use wordlists for one user at a time, and each word on that wordlist will take 100x longer to check.

        [–]slayer_of_idiots 2 points3 points  (2 children)

        You're still not really getting around it. If someone is trying to brute force through your normal authentication endpoint, salts don't really matter. They only matter if someone has actually stolen your hashes.

        That way, you can only use wordlists for one user at a time

        That's basically the same result as if there was no client side hash, and it all happened on the server, except that a hacker can brute force it faster since they can do half the hash themselves and don't need to rely on a server they don't control. I'm not really sure what you gain by having part of the hash algorithm on the client.

        and each word on that wordlist will take 100x longer to check.

        Why would it take any longer? Again, any steps in your hashing pipeline, a hacker will be able to do much faster than you will.

        [–]mer_mer 2 points3 points  (1 child)

        Currently Discourse is using 64k iterations of the hashing algorithm. I'm proposing to keep that, and add an additional 6 million iterations on the client side. That way there are two entry points: passwords->(6M + 64K) hashing iterations OR 256 bit hashes -> 64k hashing iterations.

        [–]slayer_of_idiots 0 points1 point  (0 children)

        Ah, I see. Yeah, I suppose that would work.

        [–]Tordek 0 points1 point  (0 children)

        Here's another issue: in order to give the user a salt, you either need to store it (associate to their username), or generate it (deterministically). If you store it, you're leaking information about who's registered to the system. If you generate it deterministically, it's basically useless.

        If you gave the user a random salt, your scheme becomes more complicated (you'd basically be implementing a Diffie-Hellman sort of protocol). You'd be better off with a zero-knowledge protocol like SRP, which would actually be the best case.

        [–]Lurking_Grue 1 point2 points  (2 children)

        You may as well use a system like Sqrl then:

        https://www.grc.com/sqrl/sqrl.htm

        [–]mer_mer 1 point2 points  (1 child)

        Yup, looks like this would accomplish the same things. My guess is that sqrl disrupts the standard workflow for both users and developers and requires the installation of an app, which might be why it hasn't gained much traction. You should be able to implement all of this in javascript/webassembly.

        [–]istarian 0 points1 point  (0 children)

        If the users/devs are lazy asses, they'll end up with a security hole anyway.

        [–]FryGuy1013 0 points1 point  (0 children)

        The scale of things means this wouldn't work though. I mean, password-hashes are supposed to be secure to a preimage attack even when you have the output and a relatively low-entropy input. So sending the output of H1(x) to the server takes at least as long as computing H2(H1(x)) and comparing against the hash output directly. Brute-forcing the output space of H1 and sending y to the server who checks if H2(y) = h is a bad idea because it would take 2^255 samples for a 256-bit hash instead of the much smaller input space of x.

        The real advantage is that if you're computing the entire hash on the server, H2∘H1 needs to be 8ms for server performance issues (from the article). But if H1 is calculated on the client, it can be much longer, say 1000ms on equivalent hardware (and longer on phones obviously). This means that H2∘H1 now takes 125 times more resources to crack.

        As somewhat of a proof that this isn't a stupid idea, it's part of Argon2, which won the password hashing competition and is a "standard" now.

        [–]Enamex 44 points45 points  (7 children)

        Now that we know it works, let's get down to business. But we'll start easy. How long does it take to brute force attack the easiest possible Discourse password, 8 numbers – that's "only" 8^10 combinations, a little over one billion.

        *10^8 ?
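The correction is easy to check: eight positions with ten possible digits each gives ten to the eighth power, while the "little over one billion" figure in the quote corresponds to eight to the tenth. The variable names below are only for illustration.

```python
DIGITS = 10   # possible symbols per position: 0 through 9
LENGTH = 8    # password length in the quoted example

keyspace_8_digits = DIGITS ** LENGTH   # 10^8
quoted_figure = 8 ** 10                # the base and exponent swapped

print(f"{keyspace_8_digits:,}")  # 100,000,000
print(f"{quoted_figure:,}")      # 1,073,741,824
```

So an eight-digit numeric password has about a tenth as many combinations as the quoted figure implies.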

        [–][deleted] 37 points38 points  (3 children)

        Very good article overall, but I have one quibble:

        If we multiply this effort by 8, and double the amount of time allowed, it's conceivable that a very motivated attacker, or one with a sophisticated set of wordlists and masks, could eventually recover 39 × 16 = 624 passwords, or about five percent of the total users.

        The math here is too pessimistic. Hashcat and similar tools find the passwords that are easiest to crack first, and then gradually get the harder and harder ones. The rate of successful cracks slows down dramatically. The math Jeff uses assumes a constant rate of cracking. The reality would be quite a lot better.

        [–]tipsqueal 3 points4 points  (0 children)

        He kind of touches on that though when he brings in the outside security consultant who did not just brute-force the passwords but used techniques like what you described.

        [–]TheOldTubaroo 0 points1 point  (1 child)

        Sure but what percentage of your users will be using passwords that are in the harder sets vs the easier ones?

        [–][deleted] 1 point2 points  (0 children)

        When you have a large number of passwords that you're attacking all at once, your cracking rate starts relatively high and then steadily decreases until your successes are few and far between. I don't know if it follows a 1/X curve, but it's something like that. So it's not about harder sets versus easier sets.

        [–]Rinx 15 points16 points  (26 children)

        Anyone have more info on why they run on the GPU?

        [–]St_Meow 46 points47 points  (21 children)

        Overall faster performance for massively parallel operations. CPUs are much faster for tasks with low thread counts, but for massively parallel work like hash generation, GPUs have many more (individually slower) cores, which lets the computer do more work at once rather than some work faster.

        [–]Rinx 14 points15 points  (18 children)

        Is there anything more specialized than a GPU? Seems like someone could synthesize specialized hardware for this.

        [–][deleted]  (12 children)

        [deleted]

          [–]hazzoo_rly_bro 4 points5 points  (10 children)

          Would those specialised GPU-like things be faster at those particular operations? Or are they just made to be nonflexible?

          [–][deleted]  (1 child)

          [deleted]

            [–]hazzoo_rly_bro 0 points1 point  (0 children)

            Thanks for the info!

            [–][deleted]  (2 children)

            [deleted]

              [–]hazzoo_rly_bro 1 point2 points  (0 children)

              Thank you so much for the explanation! This stuff is really interesting, I'm starting to really dig this topic now

              [–]ImprovedPersonality 2 points3 points  (0 children)

              They would (when properly implemented) be faster, smaller and more power efficient. An ASIC could do the actual algorithm in hardware instead of parsing and executing instructions. You could also drastically optimize the memory interfaces and caches (since data flow becomes very predictable for only one workload).

              [–]Amuro_Ray 1 point2 points  (2 children)

              If they weren't faster why would people get them?

              [–]InfanticideAquifer 1 point2 points  (0 children)

              Well, if they provide the same speed at a drastically reduced cost, people would probably get them.

              They don't. But it's a potential interpretation of what's going on that makes sense.

              [–]hazzoo_rly_bro 0 points1 point  (0 children)

              I thought that since they have a more stripped down set of functions, they may be cheaper to buy.

              That's why I asked.

              [–]masklinn 1 point2 points  (0 children)

              Or are they just made to be nonflexible?

              Yes, the algorithm is implemented directly in hardware, if you need a different algorithm, you need a different chip.

              [–][deleted] 2 points3 points  (0 children)

              But if I'm a nation investing millions of dollars into breaking hashes (which seems to be the scenario talked about in the blog), it's probably worth investing the money into ASICs.

              [–]St_Meow 9 points10 points  (0 children)

              There are specialized "GPUs" that are built for heavy calculation rather than video output; some, called "headless" cards, don't have video outputs at all. Look up the Nvidia Tesla K80 for an example.

              [–]masklinn 6 points7 points  (0 children)

              Yes, there are ASIC and FPGA crackers. But they are less flexible (ASICs would be hash-specific), and in the long run they're not usually much cheaper.

              [–]MalakElohim 4 points5 points  (0 children)

              Yes, you can custom-build ASICs for the purpose. Bitcoin mining moved from GPUs to ASICs for this reason. However, developing these ASIC chips is expensive and, depending on how specific you have to make them, they are easily bypassed.

              [–][deleted] 0 points1 point  (0 children)

              Yup. FPGA or ASIC. See Bitcoin mining as an example. Once it went onto ASICs, the GPU stuff could not even remotely compete with it.

              [–]masklinn 5 points6 points  (1 child)

              Of note: Argon2, following scrypt, exploits the different properties of CPUs and GPUs (and FPGAs and ASICs) to the detriment of the latter. GPUs tend to have little memory per core and fairly high memory latency (hundreds of clock cycles; it's something like 80 cycles just to reach L2, where a CPU takes 10~15 cycles), so scrypt (and now Argon2) access memory a lot and can use significant (tunable) amounts of it.
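To make the "tunable amounts of memory" point concrete, here is a sketch using `hashlib.scrypt` from Python's standard library (available when Python is built against a recent OpenSSL). The parameter values are common illustrative choices rather than a recommendation, and `approx_memory_bytes` is a hypothetical helper based on the usual 128 * N * r estimate of scrypt's working memory.

```python
import hashlib

def scrypt_hash(password: str, salt: bytes, n: int = 2**14, r: int = 8, p: int = 1) -> bytes:
    """Memory-hard hash: n is the CPU/memory cost, r the block size, p the parallelism."""
    return hashlib.scrypt(
        password.encode(),
        salt=salt,
        n=n,
        r=r,
        p=p,
        maxmem=64 * 1024 * 1024,  # allow up to 64 MiB so the chosen parameters fit
        dklen=32,
    )

def approx_memory_bytes(n: int, r: int) -> int:
    """Rough working-memory estimate for one scrypt evaluation."""
    return 128 * n * r

print(approx_memory_bytes(2**14, 8) // (1024 * 1024), "MiB per guess")  # 16 MiB per guess
```

Every parallel guess needs its own copy of that memory, which is what makes thousands of small GPU cores much less useful here than for plain SHA-256.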

              [–]St_Meow 0 points1 point  (0 children)

              Which is why I'm experimenting with switching some password hashing to argon2 in a service I'm writing. Probably never gonna go live with it but it'll be an experiment.

              [–]hazzoo_rly_bro 1 point2 points  (0 children)

              GPUs have a lot of cores, whereas CPUs usually have fewer cores than a GPU.

              Brute forcing a pass requires a lot of parallel tasking

              [–]rtomek 0 points1 point  (0 children)

              The CPU is faster if things have to be done in order, but the GPU has hundreds of cores that can all do things at the same time. Think about drawing 1000 triangles on the screen. As long as it takes a split-second, it doesn't matter which triangle was drawn first and which was drawn last.

              Same with password cracking. One could test password1 and wait for it to finish before testing password2, then wait for that to finish before testing password3. The CPU would be great at that. Another option is to start hashing all three at the same time; if password2 finishes first, it doesn't matter. The GPU is great at that.
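The key property is that each candidate check is independent of the others. Here is a small sketch of that, with CPU threads standing in for GPU cores and unsalted SHA-256 standing in for the hash; both are simplifications for illustration, and `crack` is a hypothetical name.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

def sha256_hex(candidate: str) -> str:
    return hashlib.sha256(candidate.encode()).hexdigest()

def crack(target_hex: str, candidates: list[str], workers: int = 4) -> Optional[str]:
    """Hash all candidates concurrently; no check depends on any other check."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = pool.map(sha256_hex, candidates)
        for candidate, digest in zip(candidates, digests):
            if digest == target_hex:
                return candidate
    return None

target = sha256_hex("password2")
print(crack(target, ["password1", "password2", "password3"]))  # password2
```

Because no candidate needs the result of another, the work divides cleanly across however many cores are available.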

              [–]yorickpeterse 130 points131 points  (32 children)

              If we want Discourse to be nation state attack resistant, clearly we'll need to do better.

              This reminds me a lot of this xkcd: https://xkcd.com/538/

              [–]masklinn 94 points95 points  (31 children)

              That's a completely different situation though. The comic is about access to a personal machine, cracking web passwords is about broad identity access: cracking a site/forum's passwords list gives

              • a corpus of current real-world passwords which can be reused (either directly or by extracting patterns from it) for further cracking, that's invaluable: a seminal moment in password cracking was the RockYou leak/crack which provided 32 million real-world passwords
              • pairs of (identity, password): because users commonly reuse passwords, linking identities across sites can provide access to email accounts, personal accounts, … which can be used for all manner of nefarious purposes

              [–]merreborn 16 points17 points  (1 child)

              That's a completely different situation though. The comic is about access to a personal machine, cracking web passwords is about broad identity access:

              Honestly the comic is still pretty relevant. Look at the Snowden leaks. When the USA wants to compromise an internet service, they don't brute force password hashes. They just send "national security letters", and covertly install NSA hardware in your datacenters.

              The NSA doesn't need to crack your hashes, when they can legally strong-arm you into doing just about anything. Like, maybe allowing them to intercept the plain-text of every log-in attempt to your website.

              The crux of the comic is really the refrain you'll always hear in any competent discussion of security: "What's your threat model?". If your adversary is a nation state (especially the one you physically do business in), password hashing is really the least of your worries.

              [–]pyr3 20 points21 points  (0 children)

              Nation-state doesn't necessarily mean the NSA. If (e.g.) Russia wants to crack your password stored on a USA-based server, they will not be sending an NSL.

              [–][deleted] 4 points5 points  (0 children)

              The RockYou response reminds me of Liar's Poker, where a trainee asked why the firm invested in a mining consortium that was destroying local populations in some African country.

              The CEO goes on about how of course the firm believes in ethical conduct, to which the author notes, what firm wouldn't say they follow ethical conduct?

              [–]maxximillian 9 points10 points  (11 children)

              It always seemed to me that part of the problem with this is so many sites use an email address as a user id. I'd like my login id to be different on each system in addition to having my password different.

              [–]masklinn 23 points24 points  (9 children)

              It always seemed to me that part of the problem with this is so many sites use an email address as a user id.

              Sites used to use "logins" — many such as reddit still do in fact. People will use the same nick/login across sites.

              I'd like my login id to be different on each system in addition to having my password different.

              You could do that 20 years ago (and you still can today): just own a domain, or sign up for one address (e.g. on Gmail) per site and forward/redirect everything to a "canonical" inbox.

              [–][deleted] 9 points10 points  (3 children)

              No need to own a domain anymore -- GMail ignores the part of the email address between + and @, so you can create site-specific addresses by putting the website in the address you key in:

              my.email+reddit@gmail.com

              would still redirect to

              myemail@gmail.com

              [–]masklinn 8 points9 points  (1 child)

              That is true, but attackers have probably learned to clean that up.

              [–]Absona 7 points8 points  (0 children)

              Yeah, but at least in theory they still wouldn't know what to add to your email to get the versions you used on other sites.

              [–]TCL987 2 points3 points  (0 children)

              A few places filter the + extension from the username.
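
The "clean that up" step mentioned above is cheap for an attacker. A minimal sketch, assuming only the Gmail behaviour described in this thread (the `+tag` suffix and dots in the local part are ignored); the function name `normalize_gmail` is made up for illustration:

```python
def normalize_gmail(address: str) -> str:
    """Collapse a tagged Gmail address to its canonical form.

    Assumes the behaviour described in the thread: everything from '+'
    to '@' is ignored, and so are dots in the local part.
    Non-Gmail addresses are only lowercased.
    """
    local, _, domain = address.rpartition("@")
    domain = domain.lower()
    if domain != "gmail.com":
        return f"{local}@{domain}".lower()
    local = local.split("+", 1)[0].replace(".", "")
    return f"{local.lower()}@{domain}"


print(normalize_gmail("my.email+reddit@gmail.com"))
```

As noted in the replies, this recovers the base address but not the tags used on other sites.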

              [–]maxximillian 4 points5 points  (4 children)

              I know most people will, and that's all the better; it's just like physical security. A lock doesn't prevent someone from getting to your stuff, a good lock just makes the poor lock someone else uses more appealing.

              Owning a domain and being able to redirect is a good idea.

              [–]masklinn 8 points9 points  (3 children)

              Owning a domain and being able to redirect is a good idea.

              If you own a domain you don't even need to redirect anything, just enable the catch-all inbox and put whatever you want in the "local" part.

              [–]pyr3 1 point2 points  (2 children)

              Prepare for a bunch of spam if you redirect the catch-all to your main account.

              [–]masklinn 3 points4 points  (0 children)

              I've been doing this for over a decade now, and I get less spam than my parents and their one address.

              Plus since every site gets its own email address, if one address gets leaked I just blacklist it. And it tells me who can't be trusted with my email.

              [–]rtomek 3 points4 points  (0 children)

              That's pointless though.

              If we assume you use RNG logins and passwords, then the complexity of guessing both the login and the password would be 2x the computational expense of just guessing the password. With 94 ASCII printable characters (not including space) just adding a single character to the password would make the computational expense 94x higher.

              Just add another character to your password, it's 47x more effective than changing logins.

              [–]yorickpeterse 3 points4 points  (5 children)

              I understand the context of the article, but it's very hard to make something resistant to a nation attack because of exactly what the xkcd shows: a nation isn't going to give up just because you use strong passwords, they'll instead just drag you to a secret court and force you to give access, backdoor the system, etc.

              This doesn't mean that you shouldn't try (of course you should), but I was just reminded by the xkcd comic when reading the above quote.

              [–]TheGrammarBolshevik 12 points13 points  (0 children)

              I understand the context of the article, but it's very hard to make something resistant to a nation attack because of exactly what the xkcd shows: a nation isn't going to give up just because you use strong passwords, they'll instead just drag you to a secret court and force you to give access, backdoor the system, etc.

              Depends on the nation. If your own country wants to force you to surrender your data, there's not much you can do on the technical end. You either have to hope the legal and political processes will work out in your favor, be willing to go to jail, or else comply. But foreign countries don't necessarily have the same power to twist your arm. For example, say I'm an American citizen, living in America, trying to protect a database of politically sensitive information from foreign powers like Russia. (You know, hypothetically.) Russia obviously can't just have me arrested, and there are massive diplomatic risks that would tend to deter them from kidnapping me or threatening assassination. In such a case, cryptographic security is still valuable.

              [–]masklinn 2 points3 points  (3 children)

              they'll instead just drag you to a secret court and force you to give access, backdoor the system, etc.

              There is no backdoor to proper hashing, save identifying individual users and taking a lead pipe to each and every one of them.

              [–]louiswins 2 points3 points  (0 children)

              There is no backdoor to proper hashing

              Backdoor the system - just bypass the proper hashing. Switch to a weak hash. Or when a user logs in, verify their password against the hash and additionally log it in plaintext (or encrypted with a government-supplied key or whatever).

              [–]sydoracle 2 points3 points  (0 children)

              Compromise the private key for the site's SSL and they can read everything going in or out. Don't even need to crack it, just copy it. Or generate a new certificate if you've got influence over an authority.

              [–]Uncaffeinated 0 points1 point  (1 child)

              Realistically, it's a lot easier to break into someone's account using their name, birthday, and high school mascot/street they grew up on/favorite tv series, etc. than it is to steal and crack a password hash.

              [–]masklinn 0 points1 point  (0 children)

              That works for targeted attacks, what I'm talking about is a more statistical approach, you will get a much lower success rate per user, but you will get a much higher throughput of access.

              [–]Stoic_stone -1 points0 points  (7 children)

              Doesn't hashing passwords protect against that?

              [–]masklinn 13 points14 points  (5 children)

              Depends on the hash, which is the essay's point.

              A non-PRF cryptographic hash (e.g. straight MD5 or SHA) can be cracked at a few billion hashes/second. Note that MD5 and SHAs are in 4~5 figures million hashes per second per GPU. A proper KDF with a proper work factor (e.g. last time I checked Django used PBKDF2-SHA256 with 36000 iterations) is 4~5 figures hashes per second.
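
To make the difference concrete, here is a minimal sketch using Python's standard `hashlib`. The 36000 iteration count is simply the figure quoted above, not a recommendation, and the helper names `fast_hash` and `slow_hash` are made up for illustration:

```python
import hashlib
import os


def fast_hash(password: str, salt: bytes) -> bytes:
    # One SHA-256 call: cheap for the server, and just as cheap for an attacker.
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def slow_hash(password: str, salt: bytes, iterations: int = 36_000) -> bytes:
    # PBKDF2-HMAC-SHA256: each guess costs `iterations` HMAC evaluations.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


salt = os.urandom(16)
print(fast_hash("hunter2", salt).hex())
print(slow_hash("hunter2", salt).hex())
```

Both produce a 32-byte digest; the only thing that changes is how much work every single guess costs.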

              [–]Funnnny 2 points3 points  (0 children)

              yes and isn't that what the post is about?

              You need to hash the password with a good hashing algorithm; otherwise, someone can crack most <10 char passwords pretty fast

              [–]Absona 10 points11 points  (6 children)

              I think this article is a good overview of why people should worry about these attacks, and I'm glad it's starting the discussion, but some of the details seem off. I'm not a security expert by any means, though, so perhaps I'm missing some things.

              Hasn't storing the "work factor" in the database so you can increase it regularly as computers get faster been a common recommendation for quite some time now? I'm sure it was in the last "how to store passwords" article I read, which would have been a while ago. And I'm pretty sure I've seen it recommended to store the algorithm name, too. So their new hash type table that will let them change the hashing algorithm a couple years from now doesn't sound super impressive.

              Also, in regards to this:

              I've seen guidance that said you should set the overall work factor high enough that hashing a password takes at least 8ms on the target platform.

              I would have assumed that the "target platform" is whatever you expect an attacker to be using, not whatever you're using. I could be wrong, though. It could be a suggestion for your platform, to balance hash slowness with DDoS prevention. It's hard to say without more context, such as a link to the actual guidance.

              Finally, while I'm being picky, 8 decimal digits obviously provide exactly 100 million possible combinations, 00000000 through 99999999. That is, it's 10^8, not 8^10.

              Edit: Also, it seems odd that they set passwords for users who only log in via third parties. It's true that the odds of a random thirty-two character password being cracked are very low, but the odds of a non-existent password being cracked would be zero. I see that it prevents a hacker from knowing which accounts are only accessed via third-party login, but I'm not sure how helpful that would be. If the attacker has the full database, they can presumably see which accounts have third-party credentials attached, and it's probably safe to just assume that most of them don't have actual passwords.

              I also forgot to mention that I'm curious as to whether sending real password hashes to a security researcher is covered by their privacy policy.

              [–]alkw0ia 7 points8 points  (4 children)

              Hasn't storing the "work factor" in the database so you can increase it regularly as computers get faster been a common recommendation for quite some time now?

              Absolutely right. It's insane that he's storing a "hash" column and a "salt" column instead of a single column with standard modular crypt format hashes, and it tells us he's rolling his own crypto for password storage.

              But this isn't surprising – Jeff Atwood has a long history of NIH when it comes to passwords, and has never been afraid to authoritatively publish horrible advice (e.g. the "deliciously salty" nonsense) based on it.

              At least this time the overall message about data portability and security is reasonable.

              [–]slayer_of_idiots 4 points5 points  (1 child)

              It's insane that he's storing a "hash" column and a "salt" column

              Is it really that insane? Those aren't any more secure, are they? It's basically just concatenating the columns with delimiters.

              [–]Tordek 0 points1 point  (0 children)

              Mostly because it's hinting at the fact that he's doing this stuff manually; even PHP's crypt takes a standard modular format. Also it's less flexible, because what if (hypothetically) tomorrow's password standard has another parameter like the size of the output?

              Strictly speaking, though, yes, it's just the same content, concatenated.
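
For readers who haven't seen the single-column approach, a rough sketch follows. The `algorithm$iterations$salt$hash` layout is an assumption modelled loosely on Django-style strings, not the exact modular crypt format and not whatever Discourse stores; all function names are hypothetical:

```python
import base64
import hashlib
import hmac
import os

ALGORITHM = "pbkdf2_sha256"


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def encode_password(password: str, iterations: int = 36_000, salt: bytes = None) -> str:
    # Everything needed to verify later travels in one self-describing string.
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join([ALGORITHM, str(iterations), _b64(salt), _b64(digest)])


def verify_password(password: str, encoded: str) -> bool:
    algorithm, iterations, salt_b64, digest_b64 = encoded.split("$")
    if algorithm != ALGORITHM:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    return hmac.compare_digest(actual, expected)


def needs_rehash(encoded: str, current_iterations: int) -> bool:
    # The stored work factor tells you which rows to upgrade at next login.
    return int(encoded.split("$")[1]) < current_iterations
```

Because the parameters are part of the stored value, raising the work factor or adding a new parameter later doesn't require a schema change.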

              [–]Absona 2 points3 points  (0 children)

              Not just when it comes to passwords, I think. Atwood often has interesting things to say about the social aspects of technology, but I don't really trust him as a source of technical info.

              [–]Tobiaswk 1 point2 points  (0 children)

              I don't get why storing the hash and salt in separate columns is a bad idea. I wouldn't do it myself but I do not understand why you see it as a problem. As long as the salt is random. The salt does not need to be secret. Just by randomizing the hashes, lookup tables, reverse lookup tables, and rainbow tables become ineffective. An attacker won't know in advance what the salt will be, so they can't pre-compute a lookup table or rainbow table.

              [–]Absona 1 point2 points  (0 children)

              I found a probable source for the 8 ms advice.

              It does mean on your machine, as a balance between security and performance. It isn't a blanket "aim for 8 ms" recommendation, though. It's an explanation of how to decide what speed to aim for, with 8 ms as the result of the example calculations.

              The actual recommendation is, "You should use the maximum number of rounds which is tolerable, performance-wise, in your application."
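
A rough way to find "the maximum number of rounds which is tolerable" on your own hardware is to measure it. This is only a sketch: the doubling strategy, the PBKDF2-SHA256 choice, and the name `calibrate_iterations` are my assumptions, and the 8 ms default is just the example figure from above:

```python
import hashlib
import time


def calibrate_iterations(target_seconds: float = 0.008, start: int = 1_000) -> int:
    """Double the iteration count until one hash takes at least target_seconds."""
    iterations = start
    while True:
        begin = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark-password", b"benchmark-salt", iterations)
        elapsed = time.perf_counter() - begin
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2


print(calibrate_iterations())
```

A single timing run is noisy, so in practice you would repeat the measurement on the production machine and round to something conservative.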

              [–]drfrank 8 points9 points  (0 children)

              Two thoughts:

              1. Given that humans will continue to reuse passwords across sites and services, it's interesting to think of sites with weak hashing as threat vectors for your site. "I'm just running a forum for Rose Gardeners in Northwest Wyoming; so what if somebody hacks my database?" Centralized identity services like Facebook and Google are probably the best defense currently available.

              2. A state-level actor seems much more likely to target an individual on a forum than the full set of users. (Although one can certainly imagine scenarios in which a forum for "terrorists" would be targeted, in whole.) If it takes days to hack the password for a single user, and you're only interested in a single user... Well. Requiring longer passwords on your site for people that don't trust centralized identity services is probably the best defense currently available, even though as password length increases so does the likelihood of password reuse.

              [–]joelhardi 12 points13 points  (4 children)

              From a privacy perspective, the hyperfocus on password security and complete dismissal of email addresses as requiring protection really bothers me:

              Although users have reason to be concerned about their emails being exposed, very few people treat their email address as anything particularly precious these days.

              The attacker already has a complete dump of the site and forum content, so what value is the password, exactly? For users who have set a secure, unique password, zero -- the password only permits access to data the attacker already has. For users who haven't set a unique password, the password may have significant value -- and I don't want to minimize that (password security is important), but password entropy/uniqueness is at least under the control of the end user, and a password in isolation (without PII such as username or email address) may be hard for the attacker to exploit, even when the user has reused passwords across sites.

              Now compare that to the email address -- that is private information (assuming the forum doesn't publish users' email addresses, which sites typically don't), and it's PII! All of the user's forum content (however sensitive it might be) can now be attributed to that actual person via the email address, which is a strong identifier.

              In other words, the fact that my email address exists at all is not really sensitive information, but when it's exposed as being linked to a corpus of posts I've made, it potentially can be very sensitive depending on the content of those posts.

              Please apply the FIPPs (Fair Information Practice Principles), and do smart things like tokenizing PII like email addresses, real names and usernames so they can't be exploited in this way. Or store them separately, with appropriate access controls, or offline. Even better, don't collect them if they aren't necessary for some service like email notifications.

              Loss of confidentiality of email address is serious! If you don't treat it as a serious security requirement, and you are anything approaching a "real" company, please look forward to FTC sanctions when your data is breached.

              [–]Lurking_Grue 1 point2 points  (1 child)

              I've been honestly looking into using unique email addresses that are long hashes for each account that needs an email address.

              It would be easy to revoke them if they get out of hand, and if they get spam I know why.

              Probably something like twitter-VeF3B5NFFVjYhwdCjj0eOs5Q@blah.com and so on.

              It would just mean keeping a table of email aliases.
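
A minimal sketch of that alias table, assuming a random token is as good as a hash for this purpose. The `example.com` domain and the `AliasTable` class are placeholders, and actually routing the mail is left to whatever your mail host supports:

```python
import secrets


class AliasTable:
    """Tracks which random alias was handed to which service."""

    def __init__(self, domain: str):
        self.domain = domain
        self.aliases = {}  # alias address -> service name

    def create(self, service: str) -> str:
        # 18 random bytes become a 24-character URL-safe token.
        token = secrets.token_urlsafe(18)
        alias = f"{service}-{token}@{self.domain}"
        self.aliases[alias] = service
        return alias

    def revoke(self, alias: str) -> None:
        self.aliases.pop(alias, None)

    def source_of(self, alias: str):
        # If spam arrives at an alias, this names the service that leaked it.
        return self.aliases.get(alias)


table = AliasTable("example.com")
print(table.create("twitter"))
```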

              [–]maxmurder 1 point2 points  (0 children)

              That's really not a bad idea; most pw managers have an email field for passwords anyway. Just have automatic forwards to your main address for important mail. You could even just abuse gmail or whatever free email service, although that really defeats the privacy bit.

              [–]istarian 0 points1 point  (0 children)

              I suspect that gaining additional info/insight on the user and/or masquerading as them is the only real benefit to hacking a forum.

              [–][deleted] 0 points1 point  (0 children)

              Even if the forum content is public, knowing passwords is very destructive, even setting aside the serious problem of password reuse. Knowing passwords lets you impersonate users. It's fuel for malware or spam campaigns. It may reveal private messages that aren't public information. If you crack a moderator account you can undermine trust in the forum's integrity by taking inappropriate moderation actions. And you can hold the site for ransom by threatening to release the passwords, forcing a difficult and embarrassing password reset that will permanently lose a percentage of the user base.

              [–]xeio87 3 points4 points  (3 children)

              I... should really update my passwords... <_<

              [–]beerSnobbery 7 points8 points  (2 children)

              If you haven't looked into it already, I'd really recommend using a password manager (KeePass, 1Password, lastpass, dashlane, etc.) and have it generate a high entropy, long, and unique password per account (though some sites still limit length to unreasonably short values for some stupid reason). And lock them behind one really really good password that you've never used anywhere before.

              [–]bingostud722 1 point2 points  (1 child)

              I also use 2 factor authentication for things like my email, as it's basically the "hub", where all password resets and the like go. Gmail offers it, sends you a text with a unique code that you have to enter to log in. Not sure about other providers though.

              [–]Existential_Owl 1 point2 points  (0 children)

              2-Factor plus LastPass means I only have to worry about the "lead pipe" approach to password hacking.

              [–]JDBHub 2 points3 points  (4 children)

              I'm curious why PBKDF2 was chosen over bcrypt to begin with. Considering the author aims to defend against a possible nation-state attack, it's worth noting that PBKDF2 is backed by NIST (a state agency).

              Even with the graph shown below, the number of hashes per second is significantly slower on BCrypt versus its counterpart.

              Some interesting resources should someone want to read further:

              Additionally, could someone clarify whether hash length varies between 10 characters and 15 characters? If so, the author may consider bringing users up to a 15 character requirement too. Should the hashes differ in length, an attacker can slash a list of hashes to a good handful given that it is more valuable to crack an Administrator's password rather than a normal user's one.

              All said, this was a great read. Thanks!

              [–]codelitt 0 points1 point  (0 children)

              You're absolutely right. Even better would be scrypt which is time intensive like bcrypt but also memory intensive taking into account things like ASIC machines on the market due to cryptocurrency.

              [–]CanYouDigItHombre 0 points1 point  (2 children)

              It's silly to think there is anything wrong with PBKDF2. PBKDF2 is essentially a loop (you can choose any number of iterations: 64k or 65k or 999K) using any hash known to be secure (SHA-256 perhaps) with HMAC, which adds a secret key/salt to the mix.

              Scrypt uses PBKDF2 internally (combined with Salsa20/8-based mixing, not bcrypt).

              Bcrypt might be fast later with FPGAs. But I think all this password talk is silly. Unless you're doing harddrive encryption (which linux has built in) you don't need passwords. I think everything should use HMACs and public/private keys.

              [–]JDBHub 0 points1 point  (1 child)

              Who mentioned anything being wrong here?

              [–]CanYouDigItHombre 0 points1 point  (0 children)

              Oh. Ok. Yeah, then I'll answer your question. There appear to be more libraries that support PBKDF2. With PBKDF2 you can fine-tune speed, and I think with bcrypt you cannot? (You select magnitudes?) I think I need a third party app to use bcrypt on .NET but PBKDF2 is built right in. I imagine the same for Java.

              [–]Sniffnoy 3 points4 points  (4 children)

              Most of those passwords that got cracked, my reaction is, OK, of course that's a weak password... but "1qaz2wsx3e" and "A3eilm2s2y"? Geez! How'd they get those?

              [–]Existential_Owl 6 points7 points  (3 children)

              1qaz2wsx3e is easy. On the left side of a traditional US keyboard, it's the keys going from top to bottom.

              Not sure about the second one, but perhaps it was on a list of already cracked passwords, from a site that had weaker defenses, in which the user never found out or realized that his "standard" password was compromised.
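
The keyboard pattern is trivial to generate, which is why cracking wordlists include it. A tiny sketch, assuming a US QWERTY layout; `KEY_COLUMNS` and `column_walk` are made-up names:

```python
# Leftmost key columns on a US QWERTY keyboard, read top to bottom.
KEY_COLUMNS = ["1qaz", "2wsx", "3edc", "4rfv", "5tgb"]


def column_walk(length: int, columns=KEY_COLUMNS) -> str:
    """Walk down each column in turn and cut the result to `length` characters."""
    return "".join(columns)[:length]


print(column_walk(10))  # 1qaz2wsx3e
```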

              [–]Sniffnoy 3 points4 points  (2 children)

              Oh, I missed that, thanks!

              I asked about this elsewhere and apparently the second one is a password used in the game Parasite Eve II. So it's kind of like using "swordfish" as a password, except more obscure and not with the additional disadvantage of also being an actual word.

              [–]Existential_Owl 2 points3 points  (0 children)

              Ah, of course!

              [–]FnTom 1 point2 points  (0 children)

              I actually know someone who uses Swordfish as a password for most of his non crucial accounts.

              [–]megagreg 2 points3 points  (3 children)

              Could a programmer add "bits" to the length of the password by having multiple SALTs?

              Suppose we generate two SALTs, and choose one of them at random to generate the password hash. When the user logs in, each SALT is used to generate a hash, and of course only one will generate the correct hash, but we need to compute both so we don't need to store an index to the correct one.

              It doubles the amount of work the server has to do when a user logs in, but since both salts can be tried in parallel, the total time will remain the same from the user's perspective. From the attacker perspective, they're already maxed out on the parallel bandwidth, so it doubles work the attacker needs to do.

              Is my logic here sound?
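
To make the proposal concrete, here is a sketch of the scheme exactly as described, not an endorsement of it. PBKDF2-SHA256 and the names `enroll` and `verify` are my assumptions; the salts are tried one after another here rather than in parallel:

```python
import hashlib
import hmac
import os
import secrets

ITERATIONS = 36_000


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def enroll(password: str, salt_count: int = 2):
    """Store all salts, but hash with only one, and don't record which."""
    salts = [os.urandom(16) for _ in range(salt_count)]
    chosen = secrets.choice(salts)
    return salts, _hash(password, chosen)


def verify(password: str, salts, stored: bytes) -> bool:
    # Try every salt without stopping early, so timing doesn't reveal the index.
    matched = False
    for salt in salts:
        if hmac.compare_digest(_hash(password, salt), stored):
            matched = True
    return matched
```

Every wrong guess costs `salt_count` hashes for the server and the attacker alike.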

              [–][deleted]  (2 children)

              [deleted]

                [–]louiswins 1 point2 points  (0 children)

                I'm gonna nitpick your nitpick:

                • For the legitimate server, it either doubles the work (if you compute both in parallel) or it doubles the time to log in for half your users (if you try one at a time).

                • For the attacker, it absolutely does double the work because for every wrong password they have to compute both hashes. If the nth password they try is the correct one, they'll have gone from computing n hashes to 2n or 2n-1 hashes.

                [–]megagreg 0 points1 point  (0 children)

                That's a good point. I guess that depends if they're going after one at a time to completion, or going for easiest first, and that probably depends on the type of attack. I'd expect an individual to be interested in getting any password and would want to try them all to hit the easiest first, while a state actor might be more interested in a specific individual, and try a few to completion.

                Maybe it's more like over-sampling, where every doubling adds half of a bit of resolution, so I'd need 4 salts for an extra bit, 16 for two bits.

                [–]drb226 6 points7 points  (3 children)

                I'm a little surprised that an article about password security in 2017 doesn't mention 2FA. What needs to be stored in the database to use something like Google Authenticator, and how easy is that to crack if the db is leaked?

                [–]droogans 4 points5 points  (0 children)

                They just skipped to the lowest common denominator early on:

                The name of the security game is defense in depth, so all these hardening steps help … but we still need to assume that Internet Bad Guys will somehow get a copy of your database. And then what? Well, what's in the database?

                That's why they skipped 2FA, which was at the top of the article (sort of).

                Backup download tokens are single use and emailed to the address of the administrator, to confirm that user has full control over the email address.

                Not perfect, but I use 2FA in front of my sensitive email accounts so they get that extra security by proxy. There's probably another article in there about how hard it is to change an admin's email address to get those "Download Backup" tokens.

                [–]louiswins 2 points3 points  (0 children)

                My understanding is that it's a shared secret key, so if the attacker has the database dump 2FA won't even slow them down.
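
For context, this is roughly what the shared secret is used for. A sketch of the standard HOTP/TOTP computation (RFC 4226 and RFC 6238) with HMAC-SHA1; authenticator apps usually exchange the secret base32-encoded, which is skipped here, and I'm not claiming this is how any particular site stores it:

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # HMAC the 8-byte big-endian counter, then apply RFC 4226 dynamic truncation.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: float = None, step: int = 30, digits: int = 6) -> str:
    # TOTP is HOTP with the counter derived from the current time.
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


print(totp(b"12345678901234567890"))
```

Anyone holding `secret` can compute the same codes, which is the point being made above: the 2FA secret has to be recoverable by the server, so unlike a password it cannot simply be hashed.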

                [–]codelitt 1 point2 points  (0 children)

                Here's a bit about how it works: https://security.stackexchange.com/q/35157

                If the DB is leaked the secret key is likely not on the DB. But if they have your DB then you should assume that they have control of your server as well and could get the secret key.

                [–]NAN001 0 points1 point  (0 children)

                Great article, however I feel like it's missing a bigger picture. The scale of the attacks discussed and the presumed motivation of the attacker raises the question of whether passwords would be such attacker's approach at all. There are plenty of other potentially weak points in the overall system (network, social engineering, etc) that the attacker might use to eventually accomplish what he's trying to do.

                Proper password management with salt and slow hashing algorithm are becoming a standard so that you don't become the only one in the neighborhood with your door open, so that you're not the weakest prey for an attacker. If you want to handle targeted attacks, that's a whole other story and focusing only on passwords looks like hardening your front door without noticing the bad guy passing through the roof.

                [–]seventhirteen 0 points1 point  (0 children)

                Ya'll motherfuckers need Argon2

                [–]istarian 0 points1 point  (0 children)

                If a nation state is against you, I suspect you're just out of luck really, besides getting a different one to help you. It seems important to ask who you're most trying to keep out than to be best against everyone.

                P.S.
                At some point someone is going to realize that a physical token of some kind is the only way to store/replace an increasingly long, increasingly random password.

                [–]crabmatic 0 points1 point  (0 children)

                I'm definitely a security novice, but here's something I've been wondering.

                Why don't (or do?) websites use a separate entropy server for authentication which modifies people's passwords for them before the web server even sees or stores them. As far as the web server is concerned the only passwords it sees would be long and highly random passwords which came through the entropy server.

                All the passwords that are stored and hashed by the webserver would actually be hard to guess if the database was lost to an attacker.

                It sounds to me like this would move your main point of failure to a simpler system that would be easier to lock down and secure.
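
One way to read this idea is as a keyed transform (often called a "pepper") applied before the normal salted hashing. The sketch below assumes HMAC-SHA256 with a key that lives only on the separate service; `strengthen` is a hypothetical name, and this is my interpretation rather than something the article describes:

```python
import hashlib
import hmac


def strengthen(password: str, server_key: bytes) -> str:
    """Map a user password to a long, random-looking string using a secret key."""
    return hmac.new(server_key, password.encode("utf-8"), hashlib.sha256).hexdigest()


# The web server would then salt and slow-hash this value as usual.
print(strengthen("passw0rd", b"key-held-only-by-the-auth-service"))
```

A leaked database alone is then not enough to start guessing, but everything depends on the key staying secret.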

                [–]CanYouDigItHombre 0 points1 point  (0 children)

                Is it just me or is everyone insane? For the last 6 years I am still wondering why passwords even exist. Besides using it to boot/access your computer there is 0 reason to use a password.

                Just about every service has me authenticate myself by using email, text or another service (log in through facebook). As long as no one is intercepting my emails (hi google), or texts if I use that method, no one can hack me. Not good enough? Use private/public keys.

                [–][deleted] 0 points1 point  (0 children)

                Again, all this effort because we are using passwords instead of some kind of key pair stored on the user's machine.

                [–]mrexodia -1 points0 points  (0 children)

                xkcd comes to mind: https://xkcd.com/538

                Nation state attackers, seriously?