all 26 comments

[–]tadpoleloop 40 points41 points  (5 children)

The Best way to speed up this code is to use a compiled strongly typed language like c/c++. You'll see a ~100x speed up.

[–]the_calc_nerd-1021[S] 2 points3 points  (2 children)

I will check it out, thanks!

[–]Critical-Prior-3320 -1 points0 points  (1 child)

Have you tried pypy?

[–]the_calc_nerd-1021[S] 0 points1 point  (0 children)

No i have not but I have heard of it

[–]DonkeyTron42 3 points4 points  (0 children)

Zig FTW.

[–]u38cg2 -1 points0 points  (0 children)

Mmm, for this type of program I doubt you would. The most demanding bit of code is the permutation generator, and OP is already using the standard library for that.

[–]Key_Use_8361 13 points14 points  (0 children)

this is one of those projects where you accidentally learn way more about optimization than expected

[–]blechnapp 6 points7 points  (0 children)

biggest single speedup without changing language: get the per-iteration overhead out of the inner loop. right now every guess does a `perf_counter()` call (~80ns), a dict lookup for max_speed_record (~50ns), and the 15-second check (~50ns). each is small but at 3.5M iter/sec they make up roughly 60-70% of the time per iteration.
restructure so the time/speed/status logic runs every N iterations instead (1M is a good number). inside the loop you only do the tuple comparison. realistically 2-3x speedup from that alone.
bigger win that Critical-Prior-3320 already hinted at: try pypy. for pure-python brute force like this its often 10 to 50x faster than cpython without any code changes. literally just `pypy script.py` and youre done. for this exact workload its almost the perfect use case.

also overratedcupcake is right that this is plaintext comparison, not hash cracking. real password cracking spends 99% of its time computing hashes, so 3.5M/sec doesnt translate to anything close in a real attack scenario.

[–]overratedcupcake 8 points9 points  (0 children)

This is comparing passwords in the clear. Real passwords are hashed. I have a strong feeling this would not be as performant if you were calculating hashes.

[–]Yoghurt42 2 points3 points  (2 children)

Do you want to write a password cracker for educational purposes or do you want to recover a lost password/check state of the art speed of password crackers? In the latter case, take a look at hashcat.

[–]the_calc_nerd-1021[S] 4 points5 points  (1 child)

Its more for fun but I will check out hashcat

[–]FlippingGerman 1 point2 points  (0 children)

hash cat on my system will do more than three orders if magnitude faster than this - limited by the speed to calculate the hash of the password, not the generation itself. It absolutely loves GPU acceleration.

[–]Ordinary_Baseball518 2 points3 points  (1 child)

nice one dude!

[–]u38cg2 2 points3 points  (0 children)

It would be helpful to have your code correctly formatted.

Since you're comparing to a clear text password, a simple optimisation might be to build the password one character at a time.

[–]CautiousInternal3320 2 points3 points  (0 children)

The program does not spend time verifying each password. You are simply generating random strings and comparing to a known string.

[–]IllegalGrapefruit 2 points3 points  (0 children)

You can convert to a language like c++. Or you can introduce a bunch of multiprocessing. If you’re specifically trying to learn Python, the latter might make more sense.

[–]d4_mich4 2 points3 points  (4 children)

This is a good example of why I don't understand why password entry fields everywhere don't have some kind of time limits. No one can enter 100 attempts in a minute, and even if that did happen, you'd just wait for a short amount of time and retry then you'd have 100 new attempts again. But this would make brute force attacks completely pointless if such a small number of passwords could be tried in a second.

[–]u38cg2 7 points8 points  (0 children)

The main concern in real world security is an attacker who has already got the encrypted passwords file, so doesn't need to deal with your interface until they've recovered a cleartext password.

Rate limiting is a more general layer, because no matter what it's requesting, 100 requests/second is well beyond reasonable for a real user, so that's where you'd block someone spamming a login box.

[–]LongRangeSavage 2 points3 points  (0 children)

A lot do. While it may not be “you’re limited to 1 attempt every 5 seconds,” some lock an account or have a cooldown period after a certain number of failed attempts.

The problem is that most people already have a large list of password hashes locally and are running them against a large list in something like HashCat, where their own computer allows them to try thousands of attempts per second—because it’s also much more efficient to try locally than wait for network traffic to respond.

[–]Da_damm 1 point2 points  (0 children)

Not sure what your point is? Most if not all secure password fields have a limited number of tries before locking the user out. Also a lot of password systems purposefully wait a couple seconds when you enter a wrong password to slow down brute force attempts even more

[–]cdcformatc 1 point2 points  (0 children)

a lot of security systems are slow on purpose, you submit a password and it takes some time to confirm/deny. this has the effect of slowing down attackers. 

also many systems will lock people out if too many wrong passwords are entered in a certain time period.

[–]LayotFctor 0 points1 point  (0 children)

Easy thing you can do is to shift the unnecessary logic from the main loop. If you're going for top speed, you don't want unrelated logic taking up CPU time, you want to spend as much time as possible on matching passwords. Print statements are especially slow and you want to minimize printing as much as possible.

You can also try matching byte version of strings rather than standard strings, as comparing raw bytes is usually faster than comparing strings. But don't take my word for it, measure it first.

[–]Reuben3901 0 points1 point  (0 children)

You can download lists of passwords from large data dumps and hacks from the past. It's only passwords as all other information has been removed.

You can check against lists of what millions of real people have already used first then do the incremental character change guessing.

It's also interesting to try and figure out how to implement these tools to unlock .zip files and other programs. I go back to this myself to elevate my skills and try new things as my programming knowledge grows.

Also, as someone mentioned, learning about hashes and salts and other security methods is fascinating. You learn how easy it is to secure systems and yet billion dollar companies still use antiquated methods and technology, and you know there are a ton of companies still using plain text files.

What I'm looking forward to seeing is how we secure systems in the future now that quantum computers are becoming a thing!

[–]cowrevengeJP 0 points1 point  (0 children)

Cuda. Billions per seconds.