A meaningful sentence is less secure than a list of randomly chosen words because the key space is smaller: there are far fewer meaningful combinations of words than there are random combinations of words.

A hack attack does not have to be a brute force attack.

When most people think about hacking a password, they usually think of one of two things: either trying every possible combination of characters, one after another (a, then b, then c, and eventually aa, then ab, then ac, and so on) or trying a dictionary attack (password, god, money, secret, letmein, sex, abc123, and so on). And both of those kinds of brute force attacks do sometimes get used.
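To make the two approaches concrete, here is a minimal Python sketch of how each one generates its guesses. The function names and the lowercase-only alphabet are my own assumptions for illustration; a real cracking tool would use a much larger character set and word list.

```python
from itertools import count, islice, product
from string import ascii_lowercase


def brute_force_candidates(alphabet=ascii_lowercase):
    """Yield every string over `alphabet`, shortest first: a, b, ..., z, aa, ab, ..."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def dictionary_candidates(wordlist):
    """Yield guesses from a list of likely passwords, in order."""
    yield from wordlist


# Peek at the first few brute force guesses (the generator itself never ends).
first = list(islice(brute_force_candidates(), 28))
print(first[:3], first[25:])

# A dictionary attack only tries words somebody is likely to have picked.
print(list(dictionary_candidates(["password", "god", "money", "letmein"])))
```

The brute force generator will eventually hit every password, but the number of guesses grows very quickly with length, which is why a dictionary of likely passwords usually gets tried first.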

But it's more common that a hacker will break into a Web site and steal the list of hashed passwords.

When passwords are stored on a server, they are stored as "hashes". Essentially, the Web site takes the password you entered, performs some mathematical operations on it, and then stores the result, which is called a "hash". (Note that I've simplified a little bit here, but what follows is basically correct.)

As an example (this is a hypothetical example only, and doesn't correspond to a real-world hashing function), when you create an account on a Web site, the Web site might:

1. Take the password you type in and add "andsosayweall" to the end of it. (Adding something to the end of a password is called "salting" the password, and helps protect against "rainbow table" hack attacks);

2. Take the first letter of the password, do a binary XOR of that and the second letter, take the result and do a binary AND of the third letter, take the result and do a binary NOT of the fourth letter, and save the result;

3. Repeat step 2 for the fifth, sixth, seventh, and eighth letters, and then repeat step 2 for the ninth, tenth, eleventh, and twelfth letters; and so on until the whole password has been processed;

4. Add up all the results from these operations; and

5. Store the result in the database with your user name.
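The hypothetical steps above can be sketched in Python roughly like this. It is a toy, just like the description, and should never be used for real passwords. A few details are my own assumptions because the steps leave them open: I read "a binary NOT of the fourth letter" as XOR-ing the running result with the bit-flipped fourth letter, and I pad the last group with zero bytes when the salted password isn't a multiple of four letters long.

```python
SALT = "andsosayweall"  # the fixed suffix from step 1


def toy_hash(password: str) -> int:
    """Toy hash following the five hypothetical steps (NOT secure)."""
    data = (password + SALT).encode("utf-8")  # step 1: add the salt
    total = 0
    for i in range(0, len(data), 4):  # steps 2 and 3: four letters at a time
        # Assumption: pad a short final group with zero bytes.
        a, b, c, d = data[i:i + 4].ljust(4, b"\0")
        # Assumption: "NOT of the fourth letter" means XOR with its complement.
        mixed = ((a ^ b) & c) ^ (~d & 0xFF)
        total += mixed  # step 4: add up all the results
    return total  # step 5: this number is what gets stored


print(toy_hash("LetMeIn"))
```

The same password always produces the same number, which is the one property the log-on check depends on.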

The password itself is never stored in the database. Instead, only the hash is stored. When you log on to the Web site, the Web site takes the password you type in, adds "andsosayweall" to the end, hashes it, then compares the hash to the hash in the database. If they are the same, it lets you in. If they are not the same, it does not.
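Here is a minimal sketch of that sign-up and log-on flow in Python. I've swapped the toy hash for SHA-256 from the standard library, and the `users` dictionary is a hypothetical stand-in for the Web site's database; the function names are mine.

```python
import hashlib
import hmac

SALT = "andsosayweall"  # the fixed salt from the example
users = {}  # hypothetical "database": username -> stored hash


def hash_password(password: str) -> str:
    """Add the salt to the end of the password, then hash it."""
    return hashlib.sha256((password + SALT).encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store only the hash, never the password itself."""
    users[username] = hash_password(password)


def check_login(username: str, password: str) -> bool:
    """Hash what was typed and compare it with the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(stored, hash_password(password))


register("Tacit", "LetMeIn")
print(check_login("Tacit", "LetMeIn"))  # True
print(check_login("Tacit", "letmein"))  # False
```

Real sites generally go further than this sketch: they use a different random salt for every account and a deliberately slow hashing function (Python's `hashlib.pbkdf2_hmac` is one example), but the store-the-hash-and-compare idea is the same.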

It is done this way so that if hackers break into the Web site and steal the database, they will have a list of usernames, but they WON'T have a list of passwords. They will only have a list of password HASHES. You can't take a hash and work backward to figure out the password from it, at least not directly.

But the fact that the passwords are hashed does give the hackers some leverage. What they can do is hash a lot of different words until they find a word whose hash matches one of the hashes in the database they stole. They are not trying to break into a single account; instead, they have a list of, say, 10,000,000 usernames and hashes, and they just keep hashing different words until they find one whose hash matches something in the list. Then they have a password that matches one of the accounts in the list and they can break into that poor luckless schmoe's account. (Aha! We have a list of usernames and hashes. We see that one of the things in the list looks like

Tacit 377564783854687485

and when we do a hash of "LetMeIn" the result is 377564783854687485, so we know that Tacit is using LetMeIn as a password!)
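That guess-hash-and-compare loop looks something like the following Python sketch. The stolen list here is made up for illustration, and I'm assuming the site used plain unsalted SHA-256, which is a stand-in rather than anything from the example above.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def crack(stolen: dict, candidates) -> dict:
    """Hash each candidate word and see whether it matches ANY stolen hash."""
    # Index the stolen list by hash so each guess is checked against
    # every account at once.
    by_hash = {}
    for user, pw_hash in stolen.items():
        by_hash.setdefault(pw_hash, []).append(user)

    found = {}
    for word in candidates:
        for user in by_hash.get(sha256_hex(word), []):
            found[user] = word
    return found


# Hypothetical stolen database: usernames and hashes, no passwords.
stolen = {
    "Tacit": sha256_hex("LetMeIn"),
    "SomeoneElse": sha256_hex("x9$Lq0!vT"),
}
print(crack(stolen, ["password", "god", "money", "LetMeIn"]))
# {'Tacit': 'LetMeIn'}
```

Notice that one guess gets tested against the whole list, so the more accounts there are in the stolen database, the more likely it is that some guess pays off.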

Because the passwords are hashed, the security of the list of usernames and hashes rests a great deal on how complex and secure the hashing algorithm is. If the hashing algorithm is simple, it's easy to deduce what the password is. Also, some hashing algorithms might produce recognizable results if you feed them something like aaaaaaaaaaaaaaa as a password, which makes a simple password like that easy to spot in the list of hashes.

Some hackers will use "rainbow tables" to try to break a list of hashes. A rainbow table is kind of like a dictionary attack that has been done ahead of time. It's an enormous list of common passwords that have already been hashed using common hashing algorithms, so all the hacker has to do is compare the stolen database of usernames and hashed passwords with the hashes in his rainbow table and he can say "Aha! I just figured out 176,543,810 of the passwords in this list of five hundred million accounts I stole!"
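The precompute-once, look-up-later idea can be sketched in Python like this. To be clear about what is simplified: this is a plain lookup table from hash to password, while real rainbow tables use a more compact chained structure to save space. The word list and the use of unsalted SHA-256 are assumptions for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Done once, ahead of time: hash every common password.
COMMON = ["password", "god", "money", "secret", "letmein", "sex", "abc123"]
table = {sha256_hex(word): word for word in COMMON}


def lookup_all(stolen: dict) -> dict:
    """Cracking is now just a lookup per stolen hash; nothing gets re-hashed."""
    return {user: table[h] for user, h in stolen.items() if h in table}


# Hypothetical stolen database of unsalted hashes.
stolen = {
    "alice": sha256_hex("money"),
    "bob": sha256_hex("Tr0ub4dor&3"),
    "carol": sha256_hex("abc123"),
}
print(lookup_all(stolen))
# {'alice': 'money', 'carol': 'abc123'}
```

The expensive part (building the table) only has to happen once, and the same table works against every stolen database that used the same hashing algorithm.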

That's why Web sites will add some sequence of letters, like "andsosayweall" or whatever, to the password that you type before it gets hashed. A rainbow table takes a huge amount of time to create, even on a supercomputer. If you have a rainbow table of common passwords like "money" and "sex" and "love" and "letmein" and you steal a database where everything has been salted, the rainbow table is worthless because all the passwords on that Web site are actually "moneyandsosayweall" and "sexandsosayweall" and "loveandsosayweall" and "letmeinandsosayweall" before they are hashed, so the hashes in the rainbow table won't match any of the hashes in the stolen database.
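A quick Python sketch shows why the precomputed hashes stop matching. It uses SHA-256 as a stand-in hashing function (my assumption) and the "andsosayweall" salt from the account-creation example.

```python
import hashlib

SALT = "andsosayweall"  # salt from the account-creation example


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The attacker's table was built from the bare words, with no salt.
table = {sha256_hex(word): word for word in ["money", "sex", "love", "letmein"]}

unsalted = sha256_hex("letmein")         # what an unsalted site would store
salted = sha256_hex("letmein" + SALT)    # what the salted site stores

print(unsalted in table)  # True: the table cracks it instantly
print(salted in table)    # False: the table is useless here
```

The attacker could rebuild the table with the salt added to every word, but that means redoing all the expensive work for this one site, which is the whole point of salting.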

The various techniques used to create the hashes are pretty well known. Some techniques have security weaknesses that allow an attacker to look at a hash and be able to guess what combination of letters might have created that hash. Insecure hashes, passwords with lots of repeating characters (like "aaaaaaaaaa"), and hashes that are created without salt are often much easier to crack, just because of the mathematics of how the hashing works.

It is also possible that there might be hashing "collisions"--two different passwords that create the same hash. For example, because of the mathematics, two passwords like "IAmALittleGreenFrog" and "4vc?)bb%vs554xx@<,,nNb2z" might end up having exactly the same hash, just because the math that is used for the hashing produces the same result for both of those strings. That means either one of those passwords will work--the Web site thinks they are the same password!
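Collisions are easy to see with a deliberately bad hashing function. The one below is my own toy for illustration (it just adds up the character codes), not anything a real site would use; with a good algorithm, finding two strings that collide is supposed to be impractical.

```python
def weak_hash(password: str) -> int:
    """Deliberately weak toy hash: the sum of the byte values, mod 65536."""
    return sum(password.encode("utf-8")) % 65536


def check_login(stored_hash: int, typed: str) -> bool:
    """The site only ever compares hashes, so any colliding string gets in."""
    return weak_hash(typed) == stored_hash


stored = weak_hash("listen")  # the account's real password is "listen"

print(weak_hash("listen") == weak_hash("silent"))  # True: same letters, same sum
print(check_login(stored, "silent"))               # True: the wrong password works
print(check_login(stored, "listens"))              # False
```

Because adding numbers doesn't care about order, any rearrangement of the letters collides, so this toy hash produces collisions everywhere.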

Hashing algorithms that produce a lot of collisions are weaker than hashing algorithms that do not. It is sometimes possible to study a hashing algorithm and create a collision on purpose--a string that hashes the same way that the password you're trying to crack hashes--so as to engineer a break of the account by using the collision. A collision is also how the Flame malware writers were able to send out their malware disguised as official Microsoft system updates; they exploited a collision weakness in MD5, the hashing algorithm used in signing one of Microsoft's certificates, to forge a certificate that made their code look like it had been signed by Microsoft.

