Security Guide: Passwords 101

It comes up all the time on the web: pick a strong password! But what does that really mean? There are many misconceptions about passwords that can lead to trouble. Insecure passwords are very vulnerable and subject to attack. Before you can understand how to prevent these issues, you must first understand how passwords are used and stored.

When you sign up on any (legit and reputable) website, you are asked to provide a password to accompany your username. What you may not know about this password is that it is not stored on the server as plaintext, that is, text in a readable state. Instead, your password is stored as what is known as a hash. A hash function takes your input and converts it to a seemingly random, fixed-size string. The key is that it is not random.
For example, we will look at the MD5 Hash Function (security flaws were found in this function in 2005, but it is still widely used):

If we input besttechie as our password, the hash function returns the following:

MD5(‘besttechie’) –> acb07956f67dcbf58a9b90c2b48c9204

As you can see, the string produced is 32 characters long, and a mix of letters and numbers. This output is the hash. It appears to be random, but is not. No matter how many times you put besttechie through the MD5 function, the hash will always be the same.
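If you want to try this yourself, here is a minimal sketch using Python's standard hashlib module. The helper name md5_hex is just an illustration, not something the website in this example is known to use.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as a lowercase hexadecimal string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


first = md5_hex("besttechie")
second = md5_hex("besttechie")

print(first)            # the hash, written as hexadecimal characters
print(len(first))       # 32
print(first == second)  # True: the same input always gives the same hash
```

Run it as many times as you like; the printed hash does not change between runs.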

So what does this have to do with my passwords?

As already stated, the server does not store your password in plaintext; it stores a hash. So, if you sign up for a website and select besttechie as your password, the server (assuming it uses the raw MD5 function) will store acb07956f67dcbf58a9b90c2b48c9204, not besttechie.

So how does this work?

Now that your password hash is in the server’s database, you want to log in. You obviously aren’t going to log in with the hash (you are never told what your password’s hash is, and it’s much too hard to remember); you are going to log in with your normal password. On a login, the website takes the password you enter, runs it through the hash function, and compares the result to the hash that was generated when you registered. If they match, you are logged in. If not, the login fails.

Take a look at this example where we enter the wrong password. We will accidentally enter besttechir:

MD5(‘besttechir’) –> 0ec195f31983ffa8bf2100d5d2295beb

As you can see, changing just one letter produces a completely different hash. When the hash for besttechir is compared to the hash of besttechie, they do not match, and the login fails.
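The register-then-login flow described above can be sketched in a few lines of Python. Everything here is hypothetical: the in-memory users dictionary stands in for the server's database, and the username alice and the function names are made up for the example. It uses raw MD5 only because that is what this walkthrough assumes.

```python
import hashlib
import hmac

# Hypothetical in-memory "database" mapping usernames to stored hashes.
users = {}


def md5_hex(password: str) -> str:
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    # Only the hash is stored; the plaintext password is never saved.
    users[username] = md5_hex(password)


def login(username: str, password: str) -> bool:
    stored = users.get(username)
    if stored is None:
        return False
    # Hash the attempt and compare it to the stored hash.
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(stored, md5_hex(password))


register("alice", "besttechie")
print(login("alice", "besttechie"))  # True
print(login("alice", "besttechir"))  # False: one letter off, different hash
```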

Why use hashes?

Hashes are used to protect you, the user. Imagine if every password were stored in a database in plaintext. Your first security threat is the server administrator himself. What’s stopping him from just opening up the database and writing down everyone’s passwords? Nothing. Second, hashes protect you from hackers. If someone ever broke into a database that held plaintext passwords, they could steal your information.

Hashes help prevent this.

If the server administrator or a hacker ever gets nosy and wants to take a look, he’ll never see that your password is besttechie. He will only see acb07956f67dcbf58a9b90c2b48c9204 and will have no idea what that means.

The beauty of these hash functions is that they are not reversible. You cannot start with acb07956f67dcbf58a9b90c2b48c9204, run it through some inverse-hash function, and come out with besttechie.

But hashes are not invincible.

There are ways that hashes can be cracked to reveal the original plaintext that generated them. The three most common methods are brute force attacks, dictionary attacks, and rainbow table attacks.

A brute force attack hashes every possible plaintext, one after another, until it finds one whose hash matches the hash it is attacking. As you might imagine, this can take a very long time.
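To make the idea concrete, here is a small sketch of a brute force search in Python. It assumes the attacker already knows the password uses only lowercase letters and is at most four characters long, which is what keeps the demo fast; the function name brute_force and the sample password tree are chosen purely for illustration.

```python
import hashlib
import itertools
import string
from typing import Optional


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force(target_hash: str, alphabet: str, max_length: int) -> Optional[str]:
    """Hash every string over `alphabet`, shortest first, until one matches."""
    for length in range(1, max_length + 1):
        for letters in itertools.product(alphabet, repeat=length):
            guess = "".join(letters)
            if md5_hex(guess) == target_hash:
                return guess
    return None  # nothing in the searched space matched


target = md5_hex("tree")
print(brute_force(target, string.ascii_lowercase, 4))  # tree
```

Every extra character multiplies the number of guesses by the size of the alphabet, which is why this approach stops being practical for long passwords.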

Think about it this way: An MD5 hash is a 128 bit hash value. That means that there are 2^128 or 340,282,366,920,938,463,463,374,607,431,768,211,456 possible hashes. Even if you had a computer that could check a billion billion (10^18) hashes per second, it would still take 10,790,283,070,806 years to check all of them.
Granted, if you know the length of the plaintext and the allowed characters, there are far fewer candidates to try (the same is true if the hash is shorter), and the attack will take less time. For a long password, though, that’s still a long time.
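The arithmetic above is easy to check. The sketch below assumes a 365-day year and the same hypothetical rate of 10^18 hashes per second, then counts the candidates when the attacker knows the password is exactly eight lowercase letters (an example chosen here, not a figure from any real attack).

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 (leap years ignored)
RATE = 10**18                          # assumed hashes checked per second

total_hashes = 2**128
years_for_all_hashes = total_hashes // (RATE * SECONDS_PER_YEAR)

print(f"{total_hashes:,} possible hashes")
print(f"{years_for_all_hashes:,} years")  # 10,790,283,070,806 years

# Known length and allowed characters: 8 lowercase letters only.
candidates = 26**8
print(f"{candidates:,} candidate passwords")
print(f"{total_hashes // candidates:,} times fewer than all possible hashes")
```

The gap between the two counts is the point: the attacker only has to cover the passwords people can actually choose, not every possible hash.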

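Dictionary attacks are slightly different. Rather than attempting every possible combination, the attacker works from a list (the “dictionary”) of likely passwords, such as common words, names, and passwords people are known to pick. The hashes for such lists are often computed ahead of time, so cracking becomes a simple lookup: there are places that have recorded the hashes of a great many plaintexts, and all you have to do is enter the hash to get the corresponding plaintext. Of course, due to the time and storage it would take, they cannot possibly record all of them. Because of this, a dictionary attack only succeeds against passwords that appear in the list, which is exactly why common and simple passwords are the ones most at risk.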
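Here is a rough sketch of that lookup idea in Python. The five-word list is made up for the example; real lists are far larger. The function names build_lookup and look_up are likewise just illustrative.

```python
import hashlib
from typing import Dict, Iterable, Optional


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(wordlist: Iterable[str]) -> Dict[str, str]:
    """Precompute a hash -> plaintext mapping for every word in the list."""
    return {md5_hex(word): word for word in wordlist}


def look_up(table: Dict[str, str], target_hash: str) -> Optional[str]:
    """Return the plaintext for `target_hash`, or None if it was never recorded."""
    return table.get(target_hash)


# Illustrative wordlist only.
wordlist = ["password", "letmein", "tree", "qwerty", "besttechie"]
table = build_lookup(wordlist)

print(look_up(table, md5_hex("tree")))           # tree
print(look_up(table, md5_hex("Fgpyyih804423")))  # None: not in the list
```

The lookup itself is instant, but it can only ever return plaintexts that somebody thought to put in the list.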

Rainbow tables employ a time-memory tradeoff: they reduce the time needed to find a plaintext by storing more precomputed information. A rainbow table is constructed by building chains of possible plaintext passwords. Each chain is developed by starting with a randomly selected plaintext and then successively applying the hash function followed by a reduction function. The reduction function takes the hash and turns it into another plaintext password guess. This process continues for a set number of steps. The intermediate password guesses are then discarded, and only the first and last are stored in the rainbow table.

Recovery of plaintext passwords is then done by taking the hash, applying the reduction function, and looking up the result among the chain endings in the rainbow table. If no match is found, then the hash and reduction functions are applied again and that result is looked up. This is repeated until a match is found. Once a match is found, the chain that resulted in the match is rebuilt from its stored starting point to find the previously discarded intermediate value whose hash equals the one being attacked. That value is a plaintext password for the given hash.
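The chain idea is easier to follow in code. Below is a toy sketch that follows the description above, under some loud assumptions: the "passwords" are 4-digit PINs, chains are 50 steps long, and a single reduction function is used for every step. Real rainbow tables vary the reduction function from step to step to cut down on chains merging, so treat this as an illustration of the principle rather than a working cracking tool. All names are invented for the example.

```python
import hashlib
from typing import Dict, Iterable, List, Optional

ALPHABET_SIZE = 10                # digits 0-9
LENGTH = 4                        # toy password space: 4-digit PINs
SPACE = ALPHABET_SIZE ** LENGTH   # 10,000 possible PINs
CHAIN_LENGTH = 50                 # hash/reduce steps per chain


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def reduce_hash(hash_hex: str) -> str:
    """Turn a hash into another PIN guess (this is not an inverse of MD5)."""
    return str(int(hash_hex, 16) % SPACE).zfill(LENGTH)


def build_table(starts: Iterable[str]) -> Dict[str, List[str]]:
    """Map each chain's last value to its first; the middle is discarded."""
    table: Dict[str, List[str]] = {}
    for start in starts:
        plain = start
        for _ in range(CHAIN_LENGTH):
            plain = reduce_hash(md5_hex(plain))
        table.setdefault(plain, []).append(start)
    return table


def crack(table: Dict[str, List[str]], target_hash: str) -> Optional[str]:
    candidate = reduce_hash(target_hash)
    for _ in range(CHAIN_LENGTH):
        # If the candidate is a stored chain end, rebuild that chain from its
        # start and look for the plaintext whose hash equals the target.
        for start in table.get(candidate, []):
            plain = start
            for _ in range(CHAIN_LENGTH):
                if md5_hex(plain) == target_hash:
                    return plain
                plain = reduce_hash(md5_hex(plain))
        # No luck: hash and reduce again, then look up the new result.
        candidate = reduce_hash(md5_hex(candidate))
    return None


# Build 500 chains from evenly spaced starting PINs.
table = build_table(str(n).zfill(LENGTH) for n in range(0, SPACE, 20))

# Pick a PIN from the middle of a chain: it was computed, then thrown away.
secret = "0000"
for _ in range(10):
    secret = reduce_hash(md5_hex(secret))

print(crack(table, md5_hex(secret)) == secret)  # True
```

Only the chain starts and ends are kept, yet a discarded middle value can still be recovered by recomputing a single chain. A PIN that no chain happens to pass through would not be found, and crack would return None.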

How much faster is this? Here’s a good example: A popular tool that employs rainbow tables can crack the password Fgpyyih804423 in 160 seconds.

How Can You Protect Yourself?

Most importantly, choose a password that is long and complex. Use uppercase letters, lowercase letters, numbers, and symbols if you are allowed. The longer and more complex your password is, the longer a crack takes and the less likely it is to succeed. Long and complex passwords are less likely to fall victim to brute force or dictionary attacks.
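To get a feel for how much length and character variety matter, here is a rough estimator in Python. It is only an upper bound under a big assumption: that every character was picked at random from the character classes that appear in the password. A dictionary word scores far better here than it deserves. The function name keyspace_bits is made up for this sketch.

```python
import math
import string


def keyspace_bits(password: str) -> float:
    """Estimate log2 of the number of candidates a brute force search must cover.

    Counts which character classes appear, then assumes every position could
    hold any character from those classes. Characters outside the four ASCII
    classes below are ignored when sizing the pool.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += len(string.ascii_lowercase)
    if any(c in string.ascii_uppercase for c in password):
        pool += len(string.ascii_uppercase)
    if any(c in string.digits for c in password):
        pool += len(string.digits)
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)
    if pool == 0:
        return 0.0
    return len(password) * math.log2(pool)


for pw in ["tree", "besttechie", "Fgpyyih804423", "Fg!pyyih_804423#"]:
    print(f"{pw!r}: about {keyspace_bits(pw):.1f} bits")
```

Each additional bit doubles the work for a brute force attack, so adding length pays off quickly.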

A smart webmaster can help protect users from rainbow table attacks by using more complex hashing functions and a concept known as ‘salting’. There are many ways to do this, including adding an extra string of text (the salt, ideally random and different for each user) to the plaintext before hashing, and hashing multiple times. Because a rainbow table is built ahead of time for unsalted hashes, it no longer matches the salted ones.
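As one possible way to combine salting with hashing multiple times, here is a sketch built on PBKDF2 from Python's standard library. The choice of PBKDF2 with SHA-256, the 16-byte salt, and the iteration count are all assumptions made for this example, not a recommendation taken from any particular website.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor; higher is slower for attackers too


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest). The salt is random and stored next to the digest."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare the results."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, expected)


# Two users pick the same password but end up with different stored hashes.
salt_a, hash_a = hash_password("besttechie")
salt_b, hash_b = hash_password("besttechie")

print(hash_a == hash_b)                                 # False
print(verify_password("besttechie", salt_a, hash_a))    # True
print(verify_password("besttechir", salt_a, hash_a))    # False
```

Since every user gets a different salt, a table computed ahead of time for plain hashes does not help, and the repeated hashing makes each individual guess more expensive.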

Hopefully this article has demonstrated the basic and intermediate concepts of passwords. You should no longer think “Well, no one will guess my password is ‘tree’.” That doesn’t matter. They don’t have to guess it; it can be cracked. Users need to pick secure passwords, and webmasters need to help protect their users.