Most websites require their users to authenticate themselves with a username and password.
Then, they compare the supplied credentials from the user with the data that is stored in the database. If the credentials match, the user is granted access. If not, typically an alert pops up saying that the “Username or password is incorrect”.
This all sounds pretty good, right?
But what happens if the database where the credentials are stored gets compromised? This article covers different techniques for storing passwords in a database.
According to an article from Naked Security, roughly 55% of internet users reuse the same password on multiple websites! This means that if any one of those websites is compromised, the password is exposed, and it grants access not only to that particular website but to every other website where it was reused.
Well, you might be wondering what can be done once your database is compromised and exposed. You might think that a person who breaks into your database gets all the information and data, right?
Not quite. There are many ways to make retrieving the passwords cumbersome for an attacker.
Over 30% of websites store your password in the database as plain text. This happens when developers ignore the guidelines for storing passwords.
Needless to say, if your password is stored as plain text it doesn't really matter how strong it is.
Storing plain text passwords should be illegal
You might think that if plain text is out, we should simply encrypt the password and store it.
Encryption functions provide a one-to-one mapping between input and output, and they are always reversible. If an attacker gets the key, he will be able to decrypt the password. A better and more sophisticated way is to use a one-way cryptographic hash function.
A good cryptographic hash function produces very few collisions: it is difficult to find two different inputs that give the same output.
Collisions can't be completely avoided, however, because of the pigeonhole principle.
For hashing passwords, we can just assume that the hash function will generate a unique output.
Some of the most popular cryptographic hash functions are MD5 and SHA-1. Instead of storing plain text passwords in the database, you store only a hash of the password. You might be wondering how a login can be checked if the password cannot be recovered from the hash, right?
It's actually pretty simple: apply the same hash function to the password the user entered and compare the result with the hash stored in the database. If they match, access is granted.
And sure enough, if someone breaks into the database, all he will see is the hash output, not the password itself.
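The hash-and-compare flow described above can be sketched in a few lines of Python. This is only an illustration: the function names `hash_password` and `verify_password` are made up for this example, and SHA-1 is used because the text mentions it, not because it is a good choice for passwords.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the hex digest of the password (SHA-1, for illustration only)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def verify_password(candidate: str, stored_hash: str) -> bool:
    """Hash the candidate the same way and compare it with the stored hash."""
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(hash_password(candidate), stored_hash)


# At registration time, only the hash is written to the database.
stored = hash_password("abc")

print(verify_password("abc", stored))  # True
print(verify_password("abd", stored))  # False
```

Note that the password itself is never recovered. The application only checks whether the entered password produces the same hash.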
Use cryptographic hash functions
People who break into databases are very smart, and once they learned that developers were storing hashed passwords, they got to work. They precomputed the hashes of a large number of dictionary words and built a table that maps each hash back to its word.
Such a table is known as a Rainbow Table, and many are already available online. An attacker can use one to recover the actual password by looking up the hashes obtained from the database.
Hence, it's very important to have a strong password, since a strong password is far less likely to appear in a Rainbow Table.
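To make the idea concrete, here is a minimal sketch of a precomputed lookup. The word list, user names, and passwords are invented for the example. Strictly speaking, a real rainbow table uses hash chains to save space, while this sketch is a plain hash-to-word dictionary, but the effect on weak passwords is the same.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 hex digest of a string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical word list; real tables cover many millions of entries.
wordlist = ["123456", "password", "qwerty", "letmein", "abc"]

# Precompute once: hash -> original word.
lookup = {sha1_hex(word): word for word in wordlist}

# Hashes as they might appear in a leaked table (made-up users).
leaked = {
    "alice": sha1_hex("qwerty"),      # weak, dictionary password
    "bob": sha1_hex("T7#kq!vX2p"),    # strong password, not in the word list
}

for user, digest in leaked.items():
    print(user, lookup.get(digest, "<not in table>"))
# alice qwerty
# bob <not in table>
```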
Nowadays, simply storing the hash of a password is not as effective as it once was. Processing power has increased tremendously with the introduction of new GPUs.
Actually, a fast GPU can generate billions of MD5/SHA-1 hashes per second! With that, the people who break into databases can brute-force a huge number of possible combinations and compare the resulting hashes to the ones in the database.
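A brute-force attack of this kind can be sketched as follows. The function name `brute_force` and the restriction to short lowercase passwords are assumptions made to keep the example fast on a CPU. A GPU cracker follows the same idea at a vastly larger scale.

```python
import hashlib
import itertools
import string
from typing import Optional


def brute_force(target_hash: str, max_length: int = 3) -> Optional[str]:
    """Try every lowercase string up to max_length and return the match, if any."""
    for length in range(1, max_length + 1):
        for chars in itertools.product(string.ascii_lowercase, repeat=length):
            candidate = "".join(chars)
            digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
            if digest == target_hash:
                return candidate
    return None


# A hash as it might be found in a compromised database.
target = hashlib.sha1(b"abc").hexdigest()

print(brute_force(target))  # abc
```

With only 26 lowercase letters and three characters there are fewer than 20,000 candidates, which is why short, simple passwords fall almost instantly.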
So, add some salt. Not to your salad, but to your password.
Basically, a salt is random data that is concatenated with your password before it is passed to the hashing function.
For instance, if your password is abc and the salt is !Za1o49, the value stored in the database would be hashFunction('abc!Za1o49'). Without the salt, it would have been hashFunction('abc').
This way, the Rainbow Table is no longer effective, as the probability of it containing a row for “abc!Za1o49” is low.
To be precise, this salt is not stored in the database. It is present only in the application configuration file, which is not accessible from the outside world. Gaining access to the source files is usually more difficult than gaining access to the database.
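The static salt can be sketched like this, reusing the abc and !Za1o49 values from the example. The constant `STATIC_SALT` and the function name are hypothetical, and in a real application the salt would be loaded from configuration rather than written in the source.

```python
import hashlib

# Assumption: in practice this value lives in the application configuration,
# not in the database and not hard-coded like here.
STATIC_SALT = "!Za1o49"


def hash_with_static_salt(password: str) -> str:
    """Concatenate the password with the static salt, then hash the result."""
    salted = password + STATIC_SALT
    return hashlib.sha1(salted.encode("utf-8")).hexdigest()


unsalted = hashlib.sha1("abc".encode("utf-8")).hexdigest()
salted = hash_with_static_salt("abc")

# A table precomputed for the plain word "abc" no longer matches.
print(salted == unsalted)  # False
```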
Of course, the salt method above is very static. A better approach would be a dynamic salt.
For every user, a brand new salt is generated by a random string generator. The password that is entered by the user is concatenated with a randomly generated salt, as well as a static salt.
The concatenated string is passed as the input of the hashing function, and the result is stored in the database.
However, the dynamic salt has to be stored in the database, since it's different for each user. When the user authenticates, the dynamic salt for that particular user is first fetched from the database. Then, it's concatenated with the input supplied by the user and the static salt.
The concatenated string is hashed, and the result is compared to the hash in the database.
But if the database is compromised, the attacker gets not only the password hashes but also the dynamic salts. Now, you might be wondering, “What is the advantage of dynamic salt over static salt?”
Even if the attacker obtains the dynamic salts, he still needs to build a new hash table for every user in the database. This is obviously far more expensive than building just one table for all the users.
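A minimal sketch of the dynamic salt scheme follows. The names `make_record` and `verify`, the dictionary layout of a stored record, and the choice of SHA-256 are all assumptions for illustration. The text does not prescribe a specific hash function or storage format.

```python
import hashlib
import hmac
import secrets

# Assumption: the static salt comes from application configuration.
STATIC_SALT = "!Za1o49"


def _digest(password: str, dynamic_salt: str) -> str:
    """Hash password + per-user salt + static salt."""
    data = (password + dynamic_salt + STATIC_SALT).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def make_record(password: str) -> dict:
    """Create what would be stored in the database for a new user."""
    dynamic_salt = secrets.token_hex(16)  # fresh random salt per user
    return {"salt": dynamic_salt, "hash": _digest(password, dynamic_salt)}


def verify(candidate: str, record: dict) -> bool:
    """Fetch the user's salt from the record, re-hash, and compare."""
    expected = _digest(candidate, record["salt"])
    return hmac.compare_digest(expected, record["hash"])


alice = make_record("abc")
bob = make_record("abc")

# Same password, different salts, so the stored hashes differ.
print(alice["hash"] == bob["hash"])  # False
print(verify("abc", alice))          # True
print(verify("abd", alice))          # False
```

Because every user has a different salt, a table precomputed for one user is useless against another, even when both chose the same password.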
The above approach is very effective at slowing down an attack. Still, it's highly recommended to use algorithms like bcrypt or scrypt instead of MD5/SHA-1.
Bcrypt is a hashing algorithm based on Blowfish, and it requires you to specify a cost factor. The higher the cost factor, the slower each hash computation becomes, so the time needed to generate a hash table increases drastically.
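As a rough sketch of a deliberately slow hash, the example below uses scrypt rather than bcrypt, because scrypt ships with Python's standard library (`hashlib.scrypt`) while bcrypt needs a third-party package. The function names and the parameter values (`n=2**14`, `r=8`, `p=1`) are assumptions for illustration, not tuned recommendations. Here `n` plays the role of the cost factor.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: bytes, cost: int = 2**14) -> bytes:
    """Derive a 32-byte hash with scrypt; a larger cost makes each guess slower."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=cost,   # CPU/memory cost factor, must be a power of two
        r=8,      # block size
        p=1,      # parallelization factor
        dklen=32,
    )


def verify_password(candidate: str, salt: bytes, stored: bytes) -> bool:
    """Re-derive the hash with the stored salt and compare in constant time."""
    return hmac.compare_digest(hash_password(candidate, salt), stored)


salt = secrets.token_bytes(16)   # per-user salt, stored next to the hash
stored = hash_password("abc", salt)

print(verify_password("abc", salt, stored))  # True
print(verify_password("abd", salt, stored))  # False
```

A single login check with these settings is still quick for a legitimate user, but an attacker who must repeat it for every guess and every user pays the cost many times over.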