Skip to content
Permalink
master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Go to file
 
 
Cannot retrieve contributors at this time
# Introduction
Passwords are a fact of life now, and they're the first line of
defence when it comes to protecting your accounts online.
But how are passwords stored? What happens if an attacker steals the
list of passwords? Here we find out how passwords are stored as hashes
to make it more difficult to steal them, as well as some of the ways
attackers get around this problem.
A password in a modern computer system is simply a series of
characters. These can be arranged to form something memorable to a
human, but could also come from a special device or piece of software
for managing passwords.
When we log in to a computer, or a remote service, we give our
username and password. That password is then checked against a stored
version. If the username and password match, you're assumed
to be the owner of that account.
Just like with ancient watch words, we have an immediate problem:
eavesdropping. If someone can monitor your conversation, they can make
a note of your password and then later pretend to be you. This is one
of the reasons secure communication is so important. But, for now,
let's assume we have solved that issue.
The typical password scenario requires that everyone has their own
password and don't share. There must also be a list of passwords
somewhere for us to be able to check them when people log-in. This
list of passwords is an issue. The first ever case of a password
system being compromised involved the password list being stolen, and
every year we hear of more cases of stolen passwords, with many
hundreds of lists being discovered shared and traded by hackers. If
someone steals the list of passwords, they have access to all of the
accounts! This is where hashing is used to protect passwords.
# Hashing
The process of "hashing" is easy to explain, but difficult to
correctly implement. Luckily, the hard work has been done and we have
plenty of algorithms to choose from.
If we put some data into a hashing algorithm, we get some apparently
random data back. The same input always leads to the same output, but
you can't work-out the input from the output and any small change int
he input leads to drastic changes in the output.
In password storage, this is used to protect the secret passwords of
users. Rather than keep a list of passwords to check when someone
logs in, which could be stolen, we store a list of *hashes* of
passwords. So, if my password was "swordfish", I would find that
stored against my username was some random-looking data. Since you
can't undo the hashing and turn the random data back into a password,
it's pretty secure, but how is it checked? Well, when I try to log
in, the system takes the password I give it and hashes it, then
compares the output to the stored hash value. If they match, I must
have known the password!
## Undoing Hashes
It's not /entirely/ true to say that we can't reverse a hash. We could
try every possible password, putting each through the hash function,
to see which ones match. This is called a *brute force* attack because
it relies on nothing more than lots of time and computing power.
This is also why password choice is important. Usually, anyone trying
to get into your account would be unable to keep guessing your
password and hope to get in. Most systems would cause a small delay
after an unsuccessful attempt, making it impossible to try every
password in a reasonable amount of time. If the list of hashes is
stolen, however, an attacker could try lots of passwords much faster
because they can compare to the hash themselves.
It might seem that it's just as difficult to brute-force the password
'mittens' when we only know the hash as it would be to find, by brute
force, the password 'sdfhg.28b!8GGG=' from its hash.
However, if we did try to brute-force these two, we would find the
answer to the first one if we tried every word in the dictionary. It
might take a while, but it could be done without too much
difficulty. There are around 150,000 English words in common use --
let's say we include archaic and derivative words and make the total
number of words 250,000. Assume it takes 1 second per hash. That means
it would take around 3 days to generate a hash of every possible word,
assuming a single thread and no parallelism. To find the second one,
we would have to try every possible combination of letters, numbers
and punctuation. Assume we limit the search to passwords of up to
length 15 and there are 72 possible characters (26 lowercase, 26
uppercase, 10 digits and 10 punctuation marks), then the total number
of possible passwords is: $72+72^2+72^3+\ldots +72^{15}=7.36×10^{27}$, which would
take around 230,000,000,000,000,000,000 years at the rate of 1 hash
per second.
Even worse, attackers have access to lists of previously calculated
hashes, too. This means that there is probably already a database
entry for 'mittens' and its hash, so 'reversing' it is really just a
matter of looking up the answer. Attackers also use special tools to
create lists of words with added numbers on the end, or replacing
certain letters because of how often people use this to meet password
requirements.
# Conclusion
So, next time you pick a password, remember that it's not someone
trying to guess that is the problem, but someone possibly using big
databases of password-like words to reverse the hash!