st17_passwords.md

# Introduction
Passwords are a fact of life now, and they're the first line of
defence when it comes to protecting your accounts online.

But how are passwords stored? What happens if an attacker steals the
list of passwords? Here we find out how passwords are stored as hashes
to make it more difficult to steal them, as well as some of the ways
attackers get around this problem.

A password in a modern computer system is simply a series of
characters.  These can be arranged to form something memorable to a
human, but could also come from a special device or piece of software
for managing passwords.

When we log in to a computer, or a remote service, we give our
username and password.  That password is then checked against a stored
version. If the username and password match, you're assumed
to be the owner of that account.


Just like with ancient watch words, we have an immediate problem:
eavesdropping. If someone can monitor your conversation, they can make
a note of your password and then later pretend to be you.  This is one
of the reasons secure communication is so important. But, for now,
let's assume we have solved that issue.

The typical password scenario requires that everyone has their own
password and don't share. There must also be a list of passwords
somewhere for us to be able to check them when people log-in. This
list of passwords is an issue. The first ever case of a password
system being compromised involved the password list being stolen, and
every year we hear of more cases of stolen passwords, with many
hundreds of lists being discovered shared and traded by hackers. If
someone steals the list of passwords, they have access to all of the
accounts!  This is where hashing is used to protect passwords.


# Hashing

The process of "hashing" is easy to explain, but difficult to
correctly implement.  Luckily, the hard work has been done and we have
plenty of algorithms to choose from.

If we put some data into a hashing algorithm, we get some apparently
random data back.  The same input always leads to the same output, but
you can't work-out the input from the output and any small change int
he input leads to drastic changes in the output.

In password storage, this is used to protect the secret passwords of
users.  Rather than keep a list of passwords to check when someone
logs in, which could be stolen, we store a list of *hashes* of
passwords.  So, if my password was "swordfish", I would find that
stored against my username was some random-looking data.  Since you
can't undo the hashing and turn the random data back into a password,
it's pretty secure, but how is it checked?  Well, when I try to log
in, the system takes the password I give it and hashes it, then
compares the output to the stored hash value.  If they match, I must
have known the password!

## Undoing Hashes

It's not /entirely/ true to say that we can't reverse a hash. We could
try every possible password, putting each through the hash function,
to see which ones match. This is called a *brute force* attack because
it relies on nothing more than lots of time and computing power.

This is also why password choice is important.  Usually, anyone trying
to get into your account would be unable to keep guessing your
password and hope to get in.  Most systems would cause a small delay
after an unsuccessful attempt, making it impossible to try every
password in a reasonable amount of time.  If the list of hashes is
stolen, however, an attacker could try lots of passwords much faster
because they can compare to the hash themselves.

It might seem that it's just as difficult to brute-force the password
'mittens' when we only know the hash as it would be to find, by brute
force, the password 'sdfhg.28b!8GGG=' from its hash.

However, if we did try to brute-force these two, we would find the
answer to the first one if we tried every word in the dictionary. It
might take a while, but it could be done without too much
difficulty. There are around 150,000 English words in common use --
let's say we include archaic and derivative words and make the total
number of words 250,000. Assume it takes 1 second per hash. That means
it would take around 3 days to generate a hash of every possible word,
assuming a single thread and no parallelism. To find the second one,
we would have to try every possible combination of letters, numbers
and punctuation. Assume we limit the search to passwords of up to
length 15 and there are 72 possible characters (26 lowercase, 26
uppercase, 10 digits and 10 punctuation marks), then the total number
of possible passwords is: $72+72^2+72^3+\ldots +72^{15}=7.36×10^{27}$, which would
take around 230,000,000,000,000,000,000 years at the rate of 1 hash
per second.

Even worse, attackers have access to lists of previously calculated
hashes, too. This means that there is probably already a database
entry for 'mittens' and its hash, so 'reversing' it is really just a
matter of looking up the answer. Attackers also use special tools to
create lists of words with added numbers on the end, or replacing
certain letters because of how often people use this to meet password
requirements.

# Conclusion

So, next time you pick a password, remember that it's not someone
trying to guess that is the problem, but someone possibly using big
databases of password-like words to reverse the hash!
	# Introduction
	Passwords are a fact of life now, and they're the first line of
	defence when it comes to protecting your accounts online.

	But how are passwords stored? What happens if an attacker steals the
	list of passwords? Here we find out how passwords are stored as hashes
	to make it more difficult to steal them, as well as some of the ways
	attackers get around this problem.

	A password in a modern computer system is simply a series of
	characters. These can be arranged to form something memorable to a
	human, but could also come from a special device or piece of software
	for managing passwords.

	When we log in to a computer, or a remote service, we give our
	username and password. That password is then checked against a stored
	version. If the username and password match, you're assumed
	to be the owner of that account.


	Just like with ancient watch words, we have an immediate problem:
	eavesdropping. If someone can monitor your conversation, they can make
	a note of your password and then later pretend to be you. This is one
	of the reasons secure communication is so important. But, for now,
	let's assume we have solved that issue.

	The typical password scenario requires that everyone has their own
	password and don't share. There must also be a list of passwords
	somewhere for us to be able to check them when people log-in. This
	list of passwords is an issue. The first ever case of a password
	system being compromised involved the password list being stolen, and
	every year we hear of more cases of stolen passwords, with many
	hundreds of lists being discovered shared and traded by hackers. If
	someone steals the list of passwords, they have access to all of the
	accounts! This is where hashing is used to protect passwords.


	# Hashing

	The process of "hashing" is easy to explain, but difficult to
	correctly implement. Luckily, the hard work has been done and we have
	plenty of algorithms to choose from.

	If we put some data into a hashing algorithm, we get some apparently
	random data back. The same input always leads to the same output, but
	you can't work-out the input from the output and any small change int
	he input leads to drastic changes in the output.

	In password storage, this is used to protect the secret passwords of
	users. Rather than keep a list of passwords to check when someone
	logs in, which could be stolen, we store a list of hashes of
	passwords. So, if my password was "swordfish", I would find that
	stored against my username was some random-looking data. Since you
	can't undo the hashing and turn the random data back into a password,
	it's pretty secure, but how is it checked? Well, when I try to log
	in, the system takes the password I give it and hashes it, then
	compares the output to the stored hash value. If they match, I must
	have known the password!

	## Undoing Hashes

	It's not /entirely/ true to say that we can't reverse a hash. We could
	try every possible password, putting each through the hash function,
	to see which ones match. This is called a brute force attack because
	it relies on nothing more than lots of time and computing power.

	This is also why password choice is important. Usually, anyone trying
	to get into your account would be unable to keep guessing your
	password and hope to get in. Most systems would cause a small delay
	after an unsuccessful attempt, making it impossible to try every
	password in a reasonable amount of time. If the list of hashes is
	stolen, however, an attacker could try lots of passwords much faster
	because they can compare to the hash themselves.

	It might seem that it's just as difficult to brute-force the password
	'mittens' when we only know the hash as it would be to find, by brute
	force, the password 'sdfhg.28b!8GGG=' from its hash.

	However, if we did try to brute-force these two, we would find the
	answer to the first one if we tried every word in the dictionary. It
	might take a while, but it could be done without too much
	difficulty. There are around 150,000 English words in common use --
	let's say we include archaic and derivative words and make the total
	number of words 250,000. Assume it takes 1 second per hash. That means
	it would take around 3 days to generate a hash of every possible word,
	assuming a single thread and no parallelism. To find the second one,
	we would have to try every possible combination of letters, numbers
	and punctuation. Assume we limit the search to passwords of up to
	length 15 and there are 72 possible characters (26 lowercase, 26
	uppercase, 10 digits and 10 punctuation marks), then the total number
	of possible passwords is: $72+72^2+72^3+\ldots +72^{15}=7.36×10^{27}$, which would
	take around 230,000,000,000,000,000,000 years at the rate of 1 hash
	per second.

	Even worse, attackers have access to lists of previously calculated
	hashes, too. This means that there is probably already a database
	entry for 'mittens' and its hash, so 'reversing' it is really just a
	matter of looking up the answer. Attackers also use special tools to
	create lists of words with added numbers on the end, or replacing
	certain letters because of how often people use this to meet password
	requirements.

	# Conclusion

	So, next time you pick a password, remember that it's not someone
	trying to guess that is the problem, but someone possibly using big
	databases of password-like words to reverse the hash!