Hashing Lab Task: Implementing Hashing in your Programs
This is a more practical session for you to work through.
In this session we are going to look at implementing hashing in our programs. We will use Python for this, as its something you should all be familiar with.
To demonstrate the flaws in hashing approaches, we will then build a simple hash cracking program. This should let us see how salts, and the hash itself can make cracking passwords much more difficult.
Supporting Documentation
Docs for the python hashlib library
Python 3
We will be using python 3 for these examples. They should work fine with python2. But if you get problems like "Bytestring expected", you may need to remove the encode option.
Info
Example code for this session can be found in the 6005-CEM-Codebase Repository
Getting Started MD5 Hashing
Lets start with the well known MD5 hashing system.
No Longer Best Practice
As we know from the Materials MD5 is no longer considered secure. But it will give us the chance to see how easy it is to break this system using inline tools.
Lets create our first hash
Remember python3 makes a distinction between string
and bytestring
so we will need to convert
import hashlib
plaintext = b"foobar" #Note we need it to be a byte string
theHash = hashlib.md5(plaintext).hexdigest() #Convert and get the message digest
print(theHash)
import hashlib
plaintext = "foobar" #This time we have a string as input
theHash = hashlib.md5(plaintext.encode()).hexdigest() #Note we need to encode the string
print(theHash)
TASK: MD5 and Python
There are several tools we can use to compare hashes.
Write a program that takes user input and turns it into a hashed value for the moment just use MD5 as the hash
- Hash a few words (don't use your password)
- Compare the output of your program with that given by Cyberchef
- Hash the term
password
- Use Crackstation to see if the hash is in its dictionaries
- What about
Passw0rd!
(Using a "good" password?)
Using a Stronger Hash: SHA256
While MD5 is now not recommend to be used. The SHA family of hashes are still considered Secure (as long as we have byte sizes of 256 bytes and above).
We can modify our code to make use of the SHA
family of hashes
import hashlib
plaintext = b"foobar" #Note we need it to be a byte string
theHash = hashlib.sha256(plaintext).hexdigest() #Convert and get the message digest
print(theHash)
import hashlib
plaintext = "foobar" #This time we have a string as input
theHash = hashlib.sha256(plaintext.encode()).hexdigest() #Note we need to encode the string
print(theHash)
TASK: SHA256 and Python
Update your code to use the SHA256 Hash, and repeat the task
- Compare the output of your program with that given by Cyberchef
- Use Crackstation to see if the hash is in its dictionaries
- What about
Passw0rd!
(Using a "good" password?)
The Greater the number of bytes the more "secure" the hash is (ie 256 < 512).
- Rework the code to use SHA512. Does Crackstation still find it ??
Password Cracking
While hashing might make it impossible to derive a plain text password from its hash, flaws in the way we hash the password itself can make it easier for an attacker to find it.
The approach we might take to build a password cracker would be:
- Get a list of common passwords
- For each item in the list
- Check the hash of
item
against our target hash - If we have a match. We have found the password
- Check the hash of
In python (and the md5) this looks something like this
import hashlib
#Our List of passwords
passwordList = ["foo", "bar", "baz"]
#Our Target Hash
target = "37b51d194a7513e45b56f6524f2d51f2" #bar from cyberchef
#Loop through all items in the password List
for item in passwordList:
#Create a hash for the item
theHash = hashlib.md5(item.encode()).hexdigest()
#Compare to our target
if theHash == target:
print ("Password Match found {0}".format(item))
Finding Lists of passwords
So where do we get a list of common passwords from?
There are several lists available freely on the web. For example seclists has lists of common:
- usernames
- passwords (The top 10 Million)
- admin URLS
Note
Yep, another hacking myth busted. There is no phoning our dark web contact on a burner mobile. Then meeting (in hoodies obviously) in a graveyard somewhere to swap USB sticks
Lets grab the top 10,000 passwords from the list and update our program.
import hashlib
import time #So we can see how long things take
#Our List of passwords in a file
fd = open("10-million-password-list-top-10000.txt", "r")
#Our Target Hash
target = "dd8fcb2c31ee2c6ebbc63f8cf22e7c16" #Unknown Hash
startTime = time.time()
match = None
#Loop through all items in the password List
for item in fd:
item = item.strip() #Remove New Lines
print ("Chekking {0}".format(item))
#Create a hash for the item
theHash = hashlib.md5(item.encode()).hexdigest()
#Compare to our target
if theHash == target:
print ("Password Match found {0}".format(item))
match = item
endTime = time.time()
print "Match is {0}".format(match)
print "Total of {0} Seconds".format(endTime - startTime)
TASK: Cleaning up
There is a a lot of Noise in our output.
- Turn the code into a function
- Stick the Code in the Test Harness (see Below)
- Clean up the output so we just display any matches
Test Harness Code (in the Repo)
import time
import hashlib
def simpleCrack(wordlist, target):
"""
Code to crack an md5 hash using a dictionary
@param wordlist: List of dictionary Words
@param target: The Hash we want to crack
"""
pass #Your code goes here.
if __name__ == "__main__":
#Our word list
wordlist = open("10-million-password-list-top-10000.txt", "r")
#Our Target Hash
target = "dd8fcb2c31ee2c6ebbc63f8cf22e7c16" #Unknown Hash
startTime = time.time()
simpleCrack(wordlist, target)
endTime = time.time()
print ("Total of {0} Seconds".format(endTime - startTime))
TASK: Checking Multiple Passwords
Lets imagine we have a list of password hashes to break We have two approches we can take here
- Rerun the code for each password hash.
- Pre compile a list of hashes and check against them
Create a new function called multiplePass.
Use the code you developed, and one of the approaches above to crack the following list
of Hashes. Whats the difference in the Time Taken?
- 24eb05d18318ac2db8b2b959315d10f2
- ab4f63f9ac65152575886860dde480a1
- 11a7f956c37bf0459e9c80b16cc72107
- 0e2b0bb3a925c7001d953983f9de9823
- bd1e13bdaab82581d4dc299eb9a3da0f
Adding Some Salt
You should have seen that its pretty easy to crack a decent number of passwords (even on the uni hardware) in a short period of time.
Even more importantly, you can see that as the hash for a given value wont change, we can pre-calculate all of the possible hashes, then just lookup the hashed password against our list. Meaning we do lots of work once.
A salt gives us a way of stopping this kind of lookup attack happening. As it would mean that we need to create a list of hashes for every possible salt value.
A Simple Salting Approach.
Lets start with a simple, Hardcoded (and therefore static) salt. I am going to chose to add the word SALT to the plaintext.
A function to salt the password using this can look something like this.
def simpleSalt(plaintext):
return "{0}SALT".format(plaintext)
plaintext = "foobar"
salted=simpleSalt(plaintext)
# Salted is now "foobarSALT"
doHashing(salted)
TASK: Cracking Salted Passwords
The following list of hashes have been salted using the strategy above. (They are SIMPLE_SALT in the repo)
- Create a new function to deal with the simple salt and try to break the hashes
- Compare the times of the two functions. Is there a significant difference?
- What does this tell us about our salting strategy
Password List
- 283140d63e0937fb652ff7066bbf5c2f
- ba7c94b0431f30103c7eb5cdae180be6
- ff0e0cefdceb54618f47767d17b95a12
- ef98a984f8ab1341039f9f3344d80298
- 25e2262b5d8c95f7ece0bc4f30f5213d
A sensible salting approach
You should see that having one salt for everything is a bad idea, as it means we just need to regenerate the password list using this salt.
A more appropriate approach is to use a random salt for each of the passwords. This salt should be a decent length, to deter people compiling rainbow tables for each of the possible salt values.
Storing Salts
As we discussed, keeping the salts and passwords together is the usual practice (the salt is not for secrecy, but to deter brute force)
Lets say our salting strategy is to add a random string of 6 alphanums to end of the the plaintext
We can have a function to generate a salt
def genSalt():
"""
Generate a Salt
@param plaintext: plaintext password
@return A tuple of (saltedPw, salt)
"""
salt = []
for x in range(6):
salt.append(random.choice(string.ascii_letters))
return "".join(salt)
We can then create our salted plaintext ready for hashing, by appending this to the plaintext
saltedplaintext = "{0}{1}".format(plaintext, salt)
We also need a function to store the salt with the password. Lets stick with
the Linux strategy of storing them as <salt>$<password hash>
storedPw = "{0}${1}".format(salt, passwordhash)
TASK: Salted Passwords
- Write code that will generate a salted password
- Write code that will check a salted pw string, against a word supplied by the user
- Using the dictionary try to break the following hashes. (MD5_Salt in the repo)
-
How does this effect the performance of our hash cracker?
-
rkKxLR$c35bf00b953186ec3be4916dd45deabc
- MyEtoS$b0a56a1df2353c7629509a12a17f5a2d
- kshhHk$9664b09bedf4ed95f1b7b024087cec12
- FMVbrf$6751d1e9a8ee57b383836596869cd94a
- wKBUOm$981132067d1a4ef9e943b8c300071a55
Performance of different hashing algorithms
Some hashing algorithms (like bcrypt) are "slow", this means it takes much longer for them to generate the hash. This is beneficial when it comes to storing things like passwords, as it makes it much more computationally intensive to generate hashes from a given password list.
TASK: Comparing "Fast Hashing Algorithms
We alrady have code to crack a salted md5 hash.
Create a new function to crack salted SHA512 hash, and compare its performace
- kuxYZE$c4d82bdbd75562f3d6c6622dbf38663dd381665365cc54fa1f8a341bc705f93e6a4f9ef135765ed1cbf4d2ab115395b2439af4b0140b80e89551b5b02c3d4788
- gMpNRt$d5b35a1fdba16224c0588de40513968aefc36e0932df8b64fceac5314ab64fcbe71a0a91d34dd652d20eb679a790ea0a631bd4a4eaa4b6f4ea1bccd26159c48d
- VLzSQA$a5eaac2093c3178550c0469be27e55e0d8931ce0c38db7a23106cedf5f9f4008753e8e055d705a48c5ccf8cbddce7622b8e72d22b58f4048faf0750ba77e13e7
- wVsUgH$5057743ba865faa4fedb0aa59a9792cba3c7b308ba842d75e4974786af0f2d386a2cb6d93748aac0b277648d4bf4a3dcadb207eefb73d145b2531d695b0feb84
- xTGhNw$2eaceffd81c2cd6819afc3c91bcc6c493a2edd965682eec6894356bf3d1841c58623c0830ba4fc8d96864eb9f998236feb73985668b5badc996c33e5e586361c
Bcrypt is a "slow" hashing algorithm. Lets see how this compares
TASK: Bcrypt
Using the python bcrypt libary. Write:
- A function that Will salt and hash user input (NOTE: Bcrypt will take care of storing the salt for you)
- A function that will take a dictionary word and compare it to the hashed password
- A tool that uses these functions to brute force the following dictionary words (Bcrypt).
-
What's the difference in performance here?
- $2b$12$CQdfZxhjQH83jY.RZNfGPegUWSL05J5Amekp5pmjXkAgvtuBynZzy
- $2b$12$4YiyGckKJXQGKMkGwOqM5.JKTjJIMlwwdfHYHtLhfc38lFU9iZicK
- $2b$12$8vMax8thp0D7G2yns8gY7unUmdZiiJKJrjhW7pRtoo.X4YDObRVCK
- $2\b$12$JE6BgAfbDdLgWNnMuaqyeuvhmXECqsnZPMXqO3Rv.pxrzT8Zn4wqy
- $2b$12$YINaek/4qU8/k2egmjZxn.MYU7lVehNi8p94iEDFvM7MaS3BUlZKK
Non Dictionary Attacks
While we have looked at dictionary words so far (and its true that many passwords are dictionary words), we should also consider strong passwords, of the type a random password generator would give us.
TASK: Cracking Random Passwords
Code to get you started is in the random passwords program in the github.
This will generate random passwords of a given length using both SHA-256 and BCRYPT
-You will need to come up with a strategy for trying to brute force these passwords. - Crack passwords of different lengths. - How many attempts do we need to make for a password of a given length? - How long does a random password need to be before it becomes harder than a dictionary word - Does salting the password make any difference here? - How long does a random password have to be, before it take longer to crack than something in the top 1 million list.