4061CEM - Programming and Algorithms 1

Project: Brute Force

This project aims for you to develop a tool that will gain access to a collection of executable binaries that hold secret information. This information is protected by a password and your task is to build a tool that can 'break in' by attempting to use multiple passwords.

Setting up the Project

To begin this project, you will need to clone the repository, which is accessible at the following URL:

https://github.coventry.ac.uk/CUEH/4061CEM_BruteForce

You should create a fork of the repository first, and then clone it. This will create a local copy which enables you to make edits.

Setting up a Virtual Environment

As Python is a popular programming language, there are a lot of libraries/modules available to help you achieve certain functionality with minimal effort. However, when you write code that relies upon these libraries/modules, there may be instances where they do not exist on the target platform, or they are incompatible with other libraries/modules on the target system.

In this instance, a virtual environment is used. Unlike a virtual machine, a virtual environment is simply a directory that contains a copy of the Python interpreter and its libraries, which are localised for that project only.

For this project, you will use a virtual environment. In order to enable this, you will need to create the new environment using the following command in the root directory of the repository 4061CEM_BruteForce:

$ python3 -m venv brute_venv

This action is only performed once, so there is no need to repeat it each time you work on the project. The only instance in which you would repeat this process is when the development machine has been changed.

To use the virtual environment, you need to activate it from within the root directory of the project using the following command:

$ . ./brute_venv/bin/activate

Note the extra period (.) at the start. It is shorthand for the shell's source command, which runs the file brute_venv/bin/activate inside the current shell so that the environment variables it sets remain in effect. This action needs to be repeated in each new terminal session in which you work on the project.
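If you are unsure whether the activation worked, you can ask Python itself. The following is a minimal sketch rather than part of the project: the function name in_virtualenv is illustrative, and it relies on the fact that, inside an environment created with venv, sys.prefix points at the environment while sys.base_prefix points at the base installation.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix is the environment directory, whereas
    # sys.base_prefix is the installation the environment was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```

Running this with the environment activated should report True and show a prefix that ends in brute_venv.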

Installing the Libraries/Modules

When it comes to using the particular libraries/modules required for this project, you can install them using the following command:

(brute_venv) $ pip install -r requirements.txt

or

(brute_venv) $ python3 -m pip install -r requirements.txt

The (brute_venv) prefix before the dollar symbol ($) denotes that you are working inside the virtual environment.

pip is the Python package installer; it reads the contents of requirements.txt and downloads and installs the libraries/modules that are listed within it. This action is only performed once, so there is no need to repeat it each time you work on the project. The only instances in which you would repeat this process are when the development machine has been changed, or when you have added libraries/modules to the file.

Automating the Environment Set-up

Included as part of the project is a Makefile. This file consists of rules (known as targets) that the make command can read and execute in the terminal. This file will not be discussed in-depth, but it enables you to set up the virtual environment and install requirements with ease by using two commands:

$ make venv

This will create the virtual environment, and once activated you can install the pre-requisite libraries and modules by using:

$ make prereqs

Beginner Task

Your first task will be to look at the code and identify what each piece of it does. Once you have gained some familiarity with the project, you are required to create a tool to break into all the binaries that are in the 'basic set'.

The binaries for this project can be found in the targets folder of the project repository. The tool should be able to break into every binary in the basic set. The passwords within this set are made up of the digits zero to nine and have a maximum length of three.
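The candidate passwords for the basic set can be enumerated with the standard library. The sketch below is one possible approach, not the project's prescribed solution: the function name digit_candidates is hypothetical, and it assumes the shortest password has one character (the brief only states a maximum length of three). How each candidate is then passed to a binary depends on the code in the repository and is not shown here.

```python
from itertools import product
from string import digits


def digit_candidates(max_length: int = 3):
    """Yield every digit-only password of length 1 up to max_length."""
    for length in range(1, max_length + 1):
        # product() yields tuples such as ('0', '0', '7'); join them into "007".
        for combo in product(digits, repeat=length):
            yield "".join(combo)


if __name__ == "__main__":
    candidates = list(digit_candidates())
    print(len(candidates))                   # 10 + 100 + 1000 = 1110
    print(candidates[:3], candidates[-1])    # ['0', '1', '2'] 999
```

Because the function is a generator, candidates are produced one at a time, so the tool can stop as soon as a binary accepts a guess.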

How many Passwords?

Imagine that each password is between one and six characters long and uses only the \(26\) letters of the alphabet. For each length \(l\), the number of passwords that are possible for that length is equal to \(26^l\).

Example

Consider a single-character password: it must be one of the \(26\) letters that are in the alphabet (\(26^1\)). If you have two characters, then the passwords run from \(aa\) to \(zz\). Every letter is combined with every letter (including itself); this is equal to: \(26 * 26 = 26^2\)

For three characters, it is \(26\) times the total for two characters, because each letter in the first position can be followed by every possible two-character password. Therefore, this is equal to: \(26 * 26 * 26 = 26^3\).

For all the passwords, you need the total for one character, plus the total for two characters, and so forth up to six characters. This would be \(\sum\nolimits_{l=1}^{6} {26^l}\). The actual number can be calculated in Python using the following code:

# Values for the shortest and longest passwords
minLength = 1
maxLength = 6

# The number of characters in the alphabet
numCharacters = 26

# An accumulator to store our calculations to
total = 0

# range() stops before its second argument, so maxLength+1 is used to
# include maxLength itself.
for l in range(minLength, maxLength+1):
  possible = numCharacters ** l
  print(f"Length {l} and Characters {numCharacters}: {possible} possible passwords.")
  total += possible

print(f"\nTotal Possible Passwords: {total}")

The output of this code is the following:

Length 1 and Characters 26: 26 possible passwords.
Length 2 and Characters 26: 676 possible passwords.
Length 3 and Characters 26: 17576 possible passwords.
Length 4 and Characters 26: 456976 possible passwords.
Length 5 and Characters 26: 11881376 possible passwords.
Length 6 and Characters 26: 308915776 possible passwords.

Total Possible Passwords: 321272406

The total number of possible passwords is \(321,272,406\), which is just over \(320\) million. This is not a small task, but it is within the realm of possibility. If you can manage \(100\) tries per second, then you will need around \(3.2\) million seconds (or \(37\) days) to try every password.
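The same estimate can be reproduced for any alphabet size and attempt rate. The following sketch uses two illustrative helper names (search_space and days_to_exhaust, neither of which comes from the project) and cross-checks the loop-based total against the closed form of the geometric series, \(\sum_{l=1}^{6} 26^l = \frac{26(26^6 - 1)}{26 - 1}\).

```python
SECONDS_PER_DAY = 86_400


def search_space(alphabet_size: int, min_length: int, max_length: int) -> int:
    """Count the passwords of every length from min_length to max_length."""
    return sum(alphabet_size ** length
               for length in range(min_length, max_length + 1))


def days_to_exhaust(total: int, tries_per_second: float) -> float:
    """Days needed to try every password at a constant attempt rate."""
    return total / tries_per_second / SECONDS_PER_DAY


total = search_space(26, 1, 6)
closed_form = 26 * (26 ** 6 - 1) // 25     # geometric series, ratio 26
assert total == closed_form == 321_272_406

print(f"{total:,} passwords")                                   # 321,272,406 passwords
print(f"{days_to_exhaust(total, 100):.1f} days at 100 tries per second")   # 37.2 days
```

For comparison, search_space(10, 1, 3) gives 1,110 candidates for the basic set (assuming a minimum length of one), which would take about eleven seconds at the same rate.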

Intermediate Task

Generally, most passwords are not this simple: they can be any length and contain a mixture of letters, cases, numbers and punctuation. Therefore, using a brute-force approach on this problem would be difficult because of the number of possible passwords.

Using the same logic as expressed in the beginner task, and assuming each character can be an uppercase or lowercase letter, a digit or one of ten punctuation marks, the number of possible eight-character passwords is: \((26 + 26 + 10 + 10)^8\) or \(72^8\), which is \(722,204,136,308,736\); that is seven hundred and twenty-two trillion, two hundred and four billion, one hundred and thirty-six million, three hundred and eight thousand, seven hundred and thirty-six possible passwords.

If the tool in this project were to deal with arbitrary-length passwords, then you would need to reduce this number. Luckily, many passwords in use are made up of dictionary words with modifications. For example, the password could be: "swordfish", "sw0rdf1sh" or "swordfish1" etc.

For this task, you will need to break into the intermediate binaries. You will need to extend the tool to try such modifications. Some of these binaries will require just the dictionary word, and others will have a single-digit number appended to the end (e.g. "swordfish1"). Some of these passwords will be in "leet speak", whereby a character is replaced by a number, as shown in the table below.

Character    Number
O or o       \(0\)
I or i       \(1\)
E or e       \(3\)
A or a       \(4\)
S or s       \(5\)

For example, the word "dishwater" would become "d15hw4t3r". There are other variants of this that would replace more characters, but for this task you are only concerned with those shown in the table above.

The base dictionary for this task is provided in the dictionaries/base.txt file in the 4061CEM_BruteForce root directory.
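The modifications described above can be generated from each dictionary word before it is tried. The sketch below is only one way to do this and makes several assumptions that the brief does not confirm: the names LEET, load_words and variants are hypothetical, the dictionary is assumed to hold one word per line, every character listed in the table is replaced at once (as in the "dishwater" example), and leet speak may be combined with an appended digit.

```python
# Translation table built from the leet-speak table in the brief.
LEET = str.maketrans("OoIiEeAaSs", "0011334455")


def load_words(path: str = "dictionaries/base.txt"):
    """Yield the non-empty lines of the dictionary (assumed one word per line)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            word = line.strip()
            if word:
                yield word


def variants(word: str):
    """Yield the word and its leet form, each with and without one appended digit."""
    bases = [word]
    leet = word.translate(LEET)
    if leet != word:            # skip the duplicate when no character changes
        bases.append(leet)
    for base in bases:
        yield base
        for digit in "0123456789":
            yield base + digit


if __name__ == "__main__":
    print(list(variants("dishwater"))[:3])   # ['dishwater', 'dishwater0', 'dishwater1']
    print("d15hw4t3r" in variants("dishwater"))   # True
```

Each word produces at most 22 candidates, so the search space grows linearly with the size of the dictionary rather than exponentially with the password length.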

Advanced Task

For this task, you will need to consider passwords of any length, with (almost) all the ASCII characters as valid symbols. The 'basic' challenge can be completed by attempting all combinations of characters, but this would be far too time-consuming for this advanced task.

The target binary has a quirk that can speed things up. When a password is checked, it will compare the guess with the real password one character at a time. If the first characters do not match, it will reject the password immediately. If the first characters do match, then the binary will check the next pair of characters and so on. This means that a check takes slightly longer for each additional correct character, although the difference is small. If you try each possibility enough times and average out how long the checks take, then you can infer which character is correct.
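The averaging idea can be sketched without touching the real binary. In the example below, every name is hypothetical: infer_next_char accepts any callable that returns the time taken by one guess, and make_simulated_timer stands in for the target by returning a small delay per matching character plus seeded random noise. In a real tool, the callable would instead time a run of the target binary (for example with time.perf_counter), and the way the binary receives a guess depends on the repository code.

```python
import random
import statistics
import string
from typing import Callable


def infer_next_char(prefix: str, alphabet: str,
                    time_guess: Callable[[str], float], trials: int = 20) -> str:
    """Return the character whose guesses take longest on average."""
    averages = {}
    for ch in alphabet:
        samples = [time_guess(prefix + ch) for _ in range(trials)]
        averages[ch] = statistics.mean(samples)
    return max(averages, key=averages.get)


def make_simulated_timer(secret: str, seed: int = 0) -> Callable[[str], float]:
    """Build a stand-in for the binary: more matching characters, longer 'time'."""
    rng = random.Random(seed)

    def time_guess(guess: str) -> float:
        matched = 0
        for g, s in zip(guess, secret):
            if g != s:
                break           # the comparison stops at the first mismatch
            matched += 1
        # One microsecond per matched character, plus measurement noise.
        return matched * 1e-6 + rng.gauss(0, 1e-6)

    return time_guess


if __name__ == "__main__":
    secret = "code"
    timer = make_simulated_timer(secret)
    recovered = ""
    for _ in range(len(secret)):   # length is known here only because it is a demo
        recovered += infer_next_char(recovered, string.ascii_lowercase,
                                     timer, trials=200)
    print(recovered)
```

With this approach the number of guesses grows with the alphabet size multiplied by the password length, instead of the alphabet size raised to the power of the length. A real tool would not know the length in advance, so it would need to test each recovered prefix against the binary and stop when one is accepted.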