Introduction

First lets have a look at what an overflow is, and how it occurs.

Demonstrating and Overflow

A buffer overflow occurs when we try to put too much data in a piece of memory. This is usually due to programmer error, with the developer breaking Rule #1 and trusting the users input.

Programming languages like C and C++ require us to specify the size of our data structures. You may be familiar with the following line of code to create a 10 Character String.

Note

Yes, I know, technically it's a character array, but the semantics here do not matter

char theString[15];

What we are doing here, is telling the program to create a variable to hold a block of data, up to 15 characters long. Space for this variable is allocated in memory, on the stack, so the program can store the value.

The problem occurs if we try to add more than 10 characters to the variable, C/C++ do not check the size of the array when we copy data into it (they expect the programmer to know what they are doing) which means the part of the stack following the variable gets corrupted.

Lets expand our example, and try to assign more characters to the array than it can take.

#include <stdio.h>
#include <string.h>

void main(void){
  char theString[15];
  //Copy a String that is longer than the space allocated
  strcpy(theString, "Hello World, This Is A Long String");
  printf("%s", theString);
}

When we compile, the compiler detects this and we get some warnings about overflowing the destination

$gcc OverflowExample.c
OverflowExample.c: In function ‘main’:
OverflowExample.c:7:3: warning: ‘__builtin_memcpy’ writing 34 bytes into a region of size 15 overflows the destination [-Wstringop-overflow=]
   strcpy(theString, "Hello World"); //12 Characters
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Running the code now gives us a Segmentation Fault (seg fault), this is the programs way of telling us we have tried to access a bit of memory outside of the current stack frame. If you have done any C style programming that requires looping through arrays, you will probably have seen this before, when you try to access an item outside the array.

$./a.out
[2]    23886 segmentation fault (core dumped)  ./a.out

Looping through a Buffer

Lets look at a second example of a segmentation fault. This may be more familiar from your expereinces of dealing with arrays in C/C++

int main(){

  int array[10];  //Array of 10 numbers

  //Loop past the end of the array
  for (int i=0; i<50; i++){
    array[i] = i;
  }
}

In this code we try to allocate values to an array. However, our logic is incorrect and we try to allocate outside of the array bounds. (We go attempt to access item 11 of a 10 element array)

This time our compiler gives no warnings about the problem, but when we run the code we get the same segmentation fault error, when the code attempts to access memory outside of that currently allocated.

How Program Memory Is Organised

So the is the problem we have is walking off the end of allocated memory. Part of the solution is understanding how memory is organised.

Note

Yes, this is fundemental computer science stuff.... Bear with it, it will make the rest a lot easier.

The Stack

Having seen what happens when we try to access unallocated memory, lets examine how the OS allocates memory.

THe stack is an area of computer memory designed to hold temporary variables created by functions. In programming terms, the stack is a simple data structure used to store information. We can place (push) items on the stack, and remove (pop) the top item.

The stack starts at the "top" of memory (the low memory addresses), and grows downwards, with the current top of the stack stored in a register called the stack pointer.

When a function is called, its parameters are added to the stack, and the stack pointer updated to represent the new top of stack value. When a function returns, data is not actually removed, but the stack pointer is updated to represent the new top of stack.

Example of a Stack

The Basic operation of a stack can be seen in the video below

The Heap:

The stack is not the only area of memory where we can store data. The heap is also used to store information. The heap has some advantages when storing data, is it can deal with larger data objects, and provides a global scope.

However, unlike the stack, data on the heap needs to be manually allocated and destroyed which adds to the complexity of managing the information (for example memory leaks).The heap can also be vulnerable to attack.

Registers

To manage the stack we need some way of keeping track of where we currently are. To do this we use registers. These are small chunks of memory (generally implemented in hardware) that act as pointers to addresses in our main memory.

In the x86 architecture the General Purpose Registers are used to store data and addresses related to the current operation.

Registers used for general operations, such as passing data to function calls.

AX
BX
CX
DX
R8-R15 (General Registers for 64 Bit Processors)

Registers used for Stream (or string) operations

SI (Source Index) Pointer to start of data input location
DI (Destination Index) Pointer to data output location

Registers used for Managing the Stack

BP: (Base pointer) Points to the current base of the stack
SP: (Stack Pointer) Points to the current top of the stack

We also have The Instruction Pointer (IP). This tells the processor the address of the next instruction to be executed.

Important

The architecture will add a prefix to the register we are addressing, to let us know how large it is.

32bit E (Extended)
64bit R (Register)

Thus: - in a 32Bit system our Instruction pointer becomes EIP - in a 64Bit system our Instruction pointer becomes RIP

x86 Registers

Function Calls and the Stack

The stack is an efficient structure for dealing with function variables (and removes the need for memory management and garbage collection required for the heap).

Next we are going to examine how function calls make use of the stack.

How Stacks help program flow

This is a really cool solution to a major problem. Without being able to dynamically allocate function calls on the stack, computing would be a very different place.

The stack allows us make function calls at "runtime", by pushing them onto the stack, then remove them after the function has finished. Without taking this approach we would need to dynamically allocate things in a similar way to the heap. While this has a significant downside in extra overhead and management, it is infinitely better than the other approach, of pre-calculating every state a program could be in, and allocating the memory.

When a function is called, the data associated with it is stored in something called a stack frame, this holds any information to do with the current function.

First any arguments are pushed onto the stack.

NOTE: There is a subtle difference beween 32 and 64 bit architectures here:
- A 32 Bit system will push the arguments via the stack
- In 64 Bit systems registers are used to pass the first 6 arguments.
The address of the Current instruction (the Instruction Pointer) is pushed

This acts as a Return Address and tells the program what address it needs to go back to once the function finishes.
The Current Base pointer is also pushed
Finally the space required for any local Variables for the function is calculated and allocated.

As well as adding the data, there is some extra housekeeping that occurs to allow the registers to keep track of the current state of the stack. The Base Pointer register (BP), is updated to hold the updated Base pointer that was pushed earlier. and the Stack Pointer (SP) register are updated to the new top of stack

The Image Below has an Example of a 32 Bit Stack Frame, in both "Tall" and "Wide" formats

Note

The tall format is closer to what the stack frame actually looks like

However, I find its more intruitive to transpose to the Wide format for the text. (It also works better on slides).

Therefore I will use the wide version so this material matches with the lecture notes.

Tall

Stack Frame

Wide

Transposed Stack Frame

Worked Example

Info

There is a more detailed explaination of this whrere we go through all the assembly instructions.

Lets look at how a stack frame grows with a worked Example. We can use the following code, that simply adds two numbers.

int add(int var1, int var2){
    //Add two numbers
    int total;
    total = var1+var2;
  return total;
    }

void main(int argv, char* argc){
    //Function call
    int total = add(10, 20);
}

Walkthrough: Main Function

First Lets Examine How our stack will call the Main function.

Arguments get pushed onto the stack in reverse order (If 32Bit System)

In this case our main function has two sets of arguments. int argc and char* argv These get pushed onto the stack in reverse order (so the first to be pushed is argc)

Note

Rememebr that 64Bit systems handle things slightly differently, and the first 6 arguments are handled by registers.
Current Instruction Pointer gets pushed onto the Stack

This acts as a return address for the when the function completes. Meaning we can POP the address off the stack into the Instruction Pointer and continue executing the code from where we left off.
Base Pointer Register gets pushed onto the stack

The Base pointer represents the bottom of the current stack frame, and is used for calculating variable addresses (and to help us get back to the return address when the function finishes). Here we simply Push the current SP register value onto the stack, and Update the BP register accordingly.
Any Space for local Variables is allocated

Finally, our program calculates the space requried for any local variables and increases the stack pointer by the relevant amount.

Important

Note that we only increase the pointer value (move the top of the stack) This means that any space allocated for variables is unititialised untill we actaully put data in there. Usually, its whatever random values were on the stack previously, but sometimes it might appear to make sense. Bear this in mind if you end up examining data.

Following the initialisation the Stack looks something like this:

Main Function Initialised

Calling the Calc Function

The next function call is to the Calc Function. Its the same process as with main.

Arguments to the function are pushed onto the stack

calc(int num1, int num2) has two arguments, these are pushed onto the stack in reverse order
Current Instruction Pointer is Pushed to act as return address
Base Pointer is Updated
Space for Local Variables is allocated

Again we only have one local variable, total so we move the stack pointer forwared by 2 bytes

Calc Function Stack

When the Calc Function Completes

When a function completes, its is removed from the stack (In reality we don't actually remove everything, but update the various pointers to the correct values)

Lets look at what happens when Calc Completes.

Update the Base Pointer

We POP the base saved based address off the stack, and update the base pointer register. This updates the base of the stack frame to be that of the previous function
Update the Instruction Pointer

We POP the saved return address off of the stack, and update the Intruction Pointer register. This moves the Instruction pointer back to where the function call occured.
The calling function cleans up the arguments

We stack pointer moved to remove the calling arguments from the stack

Updating the Instruction Pointer

Here we can see the Important part of the whole process when it comes to overflows. When we POP the return address to update the instruction pointer, it gives us the potential to control program flow. If we can control the address that is in that part of the stack. We can change the bahabilur of the code

A Video On Stack Frames

While stack frames give us a lot of flexibility in how programs work, the way the stack frames are crated is one part of the problem when it comes to overflows.

You can see how stack frames are created in this Video

Coding issues that lead to overflows

The overflow occurs when we try to write too much data to a variable. This will usually occur when we have to copy strings between location in memory, and is usually due to programmer error.

Lets consider the following code that "echo's" data back to the user.

void main(int argc, char* argv[]){

    //Define Variables
    char buffer[20]; 

    //Sanity check for command line arguments
     if (argc != 2){ 
         printf("Argument Required\n"); 
         }

    //Store input in the Buffer and print
    memcpy(buffer, argv[1], strlen(argv[1])); 
    printf("You Entered %s\n",buffer);

The faulty code is the memcpy line, while we use a "bounded" function call, the coder has set the number of bytes to copy as the length of the input, rather than the buffer. We can therefore give data of any length, and it will be copied across.

Info

Unfortunetly, several C and C++ methods dont place any bounds on the data being copied across, things like strcpy or gets are examples of this. Often the compiler will try to protect us by showing a warning message, but who worries about compiler warnings?

A Note on Protection

OS and Complier designers have known about overflows for a long time, there are several built in methods that are used to mitigate them. While these techniques can offer a decent level of protection for remote systems, we can often work around them if we have local access.

Stack Smashing Protection. Sets values called "canaries" on the stack, these are monitored as the program runs and if their value changes, the program execution stops. It is a pain to work around and it is now the default option when compiling with GCC. However, there are many older libraries without this protection activated.
No Exec Stack. Is a compiler level protection that stops data in the stack from being executed, This makes it much harder when inserting payloads into our overflows, as they will not run directly. Again we will look at defeating the NoExec in a later article
ASLR. Address Space Layout Randomization, is an OS based protection technique. When a program is run, its starting address on the stack is Randomized, meaning it is harder to guess addresses in the overflow. ASLR does a reasonably good job of protecting from the casual hacker, we will examine some techniques to working around it later in the course.
PIE Position Independent Execution. These binaries are compiled using Offsets for code location. The actual locations of the code instructions are then calculated at run time. This makes it hard to find instructions to call, as we cannot be sure of their location.

Summary

In this article we have had a brief introduction to buffer overflows, and looked at the memory management of function calls in running programs. In the next article we will see the stack frame can be exploited to modify program flow.

Introduction

Demonstrating and Overflow

How Program Memory Is Organised

The Stack

Registers

Function Calls and the Stack

Worked Example

Walkthrough: Main Function

Calling the Calc Function

When the Calc Function Completes

A Video On Stack Frames

Coding issues that lead to overflows

A Note on Protection

Summary

Links