In Detail Worked Example
Here we go through the worked example in detail.
We use plenty of diagrams to explain what is going on, which should make things clearer.
Example Code
int add(int var1, int var2){
//Add two numbers
int total;
total = var1+var2;
return total;
}
void main(int argc, char *argv[]){
//Function call
int total = add(10, 20);
}
Let's take the code apart in a disassembler to see what the instructions to the processor look like.
If we run it through a disassembler, our main function looks like this:
0x000011a9 <+0>: push ebp
0x000011aa <+1>: mov ebp,esp
0x000011ac <+3>: sub esp,0x10
0x000011af <+6>: call 0x11cb <__x86.get_pc_thunk.ax>
0x000011b4 <+11>: add eax,0x2e4c
0x000011b9 <+16>: push 0x14
0x000011bb <+18>: push 0xa
0x000011bd <+20>: call 0x1189 <add>
0x000011c2 <+25>: add esp,0x8
0x000011c5 <+28>: mov DWORD PTR [ebp-0x4],eax
0x000011c8 <+31>: nop
0x000011c9 <+32>: leave
0x000011ca <+33>: ret
The first three lines are the prologue for the main function. This is where we set up its stack frame.
Push the Arguments for Main onto the Stack
NOTE: This takes place outside of the main function, in the program's entry point.
To start building our stack frame, the arguments for the function (argc, argv) are pushed onto the stack in reverse order.
The call to main also pushes the return address, which is the location in the entry point code to return to when main finishes.
Remember that the stack pointer is automatically updated by the CPU as each item is pushed.
Update the Pointers for our Stack Frame
We next update our registers to hold details of the current stack frame
0x000011a9 <+0>: push ebp
First we save the current base pointer by pushing it onto the stack. This means that when the current function returns, we can reset the stack to the state it was in (i.e. we can restore the BP register to the caller's value).
There is also some housekeeping to do with the registers here.
0x000011aa <+1>: mov ebp,esp
We now need to make sure that our base pointer register refers to the base of the current stack frame. This is important because the locations of our variables are calculated relative to the current base pointer.
To do this we simply set the base pointer to the same value as the current stack pointer.
Allocate Space for Variables
Our next step is to allocate space for the Variables
0x000011ac <+3>: sub esp,0x10
When we compile the program, the compiler examines any local variables for the current function, and allocates space for them on the stack. This is done by calculating the amount of space required for the variables and subtracting it from the current stack pointer.
You gotta subtract to add
It's a bit counterintuitive: we add space to the stack by subtracting. However, if you remember that the stack grows downwards from the high addresses, this makes sense.
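As a rough illustration of that arithmetic, here is a minimal sketch in C that models the stack pointer as a plain number. The helper names are made up for the example, and any starting address would do. Only the sizes (a 4-byte push and the 0x10-byte allocation) come from the listing above.

```c
#include <stdint.h>

/* Hypothetical model: the stack pointer is just a 32-bit number. */

/* push stores 4 bytes, so it moves the stack pointer 4 bytes lower. */
uint32_t after_push(uint32_t esp) {
    return esp - 4u;
}

/* sub esp,N reserves N bytes by moving the stack pointer N bytes lower. */
uint32_t after_alloc(uint32_t esp, uint32_t bytes) {
    return esp - bytes;
}

/* Stack pointer once the prologue has run:
   push ebp, then sub esp,0x10 (mov ebp,esp leaves esp unchanged). */
uint32_t esp_after_prologue(uint32_t esp_on_entry) {
    uint32_t esp = after_push(esp_on_entry);
    return after_alloc(esp, 0x10u);
}
```

Whatever address we start from, the stack pointer ends up 0x14 bytes lower: 4 for the saved base pointer and 0x10 for the locals.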
In our code the main function has one local variable total.
int total = add(10, 20);
It's an integer value (which is 4 bytes), yet we subtract 16 (0x10) from the stack pointer. This is most likely down to stack alignment rather than optimisation: GCC by default keeps the stack aligned to 16 bytes, so the space for locals is rounded up. The 4 bytes we need for total sit inside that 16-byte block, at EBP-0x4.
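One plausible reading of the 16 is that the compiler rounds the 4 bytes it needs up to a 16-byte multiple. Here is a minimal sketch of that rounding. It assumes a 16-byte granularity because that matches the listing, and round_up_16 is a made-up helper, not something the compiler exposes.

```c
#include <stdint.h>

/* Round a byte count up to the next multiple of 16.
   Assumption: the frame is padded to a 16-byte multiple. */
uint32_t round_up_16(uint32_t bytes) {
    return (bytes + 15u) & ~15u;
}
```

With this rule, the 4 bytes needed for total become the 0x10 we see in the sub instruction.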
Note
Notice that we don't push anything onto the stack here. We just change the stack pointer so that the correct amount of space is allocated, which means the memory is "uninitialised".
So what's in the memory that we have just allocated space for? The simple answer is "whatever was there before". This is usually random junk, but it could happen to be meaningful data left behind by an earlier function call.
Once this is done, we have taken care of creating the stack frame for our main function.
Get PC Thunk Ax???
The next lines of our disassembled program have nothing to do with our stack frame. (We can safely ignore them here.)
0x000011af <+6>: call 0x11cb <__x86.get_pc_thunk.ax>
0x000011b4 <+11>: add eax,0x2e4c
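What's happening here is that the code has been compiled as a PIE (Position Independent Executable). This means that none of the addresses are hardcoded; instead they are calculated relative to wherever the code has been loaded.
The thunk function copies its own return address (the address of the next instruction, 0x11b4) into EAX. The add instruction then applies a fixed offset to it, giving the program a known base address from which it can reach its data in memory.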
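The arithmetic behind those two instructions can be checked by hand. In this sketch (pic_base is a name made up for the example) we assume the thunk hands back the address of the instruction that follows the call, and the add then applies the constant from the listing.

```c
#include <stdint.h>

/* The thunk returns the address of the instruction after the call;
   the add that follows turns it into a fixed base address. */
uint32_t pic_base(uint32_t return_address, uint32_t offset) {
    return return_address + offset;
}
```

For main this is 0x11b4 + 0x2e4c, and for the add function it is 0x1194 + 0x2e6c. Both come out at 0x4000, so every function ends up with the same base no matter where its own code sits. That base is typically the Global Offset Table, although the listing alone does not confirm it.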
Getting Ready to Call Our Next Function
The next lines of code are where we get ready to call the Add function
0x000011b9 <+16>: push 0x14
0x000011bb <+18>: push 0xa
0x000011bd <+20>: call 0x1189 <add>
This isn't too complex. First our function parameters are pushed onto the stack in reverse order. (Remember this is 32-bit; on a 64-bit system we would usually pass them in registers instead.)
- push 0x14 (20) onto the stack
- push 0xa (10) onto the stack
Then the add function is called. The call instruction also pushes the return address (0x11c2, the instruction after the call) onto the stack.
This means our stack now looks like this
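To make the layout concrete, here is a small sketch in C that replays those three instructions on a toy stack. The toy_stack type and its helpers are invented for the illustration. A real stack lives in process memory, not in a 16-word array.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Toy machine: a small array stands in for stack memory.
   Index 0 is the lowest address; the stack grows towards it. */
struct toy_stack {
    uint32_t words[STACK_WORDS];
    int top;                /* index of the most recently pushed word */
};

void toy_init(struct toy_stack *s) {
    s->top = STACK_WORDS;   /* empty: nothing pushed yet */
}

/* push: move the top down one word, then store the value there. */
void toy_push(struct toy_stack *s, uint32_t value) {
    s->top -= 1;
    s->words[s->top] = value;
}

/* Read the word n slots above the top of the stack (0 = top). */
uint32_t toy_peek(const struct toy_stack *s, int n) {
    return s->words[s->top + n];
}

/* Replay the three instructions from main. */
void toy_call_add(struct toy_stack *s) {
    toy_push(s, 0x14);      /* push 0x14: second argument, 20   */
    toy_push(s, 0x0a);      /* push 0xa:  first argument, 10    */
    toy_push(s, 0x11c2);    /* call: pushes the return address  */
}
```

Reading from the top of the stack upwards after toy_call_add, we find the return address 0x11c2, then 0xa, then 0x14.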
The Add Function
The call instruction transfers control to the add function, which looks like this in the disassembler:
gef➤ disass add
Dump of assembler code for function add:
0x00001189 <+0>: push ebp
0x0000118a <+1>: mov ebp,esp
0x0000118c <+3>: sub esp,0x10
0x0000118f <+6>: call 0x11cb <__x86.get_pc_thunk.ax>
0x00001194 <+11>: add eax,0x2e6c
0x00001199 <+16>: mov edx,DWORD PTR [ebp+0x8]
0x0000119c <+19>: mov eax,DWORD PTR [ebp+0xc]
0x0000119f <+22>: add eax,edx
0x000011a1 <+24>: mov DWORD PTR [ebp-0x4],eax
0x000011a4 <+27>: mov eax,DWORD PTR [ebp-0x4]
0x000011a7 <+30>: leave
0x000011a8 <+31>: ret
We have seen quite a lot of this before, so we can skip the more detailed explanation.
Add Function Prologue
0x00001189 <+0>: push ebp
0x0000118a <+1>: mov ebp,esp
0x0000118c <+3>: sub esp,0x10
0x0000118f <+6>: call 0x11cb <__x86.get_pc_thunk.ax>
0x00001194 <+11>: add eax,0x2e6c
As before we:
- Push the current value of the base pointer onto the stack
- Do the housekeeping to update the base pointer to the base of the current stack frame
- Allocate space on the stack for local variables (int total;)
- Do the calculations required for PIE binaries
Our stack now looks like this
Running the Code
Nearly there. We have done a lot of pushing data to the stack, but now we are finally ready to do the calculations.
First we want to load the function arguments (add(10,20)) into some registers so we can use them.
0x00001199 <+16>: mov edx,DWORD PTR [ebp+0x8]
0x0000119c <+19>: mov eax,DWORD PTR [ebp+0xc]
0x0000119f <+22>: add eax,edx
This also needs a bit of thinking about. The instructions
0x00001199 <+16>: mov edx,DWORD PTR [ebp+0x8]
0x0000119c <+19>: mov eax,DWORD PTR [ebp+0xc]
are telling the processor to "load the (double word) value that is 0x8 bytes above the base pointer into the EDX register". Remember that the stack grows down towards the lower addresses, so anything pushed before the base pointer was set sits at a higher address. Counting up from EBP we have the saved base pointer at EBP+0x0, the return address at EBP+0x4, and our first argument (10) at EBP+0x8. The same thing happens with our second argument (20) at EBP+0xc, which is loaded into the EAX register.
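One way to picture those offsets is as a C struct laid over the memory that starts at the address held in EBP. The struct below and its field names are purely illustrative. The compiler does not generate anything like it, but the offsets line up with the ones in the disassembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Memory at and above the address held in EBP, lowest address first.
   The struct and its field names are illustrative only. */
struct frame_view {
    uint32_t saved_ebp;       /* [ebp+0x0] pushed by "push ebp"   */
    uint32_t return_address;  /* [ebp+0x4] pushed by "call"       */
    uint32_t var1;            /* [ebp+0x8] first argument (10)    */
    uint32_t var2;            /* [ebp+0xc] second argument (20)   */
};

/* Offsets the disassembly uses to reach each argument. */
size_t var1_offset(void) { return offsetof(struct frame_view, var1); }
size_t var2_offset(void) { return offsetof(struct frame_view, var2); }
```

The two helpers give 0x8 and 0xc, the same displacements used by the two mov instructions.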
Note
On a 64-bit architecture this is usually much easier. Remember that the first 6 integer function arguments are passed using registers (in the System V calling convention used on Linux), so we can generally avoid the whole push -> retrieve cycle.
Once we have our values stored in the registers, we can add them together and store the result in EAX.
0x0000119f <+22>: add eax,edx
The next set of lines deals with storing the result and preparing to return.
0x000011a1 <+24>: mov DWORD PTR [ebp-0x4],eax
0x000011a4 <+27>: mov eax,DWORD PTR [ebp-0x4]
- We store the value we currently have in the EAX register (which is the total for our calculation) in the space allocated for it. Again we have this addressing offset from the base pointer. This time we move towards the top of the stack, to the address 4 bytes below EBP (a lower address, inside our own stack frame).
- We then immediately take the value back out again and store it back in EAX, ready for the function to return.
Note
You may have noticed this step is redundant, as we are essentially shuffling the same value to and from a register:
- We calculate the sum of our two numbers and store it in EAX
- We put the value of EAX in EBP-0x4
- We take the value at EBP-0x4 and store it in EAX
I assume this is because the code was compiled without optimisation, so the compiler translates each C statement on its own: total = var1+var2; stores the result in total, and return total; loads it back, without noticing that the value is already in EAX.
Most compilers are pretty smart about how they optimise things, so with optimisation switched on we would expect this round trip to disappear.
Leaving the Function
Once we have done the calculations we are ready to return the stored value. We have already put it in the relevant register (EAX) in the step above.
0x000011a7 <+30>: leave
0x000011a8 <+31>: ret
So what happens when the leave and ret instructions run? We need to restore the stack to the state it was in before our function was called.
- leave first copies the base pointer into the stack pointer, which discards our local variables. It then pops the stored base pointer from the stack and puts it back in the BP register.
- ret then pops the return address from the stack into the instruction pointer. This is the important bit, as it gives us our opportunity to control the IP, and thus program flow.
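The two steps above can be sketched on a toy machine. Everything here (toy_cpu, toy_pop and friends) is invented for the illustration, and the registers hold array indices rather than real addresses. The behaviour modelled is that leave acts like mov esp,ebp followed by pop ebp, and ret acts like pop eip.

```c
#include <stdint.h>

/* Toy 32-bit machine: registers hold indices into a word array
   rather than real addresses (lower index = lower address). */
struct toy_cpu {
    uint32_t mem[16];
    uint32_t esp, ebp, eip;
};

/* pop: read the word at the top of the stack, then move esp up. */
uint32_t toy_pop(struct toy_cpu *c) {
    uint32_t value = c->mem[c->esp];
    c->esp += 1;
    return value;
}

/* leave = mov esp,ebp ; pop ebp */
void toy_leave(struct toy_cpu *c) {
    c->esp = c->ebp;          /* discard the locals */
    c->ebp = toy_pop(c);      /* restore the caller's base pointer */
}

/* ret = pop eip */
void toy_ret(struct toy_cpu *c) {
    c->eip = toy_pop(c);      /* jump back to the caller */
}
```

Notice that toy_ret simply trusts whatever word is sitting at the top of the stack. If that word has been overwritten, the instruction pointer goes wherever the new value says.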
Completing the Main Function
Finally, let's get back to what remains of our main function. You have seen all of this before, so we can go a bit quicker.
0x000011c2 <+25>: add esp,0x8
0x000011c5 <+28>: mov DWORD PTR [ebp-0x4],eax
0x000011c8 <+31>: nop
0x000011c9 <+32>: leave
0x000011ca <+33>: ret
- We adjust the stack pointer to take account of the function call being completed. Adding 0x8 removes the two 4-byte function arguments (10 and 20) from the stack.
- We move the contents of the EAX register (which holds our return value) into the slot we allocated for it (EBP-0x4).
- The main function is also complete, so we leave and return. (The nop does nothing.)
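The 0x8 in the first step is not arbitrary. As a quick sketch, assuming every argument was pushed as a single 4-byte word (true for the two ints here), the amount the caller has to add back to the stack pointer is just the argument count times four. The helper name is made up for the example.

```c
#include <stdint.h>

/* Bytes the caller must add back to ESP after a call,
   assuming every argument was pushed as one 4-byte word. */
uint32_t caller_cleanup_bytes(uint32_t arg_count) {
    return arg_count * 4u;
}
```

For add(10, 20) that is two arguments, giving the 0x8 we see in add esp,0x8.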