
An Introduction to Buffer Overflow #4: Overwriting the Stack

Gurkirat Singh reveals how to overwrite the stack with buffer overflows and uncovers low-level vulnerabilities, providing detailed debugging insights.

Gurkirat Singh

Oct 18, 2023 • 6 min read

The striking illustration above, bursting with vivid and eye-catching colors, depicts two women with colorful liquid streaming from their eyes. This mesmerizing piece is crafted by Andrew Archer, a skilled illustrator from Sydney, Australia.

Hello World! By now you should have a basic understanding of how calling and returning work on the stack at a low level; if not, you can revise from here. Let's go a step further and overwrite the stack.

This post explains how a stack overflow occurs and how it affects the return address. I recommend reading this post first and then the next one, where I will go over buffer overflow basics and show you how easy it is to overflow a buffer (when stack canaries are disabled, of course).

I will use the following text throughout this post to analyse and overwrite the buffer in the program.

The easiest way to overwrite stack is to keep writing text, even though it doesn't make any sense like in this case. See I am still writing it.

Before we begin debugging, let me demonstrate how the overflow and overwriting of the return address will appear to a non-technical user who unintentionally triggered it while running the vulnerable program.

How does Overflow Look to a User?

So, I pasted the text into this program, which accepts input via the gets() function and then prints it on the screen with the printf() function. Both of these functions come from the C standard library.

Demo of a buffer overflow in the normally executing program.

If you convert the hex value to characters and reverse the byte order (x86 is little endian), you will see the substring tack. Does it ring a bell? If not, copy the input text, paste it into a text editor, and search for it. The result will be as follows.
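The conversion described above can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the post. The dword `0x6B636174` is an assumption: it is derived here as the little-endian reading of the bytes `tack`, because the actual value from the crash dialog is only visible in the screenshot. The helper name `dword_to_text` is made up for this example.

```python
import struct

# The input text used throughout this post.
payload = (
    "The easiest way to overwrite stack is to keep writing text, "
    "even though it doesn't make any sense like in this case. "
    "See I am still writing it."
)

def dword_to_text(value: int) -> str:
    """Lay a 32-bit value out as 4 little-endian bytes and decode them as ASCII."""
    return struct.pack("<I", value).decode("ascii")

# Assumed crash address: the little-endian dword formed by the bytes "tack".
crash_address = struct.unpack("<I", b"tack")[0]

text = dword_to_text(crash_address)
offset = payload.find(text)  # position of the substring inside the input

print(hex(crash_address), repr(text), offset)
# prints: 0x6b636174 'tack' 30
```

The offset of 30 is the interesting part: it says how many bytes of input come before the bytes that ended up in the return address.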

Debugging the Function Calls

As usual, the main function is the starting point of the program, and it calls function1(). There is nothing juicy going on in main, so we can focus on function1() here.

As you can see, the debugger helpfully prints the function names alongside their call instructions, so let's add a breakpoint on the gets function. I'm ignoring the printf function because it only prints the data. The gets function, however, reads data from stdin and saves it to the buffer allocated on the stack.

Why do I think the Buffer is on the Stack?

As I was studying, a question popped into my head. If you had the same question, give yourself a pat on the back. You're a genuine learner. Now, let's dive into the explanation. Examine the instruction right before the first CALL instruction within the function:

00401534  |. 8D45 E6        LEA EAX,DWORD PTR SS:[EBP-1A]

Here, an address is calculated by subtracting 0x1A bytes from the base pointer EBP (0x0022FE98). The result, 0x0022FE7E, is stored in the EAX register and lies within the current stack area. The SS in the operand indicates that the address is relative to the stack segment. In the next instruction, this address is written to the top of the stack (the memory location ESP points to), which is how it is passed to gets() as its argument.

00401537  |. 890424         MOV DWORD PTR SS:[ESP],EAX

This means that the buffer was not malloc'ed; instead, it is a fixed-size array created on the stack.

Destination buffer address for `gets()` is stored on the stack
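The address arithmetic behind the LEA instruction can be checked with a short Python sketch. The EBP value and the 0x1A offset come from the debugger session above. The positions of the saved EBP (at `[EBP]`) and the return address (at `[EBP+4]`) are an assumption based on the usual 32-bit frame-pointer layout, which the screenshots suggest function1() uses.

```python
# Values read from the debugger session described in the post.
EBP = 0x0022FE98        # base pointer inside function1()
BUFFER_OFFSET = 0x1A    # from LEA EAX,DWORD PTR SS:[EBP-1A]

# What LEA leaves in EAX: the address of the buffer handed to gets().
buffer_address = EBP - BUFFER_OFFSET

# Assumed standard layout: saved EBP at [EBP], return address at [EBP+4].
bytes_to_saved_ebp = EBP - buffer_address
bytes_to_return_address = (EBP + 4) - buffer_address

print(hex(buffer_address), bytes_to_saved_ebp, bytes_to_return_address)
# prints: 0x22fe7e 26 30
```

Under these assumptions, anything past the 26th input byte lands on the saved EBP, and bytes 31 to 34 land on the return address.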

The calculated address, 0x0022FE7E, falls between ESP and EBP, so it lies inside the current stack frame, and the MOV instruction stores it as a 4-byte value (DWORD PTR) at the location ESP points to. The frame itself exists because an earlier instruction reserves 0x38 (56) bytes for it by moving ESP down:

00401523  |. 83EC 38        SUB ESP,38

When gets() writes your input, it does not start at the top of the frame (0x0022FE60) but at 0x0022FE7E, which the debugger's stack view shows in the row beginning at 0x0022FE7C. In other words, the input starts a bit later in memory, partway into the frame.
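The frame layout can be reproduced with the numbers from the disassembly. The sketch below assumes that ESP equals EBP minus 0x38 right after the SUB instruction (nothing else pushed in between) and that the debugger's stack view groups memory into 4-byte aligned rows. Both assumptions match the addresses quoted above but are not stated outright in the post.

```python
EBP = 0x0022FE98
FRAME_SIZE = 0x38       # from SUB ESP,38
BUFFER_OFFSET = 0x1A    # from LEA EAX,DWORD PTR SS:[EBP-1A]

esp = EBP - FRAME_SIZE                  # top of the frame after SUB ESP,38
buffer_address = EBP - BUFFER_OFFSET    # where gets() starts writing

# Assumed: the stack view shows 4-byte aligned rows, so round down to a multiple of 4.
row_start = buffer_address & ~0x3
offset_in_row = buffer_address - row_start

print(hex(esp), hex(buffer_address), hex(row_start), offset_in_row)
# prints: 0x22fe60 0x22fe7e 0x22fe7c 2
```

So the first input byte sits two bytes into the row at 0x0022FE7C, and 0x1E (30) bytes above the top of the frame.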

I hope you get the idea by now, so let's continue debugging the function. I am currently at the point where the gets function is being called. Next, I will show the program's state when EIP (the instruction pointer) is pointing at the LEAVE instruction.

State of program when it is about to `LEAVE` and `RET`

When I pressed the debugger's Step Next button, the LEAVE instruction popped the value 0x73206574 into EBP. This was supposed to restore the caller's base pointer, but that saved value had been overwritten as well. The RET instruction then executed and popped the next value on the stack, the overwritten return address, into EIP. That value is the text tack, which is not a valid address: it is either not mapped in memory or the process does not have access to it. As a result, the program crashes with an "Access Violation" error message.

Result of continuing on invalid `EIP` value
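The LEAVE and RET steps can be replayed in Python on a toy model of the stack. This is a simulation under stated assumptions, not a trace of the real process: the `memory` array, the `read_dword` helper, and the frame layout (saved EBP at `[EBP]`, return address at `[EBP+4]`) are all illustrative. Only the EBP value, the frame size, the buffer offset, and the input text come from the post.

```python
import struct

payload = (
    "The easiest way to overwrite stack is to keep writing text, "
    "even though it doesn't make any sense like in this case. "
    "See I am still writing it."
).encode("ascii")

EBP = 0x0022FE98
FRAME_SIZE = 0x38       # from SUB ESP,38
BUFFER_OFFSET = 0x1A    # from LEA EAX,DWORD PTR SS:[EBP-1A]

# Toy stack memory starting at ESP, large enough to hold the whole input.
base = EBP - FRAME_SIZE
memory = bytearray(0x200)

def read_dword(address: int) -> int:
    """Read a little-endian 32-bit value from the toy memory."""
    return struct.unpack_from("<I", memory, address - base)[0]

# gets() copies the input with no bounds check and appends a NUL terminator.
start = (EBP - BUFFER_OFFSET) - base
memory[start:start + len(payload) + 1] = payload + b"\x00"

# LEAVE is equivalent to MOV ESP,EBP followed by POP EBP.
esp = EBP
ebp = read_dword(esp)
esp += 4

# RET pops the next dword into EIP.
eip = read_dword(esp)
esp += 4

print(hex(ebp), hex(eip))
# prints: 0x73206574 0x6b636174
```

The first value matches the EBP seen in the debugger (the bytes `te s`), and the second is the little-endian form of `tack`.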

I can't easily copy and paste the addresses because they are binary data (not human readable). I'll do local exploitation later; for the time being, I'll focus on remote exploitation. However, if you know of any techniques other than PowerShell stdin writes, Python's subprocess module, and so on for writing binary data to a process's stdin, please let me know.
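For reference, the Python subprocess approach mentioned above might look like the following sketch. The return address `0xDEADBEEF` is a placeholder, and the target is a stand-in child Python process that simply echoes its stdin as hex. In practice you would replace the `target` command with the path to the vulnerable executable. The padding sizes (26 bytes for the buffer, 4 for the saved EBP) follow from the EBP-1A offset seen earlier.

```python
import struct
import subprocess
import sys

# Hypothetical return address, for illustration only.
FAKE_RETURN_ADDRESS = 0xDEADBEEF

# 26 bytes fill the buffer, 4 bytes cover the saved EBP,
# then the address follows in little-endian byte order.
payload = b"A" * 26 + b"B" * 4 + struct.pack("<I", FAKE_RETURN_ADDRESS) + b"\n"

# Stand-in target: a child process that prints whatever it reads from stdin as hex.
target = [sys.executable, "-c", "import sys; print(sys.stdin.buffer.read().hex())"]

# Passing bytes as `input` writes them to the child's stdin unmodified.
result = subprocess.run(target, input=payload, capture_output=True, check=True)
echoed = bytes.fromhex(result.stdout.decode("ascii").strip())

print(echoed == payload)
```

Because `input` is given as bytes, no text encoding is applied, so non-printable address bytes reach the child process as written.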