An Introduction to Buffer Overflow #4: Overwriting the Stack

Gurkirat Singh reveals how to overwrite the stack with buffer overflows and uncovers low-level vulnerabilities, providing detailed debugging insights.

The striking illustration above, bursting with vivid and eye-catching colors, depicts two women with colorful liquid streaming from their eyes. This mesmerizing piece is crafted by Andrew Archer, a skilled illustrator from Sydney, Australia.

Hello World! You should now have a basic understanding of how calling a function and returning from it work on the stack at a low level; if not, you can revise from here. Let's go a step further and overwrite the stack.

This post explains how a stack buffer overflow occurs and how it affects the return address. I recommend reading this post first, and then the next one, where I will go over buffer overflow basics and show you how easy it is to overflow a buffer (when stack canaries are disabled, of course).

I will use the following text throughout this post to analyse and overwrite the buffer in the program.

The easiest way to overwrite stack is to keep writing text, even though it doesn't make any sense like in this case. See I am still writing it.

Before we begin debugging, let me demonstrate how the overflow and overwriting of the return address will appear to a non-technical user who unintentionally triggered it while running the vulnerable program.

How does Overflow Look to a User?

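In this program, I pasted the text at the input prompt. The program reads input with the gets() function and then prints it on the screen with the printf() function. Both functions come from the C standard library (gets() has since been removed from the C11 and C++14 standards because it cannot limit how much it reads).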

Demo of buffer overflow in the normally executing program.

If you convert the hex value to characters and reverse the byte order (little endian), you will see the substring tack. Does it ring a bell? If not, copy the input text, paste it into a notepad, and search for it. The outcome will be as follows.

Confirm the stack overwrite
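The hex-to-text conversion described above can be sketched in a few lines of Python. This is only an illustration: the value 0x6B636174 is an assumption, derived from reading the ASCII bytes of "tack" as a little-endian 32-bit integer, so substitute whatever value your own crash dialog reports.

```python
import struct

# The sample text used throughout the post.
text = (
    "The easiest way to overwrite stack is to keep writing text, "
    "even though it doesn't make any sense like in this case. "
    "See I am still writing it."
)

# Assumed crash value: the ASCII bytes of "tack" read as a little-endian
# 32-bit integer. Replace it with the value from your own crash dialog.
crash_eip = 0x6B636174

# Packing as little-endian ("<I") recovers the bytes in memory order,
# which is the "reverse it" step done by hand in the post.
fragment = struct.pack("<I", crash_eip).decode("ascii")

# Position of that fragment inside the input text.
offset = text.find(fragment)

print(fragment)  # tack
print(offset)    # 30
```

The offset of 30 means the 31st character of the input is the first byte that lands in the return address.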

Debugging the Function Calls

As usual, the main function is the starting point of the program, and it calls function1(). There is nothing juicy going on in main, so we can focus on function1() here.

As you can see, the debugger has helpfully printed the function names alongside their call instructions, so let's add a breakpoint on the gets function. I'm ignoring the printf function because it only prints the data. The gets function, however, reads the data from stdin and saves it to the buffer allocated on the stack.

Target function to analyze

Why do I think the Buffer is on the Stack?

As I was studying, a question popped into my head. If you had the same question, give yourself a pat on the back. You're a genuine learner. Now, let's dive into the explanation. Examine the instruction right before the first CALL instruction within the function.

00401534  |. 8D45 E6        LEA EAX,DWORD PTR SS:[EBP-1A]

Here, an address is calculated by subtracting 0x1A bytes from the base pointer EBP (0x0022FE98). The result, 0x0022FE7E, is stored in the EAX register and lies within the current stack area. LEA only computes the address and does not read memory; the SS prefix shows that the address is relative to the stack segment. In the next instruction, this address is written to the memory location ESP points to, which makes it the argument passed to gets().

00401537  |. 890424         MOV DWORD PTR SS:[ESP],EAX

It means that the buffer was not malloc'ed; instead, a fixed-size array was created on the stack.

Destination address for gets() is stored on the stack
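To make the difference between the two instructions concrete, here is a toy model in Python. It is a sketch, not an emulator: the `memory` dictionary is a hypothetical stand-in for the stack, and the ESP value is the one quoted later in the post for the top of the frame.

```python
# Register values taken from the debugger session described in the post.
EBP = 0x0022FE98
ESP = 0x0022FE60  # top of the current stack frame

# Toy memory model (hypothetical): address -> 32-bit value.
memory = {}

# LEA EAX,DWORD PTR SS:[EBP-1A]
# Pure address arithmetic; nothing is read from memory.
EAX = EBP - 0x1A

# MOV DWORD PTR SS:[ESP],EAX
# Writes EAX into the stack slot ESP points at; ESP itself is unchanged.
memory[ESP] = EAX

print(hex(EAX))          # 0x22fe7e
print(hex(memory[ESP]))  # 0x22fe7e
print(hex(ESP))          # 0x22fe60
```

The slot at [ESP] is where the called function expects its first argument, so gets() receives 0x0022FE7E as the place to write to.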

This calculated address, 0x0022FE7E, falls between the stack pointer ESP and the base pointer EBP, so it lies inside the current stack frame. The MOV instruction stores it as a 32-bit value (DWORD PTR) in the slot ESP points to. The frame itself was reserved by an earlier instruction, which subtracts 0x38 bytes from ESP.

00401523  |. 83EC 38        SUB ESP,38

When gets() writes your input, it does not begin at the top of the frame, 0x0022FE60, but at 0x0022FE7E, which falls inside the 4-byte row starting at 0x0022FE7C in the debugger's stack view. This means the input starts a bit further into the frame.
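The frame layout can be checked with a little arithmetic. The sketch below assumes the usual prologue in which ESP equals EBP just before SUB ESP,38, and the usual frame layout with the saved EBP at [EBP] and the return address at [EBP+4]; the variable names are mine, not the debugger's.

```python
EBP = 0x0022FE98

# SUB ESP,38 (assumes ESP == EBP immediately before this instruction).
ESP = EBP - 0x38

# LEA EAX,[EBP-1A] gives the start of the buffer.
buffer_start = EBP - 0x1A

# The buffer lies inside the frame, between ESP and EBP.
inside_frame = ESP <= buffer_start < EBP

# A 4-byte-aligned stack view shows the buffer start inside this row.
row = buffer_start & ~0x3

# Number of input bytes needed to reach each saved value.
bytes_to_saved_ebp = EBP - buffer_start             # saved EBP sits at [EBP]
bytes_to_return_address = bytes_to_saved_ebp + 4    # return address at [EBP+4]

print(hex(ESP), hex(buffer_start), hex(row))
# 0x22fe60 0x22fe7e 0x22fe7c
print(inside_frame, bytes_to_saved_ebp, bytes_to_return_address)
# True 26 30
```

So any input longer than 26 bytes starts overwriting the saved EBP, and bytes 30 to 33 replace the return address.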

I hope you get the idea by now, so let's continue debugging the function. I am currently at the point where the gets function is being called. Next, I will show the program's state when EIP (the instruction pointer) points at the LEAVE instruction.

State of the program when it's about to LEAVE and RET

When I pressed the debugger's Step Next button, the LEAVE instruction popped the value 0x73206574 into EBP. Normally this restores the caller's base pointer, but the saved value had been overwritten. The RET instruction then executed and popped the next value on the stack, which had also been overwritten, into EIP. That value is the bytes of tack, an address that either does not exist in memory or that the process cannot access. As a result, the program crashes with an "Access Violation" error message.

Result of continuing on invalid EIP value
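The whole sequence can be replayed on a toy stack frame in Python. This is a simplified model under stated assumptions: the frame only covers the buffer, the saved EBP and the return address, and the two original values written into it are hypothetical placeholders, not values from the real program.

```python
import struct

text = (
    b"The easiest way to overwrite stack is to keep writing text, "
    b"even though it doesn't make any sense like in this case. "
    b"See I am still writing it."
)

BUF_SIZE = 0x1A  # the buffer starts at EBP-0x1A, so 26 bytes precede saved EBP

# Toy frame: [buffer (26 bytes)][saved EBP (4 bytes)][return address (4 bytes)]
frame = bytearray(BUF_SIZE + 8)

# Hypothetical original values, only here to show that they get replaced.
struct.pack_into("<II", frame, BUF_SIZE, 0x0022FEB8, 0x00401560)

# gets() copies input with no bounds check. The real function would keep
# writing past this toy frame; the model just stops at its own end.
n = min(len(text), len(frame))
frame[:n] = text[:n]

# LEAVE pops the first value into EBP, RET pops the second into EIP.
new_ebp, new_eip = struct.unpack_from("<II", frame, BUF_SIZE)

print(hex(new_ebp))                   # 0x73206574
print(hex(new_eip))                   # 0x6b636174
print(bytes(frame[BUF_SIZE + 4:]))    # b'tack'
```

The first value matches the 0x73206574 seen in EBP after LEAVE, and the second is the little-endian reading of "tack" that ends up in EIP.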

I can't easily copy and paste real addresses into the program because they are raw binary bytes (not human-readable text). I'll do local exploitation later, but for the time being, I'll focus on remote exploitation. However, if you are aware of any techniques other than a PowerShell stdin write, a Python subprocess, and so on for writing binary data to a process's stdin, please let me know.
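For reference, the Python subprocess approach mentioned above might look like the sketch below. Everything target-specific here is an assumption: the child process is just a second Python interpreter that echoes its stdin as hex, standing in for the vulnerable executable, and 0xDEADBEEF is a placeholder rather than a real address.

```python
import struct
import subprocess
import sys

# Layout from the debugging session: 26 bytes of buffer, 4 bytes of saved
# EBP, then the return address.
filler = b"A" * 0x1A
fake_ebp = b"B" * 4
return_address = struct.pack("<I", 0xDEADBEEF)  # placeholder value

# gets() stops at a newline, so the payload itself must not contain 0x0A.
payload = filler + fake_ebp + return_address
assert b"\n" not in payload

# Stand-in child (hypothetical): echoes the raw bytes it received as hex.
# Replace this command with the path to the program under test.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.buffer.read().hex())",
]

# input= accepts bytes, so non-printable values reach stdin unchanged.
result = subprocess.run(child, input=payload + b"\n", capture_output=True, check=True)
received = bytes.fromhex(result.stdout.decode("ascii"))

print(received[-5:])  # b'\xef\xbe\xad\xde\n'
```

Passing bytes through `input=` avoids the copy-and-paste problem entirely, because no terminal or clipboard has to represent the non-printable bytes.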
