Writing your own

At some point you'll have to write your own shellcode, either because the generated ones are too large or because you need specific instructions to exploit the binary.

We'll start by analyzing the shellcode generated with shellcraft.sh().

Shellcraft.sh

Generate a shellcode with shellcraft.sh(), and focus on the first line of the output, the execve comment.

Python 3.11.1 (main, Dec 31 2022, 10:23:59) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pwn import *
>>> print(shellcraft.sh())
/* execve(path='/bin///sh', argv=['sh'], envp=0) */
    /* push b'/bin///sh\x00' */
    push 0x68
    push 0x732f2f2f
    push 0x6e69622f
    mov ebx, esp
    /* push argument array ['sh\x00'] */
    /* push 'sh\x00\x00' */
    push 0x1010101
    xor dword ptr [esp], 0x1016972
    xor ecx, ecx
    push ecx /* null terminate */
    push 4
    pop ecx
    add ecx, esp
    push ecx /* 'sh\x00' */
    mov ecx, esp
    xor edx, edx
    /* call execve() */
    push SYS_execve /* 0xb */
    pop eax
    int 0x80

Thankfully, there are some comments within the generated shellcode.

For now, let's just focus on the first comment. As stated by that comment on line 5, the main goal of this shellcode is to call execve(path='/bin///sh', argv=['sh'], envp=0). The kernel ignores the repeated slashes, so the path is equivalent to /bin/sh.

If you check man execve, you'll see that execve is used to execute a program, with the following parameters:

path to the program to execute
array of pointers to strings, passed as arguments to the executed program
array of pointers to strings, passed as environment variables of the new program

With the current arguments, the call starts a new shell, /bin/sh, whose first argument is sh (the filename of the executable).

This is the convention when starting a new program: the first argument is the filename of the executable.
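To see the three parameters in action without writing any assembly, here is a minimal Python sketch, assuming a Linux system where /bin/sh exists. Python's os.execve takes the same three arguments as the C function. The helper name run_with_execve and the `exit 7` command are only illustrative, and the call is made in a forked child so that the interpreter itself is not replaced.

```python
import os


def run_with_execve(path, argv, envp):
    """Fork, call execve(path, argv, envp) in the child, return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: on success execve never returns, the process image is replaced.
        try:
            os.execve(path, argv, envp)
        finally:
            # Only reached if execve failed (for example, path does not exist).
            os._exit(127)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    # argv[0] is "sh" by convention; the remaining entries are ordinary arguments.
    print(run_with_execve("/bin/sh", ["sh", "-c", "exit 7"], {}))
```

The empty dictionary plays the role of envp: the new program starts with no environment variables, much like envp=0 in the generated shellcode.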

But since we:

may not have access to execve()'s address in the program
may not have access to tools to execute arbitrary functions anyway

we can just use a syscall instead, because in the end these functions are just wrappers around syscalls.

Syscalls

A syscall, short for system call, is a service that a program requests from the kernel and that the kernel executes on its behalf.

If you have ever written a hello world program in assembly, you should know how they work.

Basically, on 32-bit x86 Linux, you put the arguments into registers and the syscall number into eax, then you transfer control to the kernel with the int 0x80 instruction.

This looks like:

section .data
msg db  'Hello, world!',0xa     ; our string
len equ $ - msg                 ; length of our string

section .text
    global _start

; linker puts the entry point here:
_start:

; Write the string to stdout:
    mov edx,len ; message length
    mov ecx,msg ; message to write
    mov ebx,1   ; file descriptor (stdout)
    mov eax,4   ; system call number (sys_write)
    int 0x80    ; call kernel

; Exit cleanly instead of running past the end of the code:
    mov ebx,0   ; exit status
    mov eax,1   ; system call number (sys_exit)
    int 0x80    ; call kernel

How do you know which register is used for which parameter? By referring to a syscall table.

https://chromium.googlesource.com/chromiumos/docs/+/master/constants/syscalls.md#x86-32_bit

For sys_write, it's:

NR	syscall_name	eax	arg0 (ebx)	arg1 (ecx)	arg2 (edx)	arg3 (esi)	arg4 (edi)
4	write	0x04	unsigned int fd	const char *buf	size_t count

This matches the signature given in man 2 write.
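The way a table row turns into register assignments can be sketched in a few lines of Python. This is only an illustration of the int 0x80 convention: the dictionary is a tiny hand-copied excerpt of the table, and the name syscall_register_plan is hypothetical.

```python
# Argument registers for the 32-bit x86 Linux int 0x80 convention, in order.
I386_ARG_REGISTERS = ("ebx", "ecx", "edx", "esi", "edi", "ebp")

# Hand-copied excerpt of the syscall table (name -> NR); not the full table.
I386_SYSCALL_NUMBERS = {"exit": 1, "read": 3, "write": 4}


def syscall_register_plan(name, *args):
    """Return a dict telling which value goes into which register for a syscall."""
    if len(args) > len(I386_ARG_REGISTERS):
        raise ValueError("too many arguments for the int 0x80 convention")
    plan = {"eax": I386_SYSCALL_NUMBERS[name]}  # the syscall number always goes in eax
    plan.update(zip(I386_ARG_REGISTERS, args))  # arg0 -> ebx, arg1 -> ecx, ...
    return plan


if __name__ == "__main__":
    # Same assignments as the hello world program: fd in ebx, buffer in ecx, length in edx.
    print(syscall_register_plan("write", 1, "msg", "len"))
```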

question

Can you fill in these tables for execve()?

NR	syscall_name	eax	arg0 (ebx)	arg1 (ecx)	arg2 (edx)	arg3 (esi)	arg4 (edi)
11	execve	?	?	?	?

argument	register
filename	?
argv	?
envp	?

Answer

If you look up NR 11 on the website, you'll see:

NR	syscall_name	eax	arg0 (ebx)	arg1 (ecx)	arg2 (edx)	arg3 (esi)	arg4 (edi)
11	execve	0x0b	const char *filename	const char *const *argv	const char *const *envp

argument	register
filename	ebx
argv	ecx
envp	edx

Your shellcode

Let's write a shellcode that opens a shell. As said earlier, argv[0] (in the second argument of execve()) should be equal to the filename of the binary.

However, /bin/sh is special, and does not check argv[0], so we can just call execve('/bin/sh', 0, 0). Reason why here

Steps:

push the null-terminated string '/bin/sh' on the stack
save the address of the string in a register
set the other arguments (argv and envp) to 0, using the registers from the syscall table
set the syscall number from the syscall table
finish by executing int 0x80

You can check that your shellcode works using the exercise leak.

caution

Try to avoid null bytes: some input functions (strcpy, for example) stop at the first null byte, which would truncate your shellcode. You can use hexdump to print the hexadecimal dump of your shellcode and check for null bytes.

>>> shellcode = shellcraft.sh()
>>> print(hexdump(asm(shellcode)))
00000000  6a 68 68 2f  2f 2f 73 68  2f 62 69 6e  89 e3 68 01  │jhh/│//sh│/bin│··h·│
00000010  01 01 01 81  34 24 72 69  01 01 31 c9  51 6a 04 59  │····│4$ri│··1·│Qj·Y│
00000020  01 e1 51 89  e1 31 d2 6a  0b 58 cd 80               │··Q·│·1·j│·X··│
0000002c
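Checking a dump by eye gets tedious, so here is a small sketch in plain Python that reports the offsets of null bytes. The bytes are copied from the hexdump above, and the helper name null_byte_offsets is hypothetical. The last lines also show how the generated shellcode gets 'sh\x00\x00' onto the stack without encoding any null byte: it pushes 0x1010101 and XORs it in place with 0x1016972.

```python
# Bytes copied from the hexdump of asm(shellcraft.sh()) shown above.
SHELLCRAFT_SH = bytes.fromhex(
    "6a68682f2f2f73682f62696e89e36801"
    "01010181342472690101" "31c9516a0459"
    "01e15189e131d26a0b58cd80"
)


def null_byte_offsets(code):
    """Return the offsets of every 0x00 byte in the given machine code."""
    return [offset for offset, byte in enumerate(code) if byte == 0]


if __name__ == "__main__":
    print(len(SHELLCRAFT_SH), null_byte_offsets(SHELLCRAFT_SH))

    # "mov eax, 0xb" assembles to b8 0b 00 00 00: a 32-bit immediate drags in null bytes.
    print(null_byte_offsets(bytes.fromhex("b80b000000")))

    # The XOR trick from the generated code: neither constant contains a null byte,
    # but their XOR is the little-endian encoding of 'sh\x00\x00'.
    print((0x1010101 ^ 0x1016972).to_bytes(4, "little"))
```

This is why the generated code prefers push 0xb followed by pop eax over a plain mov with a 32-bit immediate.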

Hints 🗺

Hint 1

The string must be pushed in little-endian. Check this page if you forgot.
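As a reminder of what little-endian means in practice, the following Python sketch recomputes the three values that shellcraft.sh() pushes for '/bin///sh'. The helper name push_immediates is hypothetical, and it assumes 4-byte pushes as in the 32-bit code above.

```python
import struct


def push_immediates(data):
    """Return the 32-bit values to push, in push order, so that data ends up at esp."""
    # Pad with null bytes to a multiple of 4. If len(data) is already a multiple
    # of 4, no terminator is added, so include one yourself when you need it.
    padded = data + b"\x00" * (-len(data) % 4)
    # Read each 4-byte chunk as a little-endian unsigned integer.
    words = [struct.unpack("<I", padded[i:i + 4])[0] for i in range(0, len(padded), 4)]
    # The stack grows downward, so the last chunk of the string is pushed first.
    return list(reversed(words))


if __name__ == "__main__":
    for value in push_immediates(b"/bin///sh"):
        print(f"push {value:#x}")
```

The printed values can be compared with the first three push instructions of the shellcraft.sh() output.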

Hint 2

How do you set a register to 0 without using mov reg, 0?

Hint 3

How do you retrieve the address of the string, which is at the top of the stack?

If you got your shell, congrats! 🎉 Otherwise, keep trying, keep debugging!
