Tuesday, April 28, 2020

PlaidCTF 2020: Back to the Future

Back to the Future, pwn 400 (valis)

While you can find all our write-ups from PlaidCTF 2020 in the previous post, we decided this one deserves it's own spotlight.

Ladies and gentlemen, I present to you valis's write-up for the Back to the Future task from PlaidCTF 2020.

I'm actually old enough to have used Netscape (with modem) back in the day, so this challenge was right up my alley.
The first step was to try to find sources on the Internet. It wasn't easy, but eventually I've found this github repo:

Those sources were for version 4.x, and our target was 0.96 Beta. 4.x is a huge application with html editor, mail client, JavaScript support, etc., while 0.96 barely supports basic HTML with images, so it was far from ideal, but it was still helpful, especially some core libraries were pretty similar on both versions.

Next step was try start reversing and I've soon encountered another issue:
./back_to_the_future: Linux/i386 demand-paged executable (ZMAGIC), stripped

The binary was in an ancient a.out format (zmagic variant).
I've spent some time trying to get it to run in my environment, but eventually gave up - even after adding a.out support to the kernel the binary was segfaulting at the very beginning.
I've decided to do everything at the remote server, giving up the luxury of having a debugger.

Then it was time to reverse the binary and find some vulnerabilities.
I was able to identify some libc functions by comparing strings in the binary with those in sources.
My focus was on risky string functions like strcpy, strcat or scanf (especially with %s).
Finally I've found an sscanf call that looked vulnerable - it was used to parse a date string into components (year, month, etc) with the following format string: "%d %s %d %d:%d:%d".
The ‘%s’ part (month) was stored in a static stack buffer and there was no length checking at all.
Now I had to find all references to this function and figure out how to trigger it.
Luckily, I was able to find this code in the sources - the function was called NET_ParseDate() and belonged to the core libnet library.

The code was pretty similar, but 1998 version had a length check that was missing in our binary:

        if(255 < XP_STRLEN(ip))

With the help of the sources it was trivial to find how to trigger this code - NET_ParseDate() was used to parse several HTTP headers. I've used Expires in my exploit.

It was time to confirm my suspicions - I wrote a python script to act as a simple HTTP server that will send 300 "A" letters in the Expires header and submitted the URL to the "time machine".

It worked! Connection closed immediately, clearly a different behaviour compared to a response with a standard short header.

The hard part seemed to be over, but there were still some issues:

  1. Since I wasn't able to run the binary locally, I had no idea how the memory was mapped and what protections were in place. Is there any ASLR?
    Is it all RWX?
  2. sscanf stops reading at whitespace, which means I had to avoid several "bad" characters in my payload (most importantly, no null bytes).

Disassembler showed that the code segment starts at a virtual address of 0, but is it true? I've searched the binary for EBFE sequence (jmp .) and used its address to overwrite EIP.
It hung for 10 seconds instead of closing the connection immediately - great, we now have code execution!

But how to get shell? There is no way to interact with stdin, so executing /bin/sh is not enough, we need to be able to run arbitrary commands.
The main issue was lack of any attacker-controlled data at a known address - we don't know where the stack is in memory.

However, looking at libc relative calls it seems that libc is mapped at 0x60000000. I've tested this theory with the help of another EBFE sequence, this time from libc. Another hang - libc found!

Now we can try to do a ROP with libc calls - but only those that passed the "bad character" test.

One easy way to go would be to call read(), to get the next stage from the open socket and place it at a known location - then we could pivot our stack there, or just use it for execve() arguments.

Unfortunately, our socket is at fd 4, which means 3 null bytes that will break our ROP.

One function that was safe to call was strcpy() (provided that both dst and src addresses had no "bad bytes").

I've decided to use strcpy() as a primitive to write arbitrary content by copying sequences of bytes from libc to our destination.

For example, to write "/bin/sleep\x00" to a destination 0x01020304 we need to execute following calls:

strcpy(0x01020304, "/bin/")
strcpy(0x01020304+5, "sl") // There was no "sleep" string in libc
strcpy(0x01020304+7, "ee")
strcpy(0x01020304+9, "p")

In the worst case scenario we need one call per byte, but often we are lucky enough to find longer sequences.
I wrote an encoder to automate this process. Fixing bugs without any debugging feedback except for "hang or crash" was hard, but finally it worked.

My first attempt was to call system() from libc, but it didn't work for some reason (no debugger, so we'll never know).
Then I tried to write to the code segment and it turned out that this memory is mapped RWX, so I just placed my shellcode at a known location and jumped there at the end of the ROP.

Standard shellcraft.i386.linux.dupsh(4) shellcode was enough to get me shell:

[+] Waiting for connections on Got connection from on port 34766
[*] GET / HTTP/1.0
    User-Agent: Mozilla/0.96 Beta (X11; Linux 5.3.0-46-generic x86_64)
    Accept: */*
    Accept: image/gif
    Accept: image/x-xbitmap
    Accept: image/jpeg

[*] Switching to interactive mode
sh: turning off NDELAY mode
$ /readflag


No comments:

Post a Comment