Monday, February 10, 2014

Olympic CTF 2014 - zbin (RE 500)

The zbin task was a simple reverse-engineering/crackme challenge on the Olympic CTF Sochi 2014, which was organized by the MSLC. We've used the word "simple" as the application itself was indeed simple (from our narrow point of view); what made the task worth 500 points was the unusual platform it run on - 64-bit GNU/Linux (SuSE) running on an IBM's s390 CPU. On behalf of the Dragon Sector this task was solved by gynvael with aid from jagger and mak.

The first step was to be able to run and disassemble the binary - the last part turned out to be problematic as the usual tool of our trade - IDA - doesn't support the s390. We ended up creating the following setup:

  • qemu-s390x + dynamic libraries extracted from SuSE s390x RPMs - this was used to run zbin on our local machines. We also tried to use s390x gdb this way to debug the app, but it turned out that some required syscall (ptrace probably) was not supported by qemu-s390x and it just didn't work (we also tried remote connecting s390x gdb to qemu's gdb server, but it turned out not to work at all either). Still, qemu has some tracing capabilities and we did play a little with these.
  • Debian on s390x emulator - we used this to solve zpwn and re-used it for gdb/ltrace/objdump for this task as well; you can read more about this setup in our zpwn write-up: olympic-ctf-2014-zpwn-200.
Surprisingly, we also used IDA, but only to get a general view of the binary and C++ name demangling.

Recon phase

We started with the basics - strings and objdump for disassembly. In the string dump we've spotted two interesting things:

  • zlib's inflateInit/inflate/inflateEnd - we followed up by looked for high-entropy area and then for a zlib/DEFLATE magic, and we did find it at 0x30F0:
    After extracting it and inflating we got a 7MB English wordlist with 670204 entries.
  • "Correct! Use MD5 now." - a hint that the final flag will be the MD5 of whatever is the solution to this crackme. Given the wordlist above, we guessed it would probably have to be an MD5 of a word from that dictionary.

The dead-listing, SIGABRT and the hash

We then focused on the disassembler output itself. Since it was kinda hard to read we decided to take some time and create a Python script that would append a description of every s390 assembly mnemonic in the same line as a given instruction - in the long run this saved us quite a lot of time looking up all the s390 weirdness in the manuals. The final disassembly form we've worked:

After a brief analysis of the code (which did take some time since we were not familiar with the s390's huge instruction set) we were able to pinpoint main(), the decompression of the wordlist and checking if enough parameters were provided (+feeding the first one into strtol). And here is where we got lucky - we decided to run the application with some numbers (123456789) and look for them under the debugger, but before we ever got to that we spotted a SIGABRT being raised:

And then it "clicked" - we partly-deduced and partly-guessed that we need to provide the index of the word in the dictionary and then there will be some-kind-of-a-hash-check, and, if it passes, the message "Correct! Use MD5 now." will be displayed. So we put aside other ideas and began looking for the hash check (going backwards from the string reference).

We found the hashing function quite quickly and it turned out to be really short. We also found the expected value - 619767641. We decided to just calculate the hash (we ported the hashing function to Python) of every word in the dictionary and see if there are any matches vs the expected value. The code that does this:

import md5
d = open("oooo", "rb").read().split("\n")

def fix(x):
  return x & 0xffffffff

def calc(w):
  r3 = 0
  r5 = 0
  for ch in w:
    r2 = ord(ch)
    r2 += r3
    r2 = fix(r2)
    r1 = r2
    r1 = r1 << 10 # or 10
    r1 += r2
    r1 = fix(r1)

    r3 = r1
    r3 = r3 >> 6
    r3 ^= r1
    r5 += 1

  r1 = r3
  r1 = r1 << 3
  r1 += r3
  r1 = fix(r1)
  r3 = r1
  r3 = r3 >> 11
  r3 ^= r1
  r2 = r3
  r2 = r2 << 15
  r2 += r3
  r2 = fix(r2)
  return r2

i = 0
for w in d:
  w = w.strip()
  h = calc(w)
  # Not sure it the constant is hex or dec lol.
  if h == 619767641 or h == 0x619767641:
    print i, w, h, md5.md5(w).hexdigest()
  i += 1

Shortly after we run this we found exactly one match (format: index word hash md5_of_word):

592405 thymicolymphatic 619767641 6b30ce0743be6b6530ecbdb8c6c414bd

And, after adding CTF{...}, it turned out to be the correct flag.

To sum up, as one can see we didn't have to analyze the task too deeply - did in fact skipped 90% of analysis of its logic - this was mostly due to the lucky verbose SIGABRT. All in all it was a really fun task on an interesting and unusual platform.


  1. great and congratz for the CTF, just a question : in the 2nd picture it's debugger or what ?

  2. @hamdi mechergui
    Thanks ;)
    It's just gvim with GNU assembly syntax coloring (the output viewed was taken from objdump as described).

  3. Great write-up, thank you!

    I wonder if you use some tool which automizes the process of searching high-entropy chunks in files?

  4. @cxielamiko
    I use a simple tool we developed with j00ru a long time ago - (it's open source)