Friday 4 November 2016

CTF Writeup - Flare-On 2016 - 08: CHIMERA


  • Name - CHIMERA
  • Category - Reverse Engineering
  • Points - 8
  • Description - n/a
  • Binary - Download here

This challenge was my personal favourite out of all 10 challenges. It's not particularly hard once you figure out what the trick is but the twist is what makes it interesting. Let's run the binary with some random input:


Command Prompt
C:\>CHIMERA.EXE This one's for the geezers. Enter the password> who_are_you_calling_a_geezer You have fail. Please try harder. C:\>



Opening the binary in IDA shows a very simple function which reminds me of beginner-level crackmes:


From the image above we can immediately deduce that our input has to be of length 0x1A; anything beyond that will not be considered. The function operates on each byte of our input individually, adding the value of the counter to it, XOR'ing it with 0x7A and compares it with a buffer. Let's implement the inverse of this function to get the expected input.


    import sys

    byte_4021D2 = ["0E","13","11","0C","5E","14","03","5D","06","0B",
                   "15","51","F9","05","07","07","0D","4B","F8","0E",
                   "FD","F2","F7","FC","F0","07"]

    for x in range(len(byte_4021D2)):
        operation_1 = int(byte_4021D2[x],16) ^ 0x7A
        operation_2 = operation_1 - x
        sys.stdout.write(chr(operation_2))

Inputting the output result,"this is the wrong password", we get:


Command Prompt
C:\>CHIMERA.EXE This one's for the geezers. Enter the password> this is the wrong password You have fail. Please try harder. C:\>


But why !!?? If we take a look at the image above again, there's a 2-line basic block with a compare: CMP [ebp+var_10], 0x9F5, which succeeds if we give it "this is the wrong password" and hence outputs the failure message. Can we get the binary to succeed on both computations, the byte-by-byte comparison and the final comparison? The answer is NO.

The twist is ........ there's another way of running the binary! The following are a few hints that indicate this:
  • The string "This program cannot not be run in DOS mode." (Notice the double negation)
  • The name of the binary, CHIMERA.EXE, is an 8.3 filename
  • IDA complains that the import segment is broken
  • The .text section is RWX

The binary is in fact a 16-bit packed executable. The only way I found to execute and debug this binary is to use the debug version of DOSBox. This can be downloaded from here. Run the installer to get the standard version of DOSBox and then place dosbox-74-debug.exe in the installation folder and run it. This should open 2 windows, the standard console and the debug window:



The following are a few DOSBox debugger shortcuts and commands that proved to be useful to navigate the cluttered interface:
  • F5 - Run
  • F10/F11 - step over / trace into
  • Up/Down - Move code view cursor
  • Page Up/Down - Scroll data view
  • Home/End - Scroll log messages
  • BP [segment]:[offset] - Set breakpoint
  • BPDEL * - Delete all breakpoints
  • Home/End - Scroll log messages
  • SR [reg] [value] - Set register value
  • C / D [segment]:[offset] - Set code / data view address

To start debugging CHIMERA.EXE mount the C drive with 'mount C C:\', navigate to the folder and run 'debug CHIMERA.EXE'. The execution now shifts to the debugger and we can start stepping into the binary. The first hurdle we get is the following:


    020E:07D7  mov  ax,2A00
    020E:07DA  int  21
    020E:07DC  sub  cx,07C6
    020E:07E0  jg   000008B0 ($+cc)        (down)


The instruction int 21 is an interrupt which performs actions depending on the contents of AH. A list of these can found here. The ASM above grabs the system date, puts the year in CX, subtracts 0x7C6 (1990) from it, and jumps if it's greater. As we don't want it to jump we can issue 'sr sf 1' to change the sign flag and not take the jump.

After the simple check, we get the following algorithm which operates on the input string in reverse order:


    020E:0802  mov  dl,97             ; Loads 0x97, this is the first input
    020E:0804  mov  ah,[075F]         ; AH = length of input string
    020E:0808  cmp  cl,ah             ; CL = Counter
    020E:080A  je   0000084E ($+42)   ; jmp if all bytes are processed
    020E:080C  jne  0000080F ($+1)    ; jmp if NOT all bytes are processed

    020E:080F  mov  al,ah
    020E:0811  dec  al
    020E:0813  sub  al,cl
    020E:0815  test cl,cl
    020E:0817  je   0000082D ($+14)   ; jmp if 1st iteration of loop
    020E:0819  je   0000081E ($+3)    ; 
    020E:081B  jne  0000081E ($+1)    ; jmp if NOT 1st iteration of loop

    020E:081E  mov  bx,0760
    020E:0821  movzx dx,sp
    020E:0824  add  bx,dx
    020E:0826  jne  00000829 ($+1)

    020E:0829  sub  bx,cx             
    020E:082B  mov  dl,[bx]          ; loads char from input
    020E:082D  rol  dl,03            ; rol3 on char
    020E:0830  movzx bx,dx
    020E:0833  je   00000838 ($+3)
    020E:0835  jne  00000838 ($+1)

    020E:0838  movzx bx,[bx+0461]    ; performs mapping from constant byte array
    020E:083D  mov  dl,[bx+0461]     ; performs another mapping from constant byte array
    020E:0841  movzx bx,ax
    020E:0844  xor  [bx+0760],dl     ; XOR result with next input char
    020E:0848  inc  cx
    020E:0849  jne  0000084C ($+1)


    020E:084C  jmp  short 00000808 ($-46)  ; jumps to 020E:0808



The algorithm is pretty straight forward: loads 0x97, rotates it, translates it twice from a fixed buffer and 1) XOR's it with the next char; 2) stores it as input for the next rotation. In code this looks like this:


    translate_map = ["FF","15","74","20","40","00","89","EC","5D","C3","42","46",
                     "C0","63","86","2A","AB","08","BF","8C","4C","25","19","31",
                     
                     [ .. snip .. ]

                     "AE","CA","C4","77","0C","4E","D4","D0","C8","E1","B8","F9",
                     "26","90","81","34"]

    rol = lambda val, r_bits, max_bits: \
        (val << r_bits%max_bits) & (2**max_bits-1) | \
        ((val & (2**max_bits-1)) >> (max_bits-(r_bits%max_bits)))

 
    def rol_hex(hex_string):
        return rol(int(hex_string,16),3, 8)

    def double_translate(x):
        return translate_map[int(translate_map[x],16)]
 
    input = "01234567890123456789012345"

    input_list = [ord(x) for x in input]
    input_list.append(0x97)
    input_list.reverse()

    for x in range(1, len(input_list)):
        x_rol3 = rol(input_list[x-1], 3, 8)
        input_list[x] = int(double_translate(x_rol3),16) ^ input_list[x]
  
    input_list.reverse()
    print [hex(x) for x in input_list] 

The resultant byte array is then fed into a second, much simpler algorithm:


    020E:0854  mov  dl,C5            ; Loads 0xC6, this is the first input
    020E:0856  cmp  cl,ah            ; AH = length of previous buffer
    020E:0858  je   00000875 ($+1b)  ; never taken
    020E:085A  test cl,cl
    020E:085C  je   00000869 ($+b)   ; jmp if 1st iteration of loop
    020E:085E  jne  00000862 ($+2)   ; jmp if NOT 1st iteration of loop

    020E:0862  mov  bx,cx
    020E:0864  dec  bx
    020E:0865  mov  dl,[bx+0760]     ; loads char from input
    020E:0869  mov  bx,cx
    020E:086B  xor  [bx+0760],dl     ; XOR's it with next char from input
    020E:086F  inc  cx
    020E:0870  jne  00000873 ($+1)         ; always taken
    020E:0872  jmp  short 0000085F ($-15)  
    020E:0873  jmp  short 00000856 ($-1f)  ; jmp back to 020E:0856


This can be easily translated into the following code, which takes the result from the previous algorithm:


    input_list.insert(0, 0xC5)
    for x in range(1, len(input_list)):
        input_list[x] = input_list[x] ^ input_list[x-1]
 
    print [hex(x) for x in input_list[1:-1]]

The output then goes into a loop, consisting of 0x1A rounds, which compares each byte to a fixed buffer. At this point we have the inner workings of both algorithms performed on the input and the expected final buffer. We can therefore invert the process to get the correct input:


    end_buffer = ["38","E1","4A","1B","0C","1A","46","46","0A","96","29","73","73",
                  "A4","69","03","00","1B","A8","F8","B8","24","16","D6","09","CB",
                  "B9","70","00","89","CB","4B"]


    translate_map = ["FF","15","74","20","40","00","89","EC","5D","C3","42","46",
                     "C0","63","86","2A","AB","08","BF","8C","4C","25","19","31",
                     
                     [ .. snip .. ]

                     "AE","CA","C4","77","0C","4E","D4","D0","C8","E1","B8","F9",
                     "26","90","81","34"]

    rol = lambda val, r_bits, max_bits: \
        (val << r_bits%max_bits) & (2**max_bits-1) | \
        ((val & (2**max_bits-1)) >> (max_bits-(r_bits%max_bits)))
 
    def rol_hex(hex_string):
        return rol(int(hex_string,16),3, 8)
 
 
    def double_translate(x):
        return translate_map[int(translate_map[x],16)]

 
    end_buffer_int = [int(x,16) for x in end_buffer]
    end_buffer_int.insert(0, 0xC5)
    end_buffer_int.reverse()

    for x in range(len(end_buffer_int) - 1 ):
        end_buffer_int[x] = end_buffer_int[x] ^ end_buffer_int[x+1]
 
    end_buffer_int.reverse()

    #remove 0xC5
    end_buffer_int = end_buffer_int[1:]

    end_buffer_int.append(0x97)

    for x in range(0, len(end_buffer_int)-1):
        x_rol3 = rol(end_buffer_int[x+1], 3, 8)
        end_buffer_int[x] = int(double_translate(x_rol3),16) ^ end_buffer_int[x]
 
    print ''.join([chr(x) for x in end_buffer_int])


Testing the result:


DOSBox 0.74, Cpu speed: 3000 cylces, Frameskip 0, Program: DOSBOX
C:\>debug CHIMERA.EXE This program cannot not be run in DOS mode. This one's for the geezers. Enter the password> retr0_hack1ng@flare-on.com You have succeed. C:\>

No comments:

Post a Comment