## Tuesday, 15 September 2015

### (N)ASM101 - 0x05 - Reversing Binaries - Part II (you_are_very_good_at_this.exe)

The binary we'll be analysing in this tutorial is none other than Challenge 9 from the Flare-On challenge by Fireeye. For those of you who've never heard of it, it's a CTF-style Reverse Engineering challenge. This year, Fireeye has created 11 challenges each of which contain an e-mail address used to obtain the next binary.

Compared to the previous binary we've tackled, this will be much more involving but hang in there, the greater the obstacle the more glory in overcoming it. The operations used by the program to hide the answer are not particularly complicated but there are various anti-disassembly techniques we need to overcome. Fortunately for us there's only 2 very important ASM instructions we need to cover before we start.

## x86 Instructions

### CMPXCHG

The Compare and Exchange operator compares the accumulator (`AL/AX/EAX`) to the first operand. If equal, the first operand (Destination) is loaded with the second (Source). If not, the accumulator is loaded with first operand (Destination). Programmatically it looks something like this:
```    if (Accumulator == Destination) {
ZF = 1;
Destination = Source;
}
else {
ZF = 0;
Accumulator = Destination;
}
```

Let's look at some concrete examples:
```    cmpxchg ecx, edx        ; if EAX = 0x05, ECX = 0x05 & EDX = 0x04 then, ECX = 0x04
; if EAX = 0x03, ECX = 0x05 & EDX = 0x04 then, EAX = 0x05
cmpxchg cx, dx          ; same as above but for Words
cmpxchg [edi], si       ; same as above but the Destination is a memory region
```

### CMOVZ & CMOVNZ

`CMOVZ` stands for Conditional Move if Zero whereas `CMOVNZ` stands for Conditional Move if Not Zero. These instructions can move a value from memory to a register or from one register to another. Unlike the original `MOV` instruction, the source operand cannot be a memory region and immediate values are not allowed anywhere. Their description is self-explainatory but let's look at a few examples:
```    cmovnz eax, ebx        ; if ZF = 0, EAX =  EBX, else nothing
cmovnz eax, [ebx]      ; if ZF = 0, EAX = [EBX], else nothing
cmovz eax, ebx         ; if ZF = 1, EAX =  EBX, else nothing
cmovz eax, [ebx]       ; if ZF = 1, EAX = [EBX], else nothing
```

Basically a move occurs if the condition is satisfied, else the move is not performed. Many other conditional move statements exist but since we'll only encounter 1 of them in the binary, and also because in my experience they're not that common, I won't be covering them, at least this time.

## The Binary

In this tutorial I'll be using IDA once again. Originally I've solved this challenge using Immunity Debugger since IDA kept moaning and groaning about instructions changing during execution. (You'll see what I meant by that statement). Before that though, let's run the program:

Command Prompt
C:\>you_are_very_good_at_this.exe I have evolved since the first challenge. You have not. Bring it. Enter the password> lets_try_our_luck You are failure C:\>

Wow that's harsh!! The binary accepts a password, validates it in some way, and outputs a failure or success (maybe?) message. Guessing and brute-forcing is not going to get us anywhere so let's load the program in IDA and start debugging.

## Your eyes can deceive you, don't trust them

In the functions window, IDA identifies a very complicated but interesting-looking function, sub_401495:

At first glance it seems that we've identified the core function of the program. Guess what? Forget about the function. The code is never reached! This wasted me quite some time during the challenge. When dealing with self-modifying code, static analysis is no good. Having said that, if there's anyone who's got an idea on how to unroll self-modifying code for static analysis, please step forward. Thanks in advance.

While single-stepping through the program there will be times where the Instruction Pointer points to a section which has been interpreted as data such as the following:

If this happens, press "c" to directly convert that part to code. IDA might prompt you for confirmation; press "YES". We can now see that the section makes much more sense:

As if it weren't bad enough, the binary is full of self-modifying code and "conditional" jumps all over the place. Consider the following:

The last line diverts the flow to another location if the Zero Flag is set, but `XOR EAX, EAX` is always zero, hence `ZF` is always set. Also what's this loc_401A54+2? Why not loc_401A56? This happens when the code pointed to by this instruction has been messed up and/or wasn't interpreted correctly. Jumping to it and pressing "c" should clear things up.

## Finding our way round the maze

A good practice when debugging is to set some well-placed breakpoints. For this binary it's easier said than done since we're not guaranteed that the instructions remain consistent throughout execution. We have no choice but to start from the top. We run the program with the "Suspend on process entry point" option set, which is found under "Debugger Setup". This will force the program to halt on the very first instruction. The start of the program looks like this :

While holding F7 (single-stepping) down, we keep an eye on the stack. The program goes through several transformations before it even reaches the point where it asks us to input the password. After some time we get the following stack setup:

At the top we notice a kernel32_WriteFile system call and a pointer to the data section, .data:aIHaveEvolvedSi, which holds the string "I have evolved since the first challenge. You have not. Bring it.". This is followed by a kernel32_ReadFile call and 41 pointers to .text:00401736. The bottom 31 pointers are not shown in the image. This should look familiar as it's the real starting point. It writes the welcoming message to the console, reads our input and .. hmm .. executes some function 41 times. At this point we're not sure what this function does so let's investigate further by setting a breakpoint at address 0x00401736.

We continue the execution (F9), input a random password in the console and hit enter. The program stops at the breakpoint. This happens everytime we try and continue for 41 times after which the program displays the friendly "You are failure" message. Let's look closely at what happens at each iteration of this mysterious function.

## All that glitters is not gold, it can be diamonds

It's time to start filtering the nonsense and identify the operations that matter; those that actually operate on our input, or are influenced by it. We run the program with different passwords to hopefully elicit some change in the program's execution flow. The following is a list of instructions that are affected by our input in each iteration of the function:

1. [0x00401A9C] mov al, [eax+ecx] - Loads a character from our input into `AL`.
2. [0x0014FDB4] mov ah, [esp+ebx+0B4h] - Loads some byte into `AH`.
3. [0x0014FDB8] xor al, ah - `XOR`s our character with this byte.
4. [0x0014FDB4] mov cl, [esp+ebx+88h] - Loads some byte into `CL`.
5. [0x00401B14] rol al, cl - Performs a `ROL` on `AL`, `CL` times.
6. [0x0014FDB8] cmpxchg bl, dl - Compares and Exchanges `BL` with `DL`, if `AL` = `BL`, then `ZF` = 1, else 0.
7. [0x00401B3D] cmovnz esi, edx - If `ZF` = 1, `ESI` remains 0, else `ESI` = `EDX` = 1.
8. [0x00401B49] xchg eax, esi - If `ESI` was 1, now `EAX` is 1, and same for 0.
9. [0x0014FD88] cmpxchg bl, dl - At this point `BL` = 0 and `DL` = 1 so if `AL` = 0, `BL` becomes 1, else `BL` remains 0.
10. [0x00401B74] mov eax, ebx - `EAX` now holds the result, either a 1 or a 0.

At this point I suggest taking some time to go through a single iteration several times, focusing on the instructions I've listed above. The addresses might not match exactly but they should be close. At the end of the each loop, `EAX` is either incremented or isn't, so which one is the right path? The answer lies at the end of the program:

If `EAX` is 0x29, i.e. 41, i.e. the number of iterations our favourite function executes, then jump to the "You are success" section, else jump to the "You are failure" section. Excellent !! We now know that for each iteration of the function, `EAX` has to be incremented. Let's look again at the 10 instructions listed above and, starting from the bottom, make our way up starting from the fact that `EAX` has to be 1:

1. [0x00401B74] mov eax, ebx - `EAX` has to be 1, so `EBX` has to be 1.
2. [0x0014FD88] cmpxchg bl, dl - `EBX` has to be 1, so `AL` has to be 0.
3. [0x00401B49] xchg eax, esi - `AL` has to be 0, so `ESI` has to be 0.
4. [0x00401B3D] cmovnz esi, edx - `ESI` has to be 0, so `ZF` has to be 1.
5. [0x0014FDB8] cmpxchg bl, dl - `ZF` has to be 1, so `AL` has to be equal to `BL`. (Let `BL` be `X`)
6. [0x00401B14] rol al, cl - This operation must be equal to `BL`, i.e. `ROL AL, CL = X`
7. [0x0014FDB4] mov cl, [esp+ebx+88h] - Let `CL` be `Y`, i.e. `ROL AL, Y = X`
8. [0x0014FDB8] xor al, ah - `ROL (XOR AL, AH), Y = X`
9. [0x0014FDB4] mov ah, [esp+ebx+0B4h] - Let `AH` be Z, i.e. `ROL (XOR AL, Z), Y = X`
10. [0x00401A9C] mov al, [eax+ecx] - Let `AL` be `I` for Input, i.e. `ROL (XOR I, Z), Y = X`

Once again, take your time to digest it slowly and refer to the previous 10 bullet points if needs be.

## Diamonds are forever

So finally we're left with a simple formula that contains 4 unknowns, 3 of which can be gathered by going through each function, and our input. Let's change the subject of the constructed formula to `I` since that's what we need to find:

```
(I ^ Z) ROL Y = X

=>                           I ^ Z = X ROR Y

=>                               I = ( X ROR Y ) ^ Z

```

Now it's a matter of going through each iteration of the function to gather `X`, `Y` and `Z`. Let's go through the first few together. Remember that for words, rotate operators repeat themselves every 16 (0x10) times. For example `ROR AX, 56` = `ROR AX, 6` and `ROR AX, F3` = `ROR AX, 3`. So:

```
+-----------------------+-----------------------------------+-------+
|   X   |   Y   |   Z   |   X ROR Y   |   ( X ROR Y ) ^ Z   |   I   |
+-----------------------+-----------------------------------+-------+
|   C3  |  56   |   46  |     0F      |          49         |   I   |
|-------------------------------------------------------------------|
|   CC  |  F5   |   15  |     66      |          73         |   s   |
|-------------------------------------------------------------------|
|   BA  |  AC   |   F4  |     AB      |          5F         |   _   |
|-------------------------------------------------------------------|
|   4E  |  1B   |   BD  |     C9      |          74         |   t   |
+-------------------------------------------------------------------+
|  ...  |  ...  |  ...  |     ...     |          ...        |  ...  |
+-------------------------------------------------------------------+

```

After what feels like a few years we obtain the e-mail address to advance to the next challenge:

Command Prompt
C:\>you_are_very_good_at_this.exe I have evolved since the first challenge. You have not. Bring it. Enter the password> Is_th1s_3v3n_mai_finul_foarm@flare-on.com You are success C:\>

## Conclusion

I admit that this tutorial is quite a leap from the previous ones. Don't get disheartened though!!

For those of you who would like to take a go at the other challenges, Fireeye has made the binaries available for download here. Personally I quite enjoyed them and even though they can be quite frustrating at times, I've definitely learnt a lot .. それでは、また

## Saturday, 12 September 2015

### (N)ASM101 - 0x04 - Reversing Binaries - Part I (EasyCrack.exe)

In this tutorial we'll temporarily deviate from creating our own ASM programs and try to crack a binary which was created by someone else. Writing assembly is one skill, understanding it is another. The binary we'll be analysing can be found here. The keygen was created by Kwazy Webbit from the Reverse Engineering Team.

## x86 Instructions

As with every tutorial we'll first cover the new instructions we'll be encountering during this reverse engineering task.

### AND

The `AND` operator performs a bitwise AND of the operands and places the result in the first operand specified. To refresh your memory, the `AND` operator only returns TRUE if both operands are TRUE. The following is the truth table for this operator:

```
+-----------+-------------------------------+
|     A     |   0   |   0   |   1   |   1   |
|-------------------------------------------|
|     B     |   0   |   1   |   0   |   1   |
+-----------+-------------------------------+
|   A & B   |   0   |   0   |   0   |   1   |
+-----------+-------------------------------+

```

Some examples:
```    and eax, ebx               ;  EAX  =  EAX  &  EBX
and eax, 0Fh               ;  EAX  =  EAX  &  0xF
and [esi], esp             ; [ESI] = [ESI] &  ESP
and [esi], dword 0ABCh     ; [ESI] = [ESI] & 0x0ABC
```

### LEA

`LEA` stands for Load Effective Address and is one of the data movement instructions. Let's look at a few examples:
```    lea eax, [ebp+0C0h]        ; if EBP = 0x0000ABCD, then EAX = 0x0000AC8D
lea eax, [ebx+ebx]         ; if EBX = 0x00001234, then EAX = 0x00002468
lea ebx, [eax+ecx*4]       ; if EAX = 0x00000010 & ECX = 0x0000000A then EBX = 0x00000038
```

The syntax used for `LEA` is quite misleading; even though the square brackets are used, the instruction does not reference memory but rather it evaluates the contents of the square brackets directly.

### MOVZX & MOVSX

These instructions are special versions of the `MOV` instruction which are used to extended signed (`MOVSX`) and unsigned (`MOVZX`) registers to larger registers.

`MOVZX` stands for Move With Zero Extension. When data is moved from a register to a larger register, the value is prepended with zeroes. Some examples are:
```    movzx eax, bx       ; if BX = 0x1000, then EAX = 0x00001000
movzx ebx, ax       ; if AX = 0xFFFF, then EBX = 0x0000FFFF
movzx ax, bl        ; if BL = 0xFF, then AX = 0x00FF
movzx bx, al        ; if AL = 0x7F, then BX = 0x007F
```

`MOVSX` stands for Move with Sign Extension. This time the resultant value is prepended with either zeros or ones depending on the sign of the original value, and hence the sign is preserved. Let's take the previous examples:
```    movsx eax, bx       ; if BX = 0x1000, then EAX = 0x00001000
movsx ebx, ax       ; if AX = 0xFFFF, then EBX = 0xFFFFFFFF
movsx ax, bl        ; if BL = 0xFF, then AX = 0xFFFF
movsx bx, al        ; if AL = 0x7F, then BX = 0x007F
```

### ROL & ROR

`ROL` stands for Rotate Left whereas `ROR` stands for Rotate Right. These instructions shift the bits of the destination to the left/right a number of times; the data that slides off the end of the register is inserted back from the right/left. The following table should clear things up:

```
+---------------+-----------------------------------------------+--------------+
|      EAX      |    1100 0011 1010 1110 0001 1111 0000 0101    |  0xC3AE1F05  |
+---------------+-----------------------------------------------+--------------+
|  ROL EAX, 01h |    1000 0111 0101 1100 0011 1110 0000 1011    |  0x875C3E0B  |
|------------------------------------------------------------------------------|
|  ROL EAX, 06h |    1110 1011 1000 0111 1100 0001 0111 0000    |  0xEB87C170  |
|------------------------------------------------------------------------------|
|  ROL EAX, 0Fh |    0000 1111 1000 0010 1110 0001 1101 0111    |  0x0F82E1D7  |
+---------------+-----------------------------------------------+--------------+
|  ROR EAX, 01h |    1110 0001 1101 0111 0000 1111 1000 0010    |  0xE1D70F82  |
|------------------------------------------------------------------------------|
|  ROR EAX, 06h |    0001 0111 0000 1110 1011 1000 0111 1100    |  0x170EB87C  |
|------------------------------------------------------------------------------|
|  ROR EAX, 0Fh |    0011 1110 0000 1011 1000 0111 0101 1100    |  0x3E0B875C  |
+---------------+-----------------------------------------------+--------------+

```

It stands to reason that rolling a 32-bit register 33 times is the same as rolling it in the same direction once, hence if say the value of a register has to be rolled X times, where X is a large number, instead roll it X % 32 times.

### MUL (IMUL) & DIV (IDIV)

As you've probably guessed `MUL` performs multiplication while `DIV` performs Division; `IMUL` and `IDIV` are the signed versions.

Although in real life these are basic operations, they might be tricky to handle for the CPU which has a limited number of fixed-sized registers to work with. This is because when multiplying 2 large numbers, the result can be too large to fit into a single register; when dividing, the result can be a decimal number with a decimal part which is too long to fit into a single register. To solve this issue, 2 registers are used, namely EAX, the main register, and EDX.

Let's see some examples on how these 2 registers are used to store the results:

```    mul  ecx         ; EDX:EAX = EAX * ECX (unsigned)
imul edx         ; EDX:EAX = EAX * ECX (signed)

imul ecx, 1Fh    ; ECX = ECX * 0x1F (signed)

div ecx          ; EAX = EDX:EAX / ECX (unsigned)
; EDX = EDX:EAX % ECX (unsigned)

div cl           ; AL  = AX / CL (unsigned)
; AH  = AX % CL (unsigned)

idiv ecx         ; EAX = EDX:EAX / ECX (signed)
; EDX = EDX:EAX % ECX (signed)
```

Notice how both of these functions can make use of `EAX`, the accumulator register, without explicit declaration. As mentioned earlier, the `EDX` register is used as a secondary storage for overflows in multiplications and remainders in divisions. Another interesting thing to notice is that `MUL` always operates and stores the result in `EAX/AX/AL` and hence the following instruction is invalid:

```    mul  ecx, 1Fh    ; ECX = ECX * 0x1F  <- INVALID
```

### SBB

`SBB` is Subtract with Borrow/Carry. It differs slightly from the traditional subtract, i.e. `SUB`, in that if the Carry Flag is set, it subtracts 1 extra, otherwise it's exactly the same. For example:
```    sbb  ax, 10h (CF = 1)    ; if EAX was 0x0050, then EAX = 0x003F
sbb  ax, 10h (CF = 0)    ; if EAX was 0x0050, then EAX = 0x0040
```

### JA(JG), JAE(JGE), JB(JL) & JBE(JLE)

These operators are different conditional jump instructions which operate on either signed or unsigned values. The following is a summary table which contains a self-explanatory description and the jump conditions for each:

```
+------------+---------------------------------------+--------------------+
|  Mnemonic  |                Meaning                |   Jump Condition   |
+------------+---------------------------------------+--------------------+
|     JA     |             Jump if Above             |     CF=0 & ZF=0    |
+------------+---------------------------------------+--------------------|
|    JAE     |        Jump if Above or Equal         |        CF=0        |
|------------|---------------------------------------+--------------------|
|     JB     |             Jump if Below             |        CF=1        |
|------------|---------------------------------------+--------------------|
|    JBE     |        Jump if Below or Equal         |     CF=1 | ZF=1    |
+------------+---------------------------------------+--------------------+
|     JG     |       Jump if Greater (signed)        |     ZF=0 & SF=OF   |
|------------|---------------------------------------+--------------------|
|    JGE     |   Jump if Greater or Equal (signed)   |        SF=OF       |
|------------|---------------------------------------+--------------------|
|     JL     |         Jump if Less (signed)         |       SF!=OF       |
|------------|---------------------------------------+--------------------|
|    JLE     |     Jump if Less or Equal (signed)    |    ZF=1 | SF!=OF   |
+------------+---------------------------------------+--------------------+

```

## The Binary

Finally we've made it to the interesting bit. The binary presents us with a tiny window and 2 text fields named Name and Serial, the latter of which can only be numerical. Let's try our luck ...

... anddddd ...

... nope nothing. At this point we don't know if it is expecting a specific Name and Serial, or maybe a specific Name and a derived Serial or even any Name and a Serial number which is derived from it. So let's look at it under IDA.

### Identifying Important Areas

Looking at the strings present in the binary, we notice 2 very interesting ones, "RIGHT" and "You got it!". We know that that's where we need to end up so lets start from there. My personal strategy when tackling binaries in which I known what the point I want to reach is, is to start from there and work my way up as much as I can. Let's apply this.

Moving backwards from the desired result, we start identifying the conditions that have to be met:

1. [0x0040112F] jz short loc_40113D - The Zero Flag must be zero for the program to take the correct path
2. [0x0040112A] test eax, eax - Since we require `ZF` to be zero, `EAX` must NOT be 0x00000000.
3. [0x00401125] call sub_401000 - This requires further investigation but we know that when it ends, `EAX` cannot be zero

Before diving into this unknown function let's see if anything of interest happens before it is reached.

We notice 2 system calls: GetDlgItemTextA and GetDlgItemInt. According to MSDN, GetDlgItemTextA "retrieves the title or text associated with a control in a dialog box" while GetDlgItemInt "translates the text of a specified control in a dialog box into an integer value". These functions retrieve the values for Name and Serial from the textboxes. It seems that nothing of interest happens prior to function sub_401000, so let's focus on it.

### The Main Fuction

The function is manageable but we can still weed out some irrelevant parts so we can focus on the core functionality. Once again we start from the bottom:

1. [0x0040107D] inc eax - `EAX` cannot be 0xFFFFFFFF (Recall: `EAX` cannot be 0 when the function ends)
2. [0x0040107E] sbb eax, eax - Subtracting a number from itself gives zero, so `CF` has to be zero, else we get `EAX` = -1
3. [0x0040107A] neg eax - Only `EAX` = 0x00000000 will not set the Carry Flag
4. [0x00401077] xor eax, [ebp+serial] -To end up with a `EAX` = 0x00000000, `EAX` must be equal to `[ebp+serial]`

Let's stop there for the time being and move to the top. We skip the top most part (not shown in the image above) since it only checks if the length of Name is 0.

The subsequent section loads the name we've supplied and loops over each character, each time operating on `[ebp+name]`. By the end of the loop `[ebp+name]` holds the result. Let us name this end result Name', since it is derived from Name. To keep things simple, and also since we'll be solving the binary for a specific input, we do not need to know exactly what the operations performed on Name, to produce Name', are. The only thing we need to know is the value of Name'. By setting a break point at the end of the function, we get Name' = 0x01E55668.

With the value of Name' in hand, we can focus on the last part of the function. I've displayed this here for reference:

As a reminder from the previous analysis, we require the XOR function at address 0x00401077 to result in a 0. The XOR's operands are Name' and Serial', where Serial' is the resultant of Serial after some operations have been performed on it; we'll see what these operations are. This means that at this point, SERIAL' and NAME' have to be equal and we already know that Name' = 0x01E55668.

From the image above we can see that the operations on Serial are:
1. `XOR` Serial with 0x1337C0DE
2. `SUB` result with 0xBADC0DE5
3. `NOT` result

Combining these we end up with a formula with which we can work out what the Serial for our Name must be. Let's see how it's done.

```
NOT (( Serial ^ 0x1337C0DE ) - 0xBADC0DE5) = 0x01E55668

=>                ( Serial ^ 0x1337C0DE ) - 0xBADC0DE5 = 0xFE1AA997

=>                                 Serial ^ 0x1337C0DE = 0xB8F6B77C

=>                                              Serial = 0xABC177A2

=>                                              Serial = 2881583010

```

Combining everything together we try our luck again ...

... and ..

... we've done it !!

## Conclusion

It was a long, bumpy road but at the end we've manage to find a Name-Serial pair that worked. As an exercise I suggest solving the keygen for a different Name or even for an arbitrary Name.

The plan for the next reverse engineering tutorial is to cover one of the Flare-on challenges so stay tuned .. じゃあ、また