## Monday, 26 October 2015

### CTF Writeup - TUM CTF Teaser - whitebox crypto (Rev 20)

• Name - whitebox crypto
• Category - Reverse Engineering
• Points - 20
• Description - Do not panic, it's only XTEA! I wonder what the key was...

According to the description we're required to find the key used by the 64-bit ELF binary to encrypt the input. We're also given the algorithm used: XTEA, eXtended Tiny Encryption Algorithm. The following is a C implementation of the encrypting part of the XTEA algorithm:

```    void encipher (unsigned int num_rounds, uint32_t v[2], uint32_t const key[4]) {
unsigned int i;
uint32_t v0=v[0], v1=v[1], sum=0, delta=0x9E3779B9;
for (i=0; i < num_rounds; i++) {
v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
sum += delta;
v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum>>11) & 3]);
}
v[0]=v0; v[1]=v1;
}
```

One cycle of the XTEA algorithm consists of 2 Feistel rounds:

A few things to notice are 1) the key is used as is, i.e. it's not used to derive a longer key and 2) the key is split into 4 equal parts. Let's look at the enciphering function in the binary:

The function has been redacted as it contains 32 cycles. From the above C implementation and the Feistel-rounds image, we know that some constant delta is being added to the key at each cycle. Like the key, the delta is also constant. The fact that there are no instructions in the binary that add 2 constants together means that, the addition of delta and the key has been pre-computed and hard coded in the program. This leaves us with the following 4 potentially-interesting instructions:

```xor eax, 7B707868h
xor r10d, 1B58EA2Eh
xor r9d, 0BA9AE30h
xor r12d, 9BD661DBh
```

The 1st time a part of the key is used, it is used as is since the sum is 0 at this point. The 2nd and 3rd time a part of the key is used, it is incremented by delta, the 4th and 5th by (delta * 2), and so on. With this in hand let's compute the original key.

```
1) 0x7B707868 => {pxh

2) 0x1B58EA2E - Delta => 0x1B58EA2E - 0x9E3779B9 => 0x7D217075 => }!pu

3) 0x0BA9AE30 - Delta => 0x0BA9AE30 - 0x9E3779B9 => 0x6D723477 => mr4w

4) 0x9BD661DB - (2 * Delta) => 0x0BA9AE30 - 0x3C6EF372 => 0x5F676E69 => _gni

```

Combining we get: hxp{w4rming_up!}

## Sunday, 25 October 2015

### CTF Writeup - TUM CTF Teaser - selftest (Rev 10)

• Name - selftest
• Category - Reverse Engineering
• Points - 10
• Description - Baby's 1st

This 64-bit ELF was the easiest of the RE lot. The binary is given to us but to get the flag we need to netcat, suggesting that the key is not embedded in the binary itself. Let's connect to it and try our luck:

Command Prompt
C:\>ncat 1.ctf.link 1060 some_random_stuff :( C:\>

No surprises there. Let's look at it under IDA.

It's easy to see where we need to end up to manage to get the flag. Also, this confirms that the key is not in the binary. The following is the beginning of the program:

The first block tells us that our input is interpreted as hex and is `OR`ed with `RSI`, which is `0x8000000000000000`. The second block takes this number and creates a character-count map of it. For example let's say we input c0ffee, this is then `OR`ed with `0x8000000000000000`, which results in `0x8000000000c0ffee`, giving us the following character-count map:

With this pinned out, we take a look at the validation routine.

The validation loop operates on the character-count map in reverse order and does the following:
1. If the byte read is `0x00`, jump to the next one.
2. If the byte read is not `0x00` and is equal to the character it represents, jump to the next one.
3. If the byte read is not `0x00` and is NOT equal to the character it represents, FAIL.

Simply put, an input string is valid if the occurrences of its bytes are equal to the bytes themselves. Keep in mind that `OR 0x8000000000000000` might mess up the string. This means that the following are all valid strings:
• 8888888 (There's 7 of them because `RSI` starts with `0x8000000000000000`)
• 18888888
• 13338888888

Trying the first one out:

Command Prompt
C:\>ncat 1.ctf.link 1060 8888888 hxp{g00d_m0rning_r3v3r53r5} :) C:\>

## Tuesday, 15 September 2015

### (N)ASM101 - 0x05 - Reversing Binaries - Part II (you_are_very_good_at_this.exe)

The binary we'll be analysing in this tutorial is none other than Challenge 9 from the Flare-On challenge by Fireeye. For those of you who've never heard of it, it's a CTF-style Reverse Engineering challenge. This year, Fireeye has created 11 challenges each of which contain an e-mail address used to obtain the next binary.

Compared to the previous binary we've tackled, this will be much more involving but hang in there, the greater the obstacle the more glory in overcoming it. The operations used by the program to hide the answer are not particularly complicated but there are various anti-disassembly techniques we need to overcome. Fortunately for us there's only 2 very important ASM instructions we need to cover before we start.

## x86 Instructions

### CMPXCHG

The Compare and Exchange operator compares the accumulator (`AL/AX/EAX`) to the first operand. If equal, the first operand (Destination) is loaded with the second (Source). If not, the accumulator is loaded with first operand (Destination). Programmatically it looks something like this:
```    if (Accumulator == Destination) {
ZF = 1;
Destination = Source;
}
else {
ZF = 0;
Accumulator = Destination;
}
```

Let's look at some concrete examples:
```    cmpxchg ecx, edx        ; if EAX = 0x05, ECX = 0x05 & EDX = 0x04 then, ECX = 0x04
; if EAX = 0x03, ECX = 0x05 & EDX = 0x04 then, EAX = 0x05
cmpxchg cx, dx          ; same as above but for Words
cmpxchg [edi], si       ; same as above but the Destination is a memory region
```

### CMOVZ & CMOVNZ

`CMOVZ` stands for Conditional Move if Zero whereas `CMOVNZ` stands for Conditional Move if Not Zero. These instructions can move a value from memory to a register or from one register to another. Unlike the original `MOV` instruction, the source operand cannot be a memory region and immediate values are not allowed anywhere. Their description is self-explainatory but let's look at a few examples:
```    cmovnz eax, ebx        ; if ZF = 0, EAX =  EBX, else nothing
cmovnz eax, [ebx]      ; if ZF = 0, EAX = [EBX], else nothing
cmovz eax, ebx         ; if ZF = 1, EAX =  EBX, else nothing
cmovz eax, [ebx]       ; if ZF = 1, EAX = [EBX], else nothing
```

Basically a move occurs if the condition is satisfied, else the move is not performed. Many other conditional move statements exist but since we'll only encounter 1 of them in the binary, and also because in my experience they're not that common, I won't be covering them, at least this time.

## The Binary

In this tutorial I'll be using IDA once again. Originally I've solved this challenge using Immunity Debugger since IDA kept moaning and groaning about instructions changing during execution. (You'll see what I meant by that statement). Before that though, let's run the program:

Command Prompt
C:\>you_are_very_good_at_this.exe I have evolved since the first challenge. You have not. Bring it. Enter the password> lets_try_our_luck You are failure C:\>

Wow that's harsh!! The binary accepts a password, validates it in some way, and outputs a failure or success (maybe?) message. Guessing and brute-forcing is not going to get us anywhere so let's load the program in IDA and start debugging.

## Your eyes can deceive you, don't trust them

In the functions window, IDA identifies a very complicated but interesting-looking function, sub_401495:

At first glance it seems that we've identified the core function of the program. Guess what? Forget about the function. The code is never reached! This wasted me quite some time during the challenge. When dealing with self-modifying code, static analysis is no good. Having said that, if there's anyone who's got an idea on how to unroll self-modifying code for static analysis, please step forward. Thanks in advance.

While single-stepping through the program there will be times where the Instruction Pointer points to a section which has been interpreted as data such as the following:

If this happens, press "c" to directly convert that part to code. IDA might prompt you for confirmation; press "YES". We can now see that the section makes much more sense:

As if it weren't bad enough, the binary is full of self-modifying code and "conditional" jumps all over the place. Consider the following:

The last line diverts the flow to another location if the Zero Flag is set, but `XOR EAX, EAX` is always zero, hence `ZF` is always set. Also what's this loc_401A54+2? Why not loc_401A56? This happens when the code pointed to by this instruction has been messed up and/or wasn't interpreted correctly. Jumping to it and pressing "c" should clear things up.

## Finding our way round the maze

A good practice when debugging is to set some well-placed breakpoints. For this binary it's easier said than done since we're not guaranteed that the instructions remain consistent throughout execution. We have no choice but to start from the top. We run the program with the "Suspend on process entry point" option set, which is found under "Debugger Setup". This will force the program to halt on the very first instruction. The start of the program looks like this :

While holding F7 (single-stepping) down, we keep an eye on the stack. The program goes through several transformations before it even reaches the point where it asks us to input the password. After some time we get the following stack setup:

At the top we notice a kernel32_WriteFile system call and a pointer to the data section, .data:aIHaveEvolvedSi, which holds the string "I have evolved since the first challenge. You have not. Bring it.". This is followed by a kernel32_ReadFile call and 41 pointers to .text:00401736. The bottom 31 pointers are not shown in the image. This should look familiar as it's the real starting point. It writes the welcoming message to the console, reads our input and .. hmm .. executes some function 41 times. At this point we're not sure what this function does so let's investigate further by setting a breakpoint at address 0x00401736.

We continue the execution (F9), input a random password in the console and hit enter. The program stops at the breakpoint. This happens everytime we try and continue for 41 times after which the program displays the friendly "You are failure" message. Let's look closely at what happens at each iteration of this mysterious function.

## All that glitters is not gold, it can be diamonds

It's time to start filtering the nonsense and identify the operations that matter; those that actually operate on our input, or are influenced by it. We run the program with different passwords to hopefully elicit some change in the program's execution flow. The following is a list of instructions that are affected by our input in each iteration of the function:

1. [0x00401A9C] mov al, [eax+ecx] - Loads a character from our input into `AL`.
2. [0x0014FDB4] mov ah, [esp+ebx+0B4h] - Loads some byte into `AH`.
3. [0x0014FDB8] xor al, ah - `XOR`s our character with this byte.
4. [0x0014FDB4] mov cl, [esp+ebx+88h] - Loads some byte into `CL`.
5. [0x00401B14] rol al, cl - Performs a `ROL` on `AL`, `CL` times.
6. [0x0014FDB8] cmpxchg bl, dl - Compares and Exchanges `BL` with `DL`, if `AL` = `BL`, then `ZF` = 1, else 0.
7. [0x00401B3D] cmovnz esi, edx - If `ZF` = 1, `ESI` remains 0, else `ESI` = `EDX` = 1.
8. [0x00401B49] xchg eax, esi - If `ESI` was 1, now `EAX` is 1, and same for 0.
9. [0x0014FD88] cmpxchg bl, dl - At this point `BL` = 0 and `DL` = 1 so if `AL` = 0, `BL` becomes 1, else `BL` remains 0.
10. [0x00401B74] mov eax, ebx - `EAX` now holds the result, either a 1 or a 0.

At this point I suggest taking some time to go through a single iteration several times, focusing on the instructions I've listed above. The addresses might not match exactly but they should be close. At the end of the each loop, `EAX` is either incremented or isn't, so which one is the right path? The answer lies at the end of the program:

If `EAX` is 0x29, i.e. 41, i.e. the number of iterations our favourite function executes, then jump to the "You are success" section, else jump to the "You are failure" section. Excellent !! We now know that for each iteration of the function, `EAX` has to be incremented. Let's look again at the 10 instructions listed above and, starting from the bottom, make our way up starting from the fact that `EAX` has to be 1:

1. [0x00401B74] mov eax, ebx - `EAX` has to be 1, so `EBX` has to be 1.
2. [0x0014FD88] cmpxchg bl, dl - `EBX` has to be 1, so `AL` has to be 0.
3. [0x00401B49] xchg eax, esi - `AL` has to be 0, so `ESI` has to be 0.
4. [0x00401B3D] cmovnz esi, edx - `ESI` has to be 0, so `ZF` has to be 1.
5. [0x0014FDB8] cmpxchg bl, dl - `ZF` has to be 1, so `AL` has to be equal to `BL`. (Let `BL` be `X`)
6. [0x00401B14] rol al, cl - This operation must be equal to `BL`, i.e. `ROL AL, CL = X`
7. [0x0014FDB4] mov cl, [esp+ebx+88h] - Let `CL` be `Y`, i.e. `ROL AL, Y = X`
8. [0x0014FDB8] xor al, ah - `ROL (XOR AL, AH), Y = X`
9. [0x0014FDB4] mov ah, [esp+ebx+0B4h] - Let `AH` be Z, i.e. `ROL (XOR AL, Z), Y = X`
10. [0x00401A9C] mov al, [eax+ecx] - Let `AL` be `I` for Input, i.e. `ROL (XOR I, Z), Y = X`

Once again, take your time to digest it slowly and refer to the previous 10 bullet points if needs be.

## Diamonds are forever

So finally we're left with a simple formula that contains 4 unknowns, 3 of which can be gathered by going through each function, and our input. Let's change the subject of the constructed formula to `I` since that's what we need to find:

```
(I ^ Z) ROL Y = X

=>                           I ^ Z = X ROR Y

=>                               I = ( X ROR Y ) ^ Z

```

Now it's a matter of going through each iteration of the function to gather `X`, `Y` and `Z`. Let's go through the first few together. Remember that for words, rotate operators repeat themselves every 16 (0x10) times. For example `ROR AX, 56` = `ROR AX, 6` and `ROR AX, F3` = `ROR AX, 3`. So:

```
+-----------------------+-----------------------------------+-------+
|   X   |   Y   |   Z   |   X ROR Y   |   ( X ROR Y ) ^ Z   |   I   |
+-----------------------+-----------------------------------+-------+
|   C3  |  56   |   46  |     0F      |          49         |   I   |
|-------------------------------------------------------------------|
|   CC  |  F5   |   15  |     66      |          73         |   s   |
|-------------------------------------------------------------------|
|   BA  |  AC   |   F4  |     AB      |          5F         |   _   |
|-------------------------------------------------------------------|
|   4E  |  1B   |   BD  |     C9      |          74         |   t   |
+-------------------------------------------------------------------+
|  ...  |  ...  |  ...  |     ...     |          ...        |  ...  |
+-------------------------------------------------------------------+

```

After what feels like a few years we obtain the e-mail address to advance to the next challenge:

Command Prompt
C:\>you_are_very_good_at_this.exe I have evolved since the first challenge. You have not. Bring it. Enter the password> Is_th1s_3v3n_mai_finul_foarm@flare-on.com You are success C:\>

## Conclusion

I admit that this tutorial is quite a leap from the previous ones. Don't get disheartened though!!

For those of you who would like to take a go at the other challenges, Fireeye has made the binaries available for download here. Personally I quite enjoyed them and even though they can be quite frustrating at times, I've definitely learnt a lot .. それでは、また

## Saturday, 12 September 2015

### (N)ASM101 - 0x04 - Reversing Binaries - Part I (EasyCrack.exe)

In this tutorial we'll temporarily deviate from creating our own ASM programs and try to crack a binary which was created by someone else. Writing assembly is one skill, understanding it is another. The binary we'll be analysing can be found here. The keygen was created by Kwazy Webbit from the Reverse Engineering Team.

## x86 Instructions

As with every tutorial we'll first cover the new instructions we'll be encountering during this reverse engineering task.

### AND

The `AND` operator performs a bitwise AND of the operands and places the result in the first operand specified. To refresh your memory, the `AND` operator only returns TRUE if both operands are TRUE. The following is the truth table for this operator:

```
+-----------+-------------------------------+
|     A     |   0   |   0   |   1   |   1   |
|-------------------------------------------|
|     B     |   0   |   1   |   0   |   1   |
+-----------+-------------------------------+
|   A & B   |   0   |   0   |   0   |   1   |
+-----------+-------------------------------+

```

Some examples:
```    and eax, ebx               ;  EAX  =  EAX  &  EBX
and eax, 0Fh               ;  EAX  =  EAX  &  0xF
and [esi], esp             ; [ESI] = [ESI] &  ESP
and [esi], dword 0ABCh     ; [ESI] = [ESI] & 0x0ABC
```

### LEA

`LEA` stands for Load Effective Address and is one of the data movement instructions. Let's look at a few examples:
```    lea eax, [ebp+0C0h]        ; if EBP = 0x0000ABCD, then EAX = 0x0000AC8D
lea eax, [ebx+ebx]         ; if EBX = 0x00001234, then EAX = 0x00002468
lea ebx, [eax+ecx*4]       ; if EAX = 0x00000010 & ECX = 0x0000000A then EBX = 0x00000038
```

The syntax used for `LEA` is quite misleading; even though the square brackets are used, the instruction does not reference memory but rather it evaluates the contents of the square brackets directly.

### MOVZX & MOVSX

These instructions are special versions of the `MOV` instruction which are used to extended signed (`MOVSX`) and unsigned (`MOVZX`) registers to larger registers.

`MOVZX` stands for Move With Zero Extension. When data is moved from a register to a larger register, the value is prepended with zeroes. Some examples are:
```    movzx eax, bx       ; if BX = 0x1000, then EAX = 0x00001000
movzx ebx, ax       ; if AX = 0xFFFF, then EBX = 0x0000FFFF
movzx ax, bl        ; if BL = 0xFF, then AX = 0x00FF
movzx bx, al        ; if AL = 0x7F, then BX = 0x007F
```

`MOVSX` stands for Move with Sign Extension. This time the resultant value is prepended with either zeros or ones depending on the sign of the original value, and hence the sign is preserved. Let's take the previous examples:
```    movsx eax, bx       ; if BX = 0x1000, then EAX = 0x00001000
movsx ebx, ax       ; if AX = 0xFFFF, then EBX = 0xFFFFFFFF
movsx ax, bl        ; if BL = 0xFF, then AX = 0xFFFF
movsx bx, al        ; if AL = 0x7F, then BX = 0x007F
```

### ROL & ROR

`ROL` stands for Rotate Left whereas `ROR` stands for Rotate Right. These instructions shift the bits of the destination to the left/right a number of times; the data that slides off the end of the register is inserted back from the right/left. The following table should clear things up:

```
+---------------+-----------------------------------------------+--------------+
|      EAX      |    1100 0011 1010 1110 0001 1111 0000 0101    |  0xC3AE1F05  |
+---------------+-----------------------------------------------+--------------+
|  ROL EAX, 01h |    1000 0111 0101 1100 0011 1110 0000 1011    |  0x875C3E0B  |
|------------------------------------------------------------------------------|
|  ROL EAX, 06h |    1110 1011 1000 0111 1100 0001 0111 0000    |  0xEB87C170  |
|------------------------------------------------------------------------------|
|  ROL EAX, 0Fh |    0000 1111 1000 0010 1110 0001 1101 0111    |  0x0F82E1D7  |
+---------------+-----------------------------------------------+--------------+
|  ROR EAX, 01h |    1110 0001 1101 0111 0000 1111 1000 0010    |  0xE1D70F82  |
|------------------------------------------------------------------------------|
|  ROR EAX, 06h |    0001 0111 0000 1110 1011 1000 0111 1100    |  0x170EB87C  |
|------------------------------------------------------------------------------|
|  ROR EAX, 0Fh |    0011 1110 0000 1011 1000 0111 0101 1100    |  0x3E0B875C  |
+---------------+-----------------------------------------------+--------------+

```

It stands to reason that rolling a 32-bit register 33 times is the same as rolling it in the same direction once, hence if say the value of a register has to be rolled X times, where X is a large number, instead roll it X % 32 times.

### MUL (IMUL) & DIV (IDIV)

As you've probably guessed `MUL` performs multiplication while `DIV` performs Division; `IMUL` and `IDIV` are the signed versions.

Although in real life these are basic operations, they might be tricky to handle for the CPU which has a limited number of fixed-sized registers to work with. This is because when multiplying 2 large numbers, the result can be too large to fit into a single register; when dividing, the result can be a decimal number with a decimal part which is too long to fit into a single register. To solve this issue, 2 registers are used, namely EAX, the main register, and EDX.

Let's see some examples on how these 2 registers are used to store the results:

```    mul  ecx         ; EDX:EAX = EAX * ECX (unsigned)
imul edx         ; EDX:EAX = EAX * ECX (signed)

imul ecx, 1Fh    ; ECX = ECX * 0x1F (signed)

div ecx          ; EAX = EDX:EAX / ECX (unsigned)
; EDX = EDX:EAX % ECX (unsigned)

div cl           ; AL  = AX / CL (unsigned)
; AH  = AX % CL (unsigned)

idiv ecx         ; EAX = EDX:EAX / ECX (signed)
; EDX = EDX:EAX % ECX (signed)
```

Notice how both of these functions can make use of `EAX`, the accumulator register, without explicit declaration. As mentioned earlier, the `EDX` register is used as a secondary storage for overflows in multiplications and remainders in divisions. Another interesting thing to notice is that `MUL` always operates and stores the result in `EAX/AX/AL` and hence the following instruction is invalid:

```    mul  ecx, 1Fh    ; ECX = ECX * 0x1F  <- INVALID
```

### SBB

`SBB` is Subtract with Borrow/Carry. It differs slightly from the traditional subtract, i.e. `SUB`, in that if the Carry Flag is set, it subtracts 1 extra, otherwise it's exactly the same. For example:
```    sbb  ax, 10h (CF = 1)    ; if EAX was 0x0050, then EAX = 0x003F
sbb  ax, 10h (CF = 0)    ; if EAX was 0x0050, then EAX = 0x0040
```

### JA(JG), JAE(JGE), JB(JL) & JBE(JLE)

These operators are different conditional jump instructions which operate on either signed or unsigned values. The following is a summary table which contains a self-explanatory description and the jump conditions for each:

```
+------------+---------------------------------------+--------------------+
|  Mnemonic  |                Meaning                |   Jump Condition   |
+------------+---------------------------------------+--------------------+
|     JA     |             Jump if Above             |     CF=0 & ZF=0    |
+------------+---------------------------------------+--------------------|
|    JAE     |        Jump if Above or Equal         |        CF=0        |
|------------|---------------------------------------+--------------------|
|     JB     |             Jump if Below             |        CF=1        |
|------------|---------------------------------------+--------------------|
|    JBE     |        Jump if Below or Equal         |     CF=1 | ZF=1    |
+------------+---------------------------------------+--------------------+
|     JG     |       Jump if Greater (signed)        |     ZF=0 & SF=OF   |
|------------|---------------------------------------+--------------------|
|    JGE     |   Jump if Greater or Equal (signed)   |        SF=OF       |
|------------|---------------------------------------+--------------------|
|     JL     |         Jump if Less (signed)         |       SF!=OF       |
|------------|---------------------------------------+--------------------|
|    JLE     |     Jump if Less or Equal (signed)    |    ZF=1 | SF!=OF   |
+------------+---------------------------------------+--------------------+

```

## The Binary

Finally we've made it to the interesting bit. The binary presents us with a tiny window and 2 text fields named Name and Serial, the latter of which can only be numerical. Let's try our luck ...

... anddddd ...

... nope nothing. At this point we don't know if it is expecting a specific Name and Serial, or maybe a specific Name and a derived Serial or even any Name and a Serial number which is derived from it. So let's look at it under IDA.

### Identifying Important Areas

Looking at the strings present in the binary, we notice 2 very interesting ones, "RIGHT" and "You got it!". We know that that's where we need to end up so lets start from there. My personal strategy when tackling binaries in which I known what the point I want to reach is, is to start from there and work my way up as much as I can. Let's apply this.

Moving backwards from the desired result, we start identifying the conditions that have to be met:

1. [0x0040112F] jz short loc_40113D - The Zero Flag must be zero for the program to take the correct path
2. [0x0040112A] test eax, eax - Since we require `ZF` to be zero, `EAX` must NOT be 0x00000000.
3. [0x00401125] call sub_401000 - This requires further investigation but we know that when it ends, `EAX` cannot be zero

Before diving into this unknown function let's see if anything of interest happens before it is reached.

We notice 2 system calls: GetDlgItemTextA and GetDlgItemInt. According to MSDN, GetDlgItemTextA "retrieves the title or text associated with a control in a dialog box" while GetDlgItemInt "translates the text of a specified control in a dialog box into an integer value". These functions retrieve the values for Name and Serial from the textboxes. It seems that nothing of interest happens prior to function sub_401000, so let's focus on it.

### The Main Fuction

The function is manageable but we can still weed out some irrelevant parts so we can focus on the core functionality. Once again we start from the bottom:

1. [0x0040107D] inc eax - `EAX` cannot be 0xFFFFFFFF (Recall: `EAX` cannot be 0 when the function ends)
2. [0x0040107E] sbb eax, eax - Subtracting a number from itself gives zero, so `CF` has to be zero, else we get `EAX` = -1
3. [0x0040107A] neg eax - Only `EAX` = 0x00000000 will not set the Carry Flag
4. [0x00401077] xor eax, [ebp+serial] -To end up with a `EAX` = 0x00000000, `EAX` must be equal to `[ebp+serial]`

Let's stop there for the time being and move to the top. We skip the top most part (not shown in the image above) since it only checks if the length of Name is 0.

The subsequent section loads the name we've supplied and loops over each character, each time operating on `[ebp+name]`. By the end of the loop `[ebp+name]` holds the result. Let us name this end result Name', since it is derived from Name. To keep things simple, and also since we'll be solving the binary for a specific input, we do not need to know exactly what the operations performed on Name, to produce Name', are. The only thing we need to know is the value of Name'. By setting a break point at the end of the function, we get Name' = 0x01E55668.

With the value of Name' in hand, we can focus on the last part of the function. I've displayed this here for reference:

As a reminder from the previous analysis, we require the XOR function at address 0x00401077 to result in a 0. The XOR's operands are Name' and Serial', where Serial' is the resultant of Serial after some operations have been performed on it; we'll see what these operations are. This means that at this point, SERIAL' and NAME' have to be equal and we already know that Name' = 0x01E55668.

From the image above we can see that the operations on Serial are:
1. `XOR` Serial with 0x1337C0DE
2. `SUB` result with 0xBADC0DE5
3. `NOT` result

Combining these we end up with a formula with which we can work out what the Serial for our Name must be. Let's see how it's done.

```
NOT (( Serial ^ 0x1337C0DE ) - 0xBADC0DE5) = 0x01E55668

=>                ( Serial ^ 0x1337C0DE ) - 0xBADC0DE5 = 0xFE1AA997

=>                                 Serial ^ 0x1337C0DE = 0xB8F6B77C

=>                                              Serial = 0xABC177A2

=>                                              Serial = 2881583010

```

Combining everything together we try our luck again ...

... and ..

... we've done it !!

## Conclusion

It was a long, bumpy road but at the end we've manage to find a Name-Serial pair that worked. As an exercise I suggest solving the keygen for a different Name or even for an arbitrary Name.

The plan for the next reverse engineering tutorial is to cover one of the Flare-on challenges so stay tuned .. じゃあ、また

## Wednesday, 29 July 2015

### (N)ASM101 - 0x03 - Improving strlen()

Previously we've looked at how Win32 API functions are called and how to create our own function which computes the length of a string and stores the result in a register. Some people have pointed out to me that my construction of strlen() is very long-winded. The reason I did this is that the implementation I presented uses simpler ASM instructions and I didn't want to overwhelm readers who are new to this field. On the other hand, now is a good time to introduce these improvements.

## x86 Instructions

When it comes to repetitious data movement and comparison, the Intel instruction set provides us with native instructions which faciliate these operations. These conventient instructions will be presented later in this section but first let's cover a few new ASM instructions which will be useful in constructing the improved version of strlen().

### NOT

The `NOT` instruction performs a bitwise NOT of a register or a memory location and stores it in that same location. A logical NOT essentially inverts input bits: `NOT 1` = `0` and `NOT 0` = `1`. Inverting the bits of an integer value is called computing the Ones' Complement of an integer. The following are 2 examples of such:

```
+-------------------------------------------------------------------+----------+----------+
|   +/-  |        |        |        |       |       |       |       | 1s' Comp | Unsigned |
|--------+--------+--------+--------+-------+-------+-------+-------+----------+----------|
|    0   |   0    |   0    |   0    |   0   |   0   |   0   |   0   |    0     |    0     |
|--------+--------+--------+--------+-------+-------+-------+-------+----------+----------|
|    0   |   1    |   1    |   1    |   1   |   1   |   0   |   1   |   125    |   125    |
+--------+--------+--------+--------+-------+-------+-------+-------+----------+----------|
|    1   |   0    |   0    |   0    |   0   |   0   |   1   |   0   |  -125    |   130    |
|--------+--------+--------+--------+-------+-------+-------+-------+----------+----------|
|    1   |   1    |   1    |   1    |   1   |   1   |   1   |   1   |   -0     |   255    |
+-------------------------------------------------------------------+----------+----------+

```

This operation is synonymous to XORing with 0FFFFFFFFh, since `0 XOR 1` = `1` and `1 XOR 1` = `0`, and to subtracting from 0FFFFFFFFh, where 0FFFFFFFFh is the 1s' complement representation of -0.

An evident issue with this representation is that there are 2 representations of 0. Also, the addition of 2 numbers in their 1s' complement representation is not always equal to the 1s' complement representation of the addition of the 2 numbers in decimal form. This irregularity happens when the result overflows and has to be accounted for with a carry:

```
0001 1111     31 +
− 1111 1000     −7
-----------   ----
1 0001 0111     23   <-- Overflow
+ 0000 0001      1   <-- End-around carry
-----------   ----
0001 1000     24   <-- Result: 31 + (-7)

```

These issues have been circumvented by using the Two's complement representation system which will be discussed in the next subsection.

### NEG

`NEG` performs the Two's Complement of the operand's value and overwrites it with the newly computed value. The Two's Complement of a binary number is obtained in the following manner:
1. Scan from right to left until the 1st "1" is found
2. Perform Ones' Complement on the bits to the left of the "1"
OR:
1. Perform Ones' Complement
As an example:

```
+-------------------------------------------------------------------+----------+----------+
|  -128  |   64   |   32   |   16   |   8   |   4   |   2   |   1   | 2s' Comp | Unsigned |
|--------+--------+--------+--------+-------+-------+-------+-------+----------+----------|
|    0   |   0    |   1    |   0    |   1   |   1   |   0   |   0   |    44    |    44    |
|--------+--------+--------+--------+-------+-------+-------+-------+----------+----------|
|    1   |   1    |   0    |   1    |   0   |   1   |   0   |   0   |   -44    |   212    |
+-------------------------------------------------------------------+----------+----------+

```

The advantages of this system is that there's only 1 representation of 0 and the addition, subtraction or multiplication operations are identical to those for unsigned binary numbers. An overflow is simply discarded. Let's go through the previous example using the 2s' Complement system:

```
0001 1111     31 +
− 1111 1001     −7
-----------   ----
1 0001 1000     24   <-- Discard Overflow
-----------   ----
0001 1000     24   <-- Result: 31 + (-7)

```

### REPZ(REPE), REPNZ(REPNE) & REP

The `REP` instruction and its variations are used to repeat operations up to `ECX` times. Each time the secondary instruction is performed, `ECX` is decremented. Repeat while Zero (`REPZ`) and Repeat while Equal (`REPE`) are interchangeable and keep executing the input instruction while `ECX` != 0 and `ZF` = `1`. Repeat while Not Zero (`REPNZ`) and Repeat while Not Equal (`REPNE`) execute the input instruction while `ECX` != 0 and `ZF` != `1`.

We'll look at examples at how these can be combined with `SCAS` to deal with repetitious comparisons in the next subsection.

### SCAS

`SCAS` comes in 3 flavours: `SCASB`, `SCASW`, `SCASD`, that operate at 1-,2-, or 4-byte granularity respectively. `SCAS` implicitly compares `AL/AX/EAX` with data starting at the memory address `EDI`; `EDI` is automatically incremented/decremented depending on the Direction Flag (`DF`).

Consider the following example:

```    mov edi, input       ; assume input is a user defined string
mov ecx, 05h         ; move 0x5 to ecx
mov al, 041h         ; move 0x41 to al (0x41 = 'A')
repne scasb          ; repeatedly compares until ECX = 0 or 'A' is reached
```

Let "input" point to the start of an arbitrary string. The program will compare this string, a byte at a time, with 'A', each time decremeting `ECX`. The program terminates either when `ECX` reaches 0 or when an 'A' is found since the latter will set the Zero Flag, whichever comes first. At this point you've probably figured out that the combination of `REP` and `SCAS` will be at the center of the solution to improving strlen(), and hence of this tutorial.

## The New and Improved strlen()

We now put the newly-learnt techniques to good use:

```;nasm -fwin32 strlen_improved.asm
;Run under a debugger

global _main

section .data
input db "What is the length of this string?",0   ; string to compute length on

section .text
_main:
xor    ecx, ecx      ; ecx = 0x00000000
not    ecx           ; initialize ecx to largest value possible (4,294,967,295 in 32-bit)
xor    al, al        ; al = 0x00
mov    edi, input    ; edi points to start of string
mov    ebx, edi      ; store original pointer
repne  scasb         ; repeatedly compare bytes
sub    edi, ebx      ; subtract to get length + 1
dec    edi           ; decrement edi
done:
int    3             ; debugger interrupt
```

`not ecx`
Performs a `NOT` on 0x00000000, returning 0xFFFFFFFF which is 4,294,967,295 in decimal form, the highest possible 32-bit unsigned integer. This is to avoid `REPNE` from quitting due to `ECX` hitting zero before reaching the end of the string.
`xor al, al`
Zeros out `AL` which will be used by `SCASB` to compare bytes from the input string with. Remember that to find the length of the string we need to iterate over each byte until the `NULL` terminator is encountered
`mov ebx, edi`
Stores the address of the start of the string. This will be used to calculate the final result.
`repne scasb`
Roughly translates to : while (`ZF` != `0` OR `ECX` != `0`) { compare byte with 0x00; increment `EDI`; decrement `ECX`}. The Zero Flag is set when `AL` and the current character that is being compared to are equal, i.e. when 0x00 is encountered.
`sub edi, ebx`
Subtracts the position of the NULL from the start of the string.
`dec edi`
Decrements the result to account for NULL.

Generate the executable using NASM and GoLink:

Command Prompt
C:\>nasm -fwin32 strlen_improved.asm C:\>GoLink /entry _main strlen_improved.obj GoLink.Exe Version 1.0.1.0 - Copyright Jeremy Gordon 2002-2014 - JG@JGnet.co.uk Output file: strlen_improved.exe Format: Win32 Size: 1,536 bytes C:\>

Attach a debugger and look at the result saved in `EDX`.

## Further improvements ?

Of course !! Saving the original pointer to the string, subtracting the new value from it, and decrementing it again is not the most efficient way of obtaining the final result.

At the start of the program we've initialized `ECX` to all 1s, i.e. the maximum unsigned integer, but this can also be interpreted as -1 from a signed-integer point of view. Also, each `REPNE` iteration decremented this value by 1. At the end we get `ECX` = - strlen - 2. Why -2? One from the counted string terminator and another from `ECX` being initialized to -1.

An interesting property of `NOT` is that given any negative number X, `NOT` X = |X| - 1. Combining these 2 facts we can calculate the length of the string directly from `ECX` in the following manner:

```not  ecx  ; ecx = - strlen - 2, i.e. not ecx = |- strlen - 2| - 1  = strlen + 2 - 1  =  strlen + 1
dec  ecx  ; ecx =   strlen + 1, i.e. dec ecx = strlen !!
```

Adding this new improvement we get :

```;nasm -fwin32 strlen_improved_2.asm
;Run under a debugger

global _main

section .data
input db "What is the length of this string?",0   ; string to compute length on

section .text
_main:
xor    ecx, ecx      ; ecx = 0x00000000
not    ecx           ; initialize ecx to -1 (signed int)
xor    al, al        ; al = 0x00
mov    edi, input    ; edi points to start of string
repne  scasb         ; repeatedly compare bytes
not    ecx           ; ecx = strlen + 1
dec    ecx           ; account for 0x00
done:
int    3             ; debugger interrupt
```

## Conclusion

Congratulations to those of you who have made it through all the tutorials thus far. As a teaser, in the next post we'll combine today's ASM program with the one presented in the first to construct a finalized, presentable version of strlen() .. バイバイ

## Tuesday, 14 July 2015

### (N)ASM101 - 0x02 - Constructing strlen()

In the previous post we've looked at some of the most commonly used registers, the stack, some x86 instructions, and how to call a higher level Win32 API function. We've combined these to create an ASM program which displays a message box with a custom message and title.

In this post we'll look at some more interesting x86 instructions and construct the strlen() C++ function from scratch.

## strlen()

The function expects a pointer to a null-terminated string and returns the length, i.e. the number of characters that make up the string without including the terminating null character. As a trivial example let's consider the string "HELLO" shown below:
As portrayed in the image, the string "HELLO" constitutes of 5 characters and the NULL character, hence passing a pointer to this string to strlen() returns 5.

## x86 Instructions

To keep the theory at a minimum, only instructions required to achieve our goal, and some of their variations, will be covered in this section.

### MOVe

The `MOV` instruction is probably the most common ASM instruction. As the name suggests it moves data from one place to another. To be more precise it copies data across rather than move it, since the source value remains intact. The operands can be a register (e.g. `EAX`), a memory location (e.g. `[EAX]`; the memory location pointed at by `EAX`) or an immediate value (e.g. `0FFh`). The following are some examples of such:
```mov  eax, F003h       ; immediate -> reg ;  EAX  =  0xF003
mov  [eax], F003h     ; immediate -> mem ; [EAX] =  0xF003
mov  ebx, eax         ; reg -> reg       ;  EBX  =   EAX
mov  eax, [ebx]       ; mem -> reg       ;  EAX  =  [EBX]
mov  [ebx], eax       ; reg -> mem       ; [EBX] =   EAX
mov  eax, [eax+ebx*4] ; reg -> reg       ;  EAX  =[EAX+EBX*4]
```
Notice that data moves from right to left. This is the Intel syntax, and we'll be sticking exclusively to it. For those of you who are interested, the other major syntax is the AT&T syntax and is predominantly used in *nix environments.

Arithmetic operations can be performed on registers and memory locations. We can add two registers with `ADD`, subtract an immediate value from a memory location with `SUB`, increment values with `INC` and decrement values with `DEC`.
```add esp, 14h            ;  esp  =  esp  + 0x14
add ecx, eax            ;  ecx  =  ecx  + eax
sub ecx, eax            ;  ecx  =  ecx  - eax
sub esp, 0Ch            ;  esp  =  esp  - 0xC
inc eax                 ;  eax  =  eax  + 1
dec eax                 ;  eax  =  eax  - 1
inc dword [eax]         ; [eax] = [eax] + 1
add word [eax], 0FFFFh  ; [eax] = [eax] + 0xFFFF
add dword [eax], 0FFFFh ; [eax] = [eax] + 0xFFFF
```
Notice that when the destination is a memory location, the size of the memory that we want to deal with has to be explicitly specified. Let's see why this is important.

Consider the last 2 of the examples above which at first glance look deceptively equivalent. Say that `EAX` points to an arbitrary location in memory. Also say that if we read a dword in length starting from this location, we retrieve `0x41424344`. This means that if we read a word in length starting from the same location, we get `0x4344`. So:
``` add dword [eax], 0FFFFh  => dword [eax] = 0x41424344 + 0xFFFF
=> dword [eax] = 0x41434343 & word [eax] = 0x4343 & CF=0

add  word [eax], 0FFFFh  => word  [eax] =     0x4344 + 0xFFFF
=> dword [eax] = 0x41424343 & word [eax] = 0x4343 & CF=1
```
Notice the differences. For the `word`-case, the most significant part is untouched since we explicitly told it to consider a `word` in length and the Carry Flag has been set to 1 as the result is larger than a `word`.

### OR/XOR

These instructions perform bitwise (eXclusive)OR of the operands and place the result in the first operand specified. The following are some self-explanatory examples:
```or ebx, ebx            ;  ebx  =  ebx  |  ebx
xor eax, [ecx]         ;  eax  =  eax  ^ [ecx]
or [edx], eax          ; [edx] = [edx] |  eax
xor eax, 0fh           ;  eax  =  eax  ^  0xF
or word [eax], 0FFFFh  ; [eax] = [eax] | 0xFFFF
```
As in the previous case, when dealing with memory locations, the size has to be explicitly specified. `XOR` has 2 interesting properties: XORing twice with the same value yields the original value and XORing a value with itself results in zeroes. The latter is used to clear a register whereas the former is sometimes used as a simple shellcode obfuscating technique to evade Antivirus programs.

### TEST & CMP

`TEST` performs a bitwise `AND` but does not save the result and `CMP` performs a `SUB` without saving the result. These operations are used to set the appropriate flags for subsequent operations which may perform different actions depending on these flags.
```cmp  eax, ecx              ;  eax   -  ecx
test [ebx], eax            ; [ebx]  &  eax
cmp edx, 0FFh              ;  edx   -  0xFF
test dword [eax], 0ABCDh   ; [eax]  & 0xABCD
```
Once again make sure the length is specified when dealing with memory locations.

### JZ(JE), JNZ(JNE) & JMP

Jumps are used to control the flow of an ASM program just like "if..then..else" and "switch" for higher-level programming languages. Jump if Zero (`JZ`) and Jump if Equal (`JE`) are interchangeable since the Zero Flag is set to 1 when two equal values are compared, and same for Jump if Not Zero (`JNZ`) and Jump if Not Equal (`JNE`). `JMP` is an unconditional jump and transfers the control flow irrelevant of the flag values.
```jmp procedure   ; jump to "procedure"
```

### INT

`INT` generates a software interrupt and takes a single hexadecimal value. We will not go through the numerous available interrupts but we talk briefly on the following: 0x21 and 0x3. `INT 0x21` calls an MS-DOS API call depending on the values in the registers. `INT 0x3` (0xCC) is used by debuggers to set a breakpoint.
```int 3 ; execution breaks here if attached to a debugger
```

A complete Intel Syntax Reference guide can be found here.

## Constructing strlen()

We've looked at how strlen() works and familiarized ourselves with more than enough ASM instructions to reconstruct the function from scratch. strlen() may be very simple and straightforward but let us decompose it into smaller, digestible operations. This will help us understand the underlying logic and will make it easier to translate to ASM.

strlen() Algorithm:
1. Set CTR = 0
2. Is char at position CTR = 0x00 ?
3. If Yes GOTO 6
4. Increment CTR
5. GOTO 2
6. DONE: Return CTR
```;nasm -fwin32 strlen.asm
;Run under a debugger

global _main

section .data
input db "What is the length of this string?",0   ; string to compute length on

section .text
_main:
xor     ecx, ecx          ; clear registers to be used
xor     ebx, ebx          ;
mov     edi, input        ; move pointer to start of input to edi
check_next:
mov     bl, [edi+ecx]    ; move char to bl
test    bl, bl           ; check if char = 0x00 (i.e. end of string)
inc     ecx              ; increment ecx (counter)
jmp     check_next       ; jmp to check_next
done:
int     3                ; debugger interrupt
```
`xor ???, ???`
Zeroes out the registers before use. `ECX` is used both as a counter and a pointer to the next character while the lower part of `EBX` holds the character we are comparing to.
`mov edi, input`
Moves the pointer to the start of the string to `EDI`, i.e. `[EDI]` = "W".
`check_next`
A symbolic label used as a reference by jump instructions.
`mov bl, [edi+ecx]`
Moves the char pointed to by `[EDI+ECX]` to the `BL` register.
`test bl, bl`
Reminder: `TEST` performs a bitwise `AND` without storing the result. The only value that returns zero when `AND`ed to itself, is zero. So, if `BL` = 0x0 then `BL & BL` = 0x0, hence `ZF`=1. If `BL` <> 0x0, then `BL & BL` <> 0x0, hence `ZF`=0.
`jz done`
If `ZF`=1, i.e. the previous result was 0x0, i.e. we hit the end of the string, jump to the "done" label.
`inc ecx`
Increment `ECX` which holds a counter to the string length.
`jmp check_next`
Jump to the "check_next" label to repeat the process with an incremented counter and a pointer to the next character.
`int 3`
A breakpoint for debuggers. This will give us the opportunity to take a look at the `ECX` register which, at the point it reaches the interrupt, should contain the length of the string in hexadecimal format.

Generate the executable using NASM and GoLink:

Command Prompt
C:\>nasm -fwin32 strlen.asm C:\>GoLink /entry _main strlen.obj GoLink.Exe Version 1.0.1.0 - Copyright Jeremy Gordon 2002-2014 - JG@JGnet.co.uk Output file: strlen.exe Format: Win32 Size: 1,536 bytes C:\>

As you might have realized by now, we can't just run the executable. We need to attach it to a debugger to be able to view the result in the `ECX` register. In a future post I might demonstrate how to format a registry value and display it in a message box. It's more involved than it sounds.

The following screenshot shows the status of the registers after the debugger hits the `INT` instruction:

`ECX` contains 0x22 which translates to 34, the length of "What is the length of this string?". Play around with it; modify the string and run it step by step to get a better feel for ASM.

## Conclusion

A lot was covered in this post. We've looked at how higher-level functions are constructed using basic assembly instructions. In the next post we won't be creating anything new but rather we'll be improving upon what we've covered here .. さようなら

## Tuesday, 16 June 2015

### (N)ASM101 - 0x01 - Calling Win32 API Functions

Hello and welcome to my first post. A few weeks ago I've acquired an interest in building my own small functions with ASM. I've been an amateur exploit developer for a number of years and used off-the-shelf shellcode quite a lot but I've never gotten my hands dirty with it.

I've decided to start blogging about what I learn thanks to a recent experience. I always believed that teaching was somewhat of a waste of time, and that the time spent teaching could be spent learning new things. This was proven to be wrong when, in a workshop I was delivering (BSides London 2015), a guy pointed out that there are better ways of achieving what I was explaining and that got me thinking. Teaching is not the one way process I had always thought it to be ...

Anyways .. enough about my life and let's get crackin' ..

## Why NASM ?

When it comes to assembly language programming, NASM (Netwide ASM) and MASM (Microsoft Macro Assembler) are considered to be 2 of the top, if not the top, standard assemblers. There is little to no difference in the way ASM is written in both, as they both use Intel syntax, and as one would expected MASM is easier to use in a Windows environment.

For example, MASM comes with `STDIN `and `STDOUT `Windows console functions to read user input and to output strings to the console. Reading and Writing are functions that are used over and over again and hence Microsoft has decided to abstract the Windows functions that handle these operations to make them more accessible. Personally, I think it's very convenient but I don't like taking shortcuts. The purpose here is to learn ASM and since we're already dealing with the lowest level of programming language might aswell go all the way.

NASM is an assembler/disassembler for the Intel x86 architecture used for both Windows and Linux. It can output several binary formats such as Portable Executable and ELF.

Some other minor differences are:
• NASM is case-sensitive
• NASM requires square brackets for memory references
• Macros and Directives work completely different
If anyone is still not convinced by this explanation and would like to use MASM, they are welcome to do so. At the end of the day, the concepts remain the same.

## Equipping ourselves

The 2 programs that will be required to generate executables from ASM are:
NASM converts an ASM file into a Common Object File Format (COFF) file, whilst GoLink packages it with the required DLLs to produce the final PE (exe) file.

## Laying down some theory

Before we dive in we will go over some basic theoretical concepts, namely the Registers and the Stack and how it's utilized when calling Windows functions. Those of you who are familiar with these concepts can jump to the next section.

### The Registers

Registers are small locations used by the processor to perform operations. An x86 processor has 16 of these registers some of which are listed in the table below:

RegisterNameMain Purpose
`EAX`Accumulator RegisterArithmetic Operations
`EBX`Base RegisterBase Pointer for Memory Access
`ECX`Counter RegisterLoop Counter and Shifts
`EDX`Data RegisterAn extension to EAX
`ESI`Source IndexSource Pointer for String Manipulation (Reading)
`EDI`Destination IndexDestination Pointer for String Manipulation (Writing)
`EBP`Base PointerPoints to the base address of the Stack
`ESP`Stack PointerPoints to the top of the Stack
`EIP`Instruction PointerPoints to the next instruction to be executed
`EFLAGS`FLAGS RegisterHolds several Flags e.g.:
- `ZF` (Zero Flag): 1 if result = 0, else 0
- `CF` (Carry Flag):1 if result = -ve, else 0
- `OF` (Overflow Flag): 1 if result overflowed, else 0

The `E` in front of each register signifies that the register contains 32 bits. In the case of 64-bit registers, this is replaced by an `R`, so `EAX` (32-bit) would become `RAX` (64-bit).

Some register can be further divided into 8- and 16-bit registers in the following manner:

### The Stack

A stack is a last-in first-out data structure which supports 2 operations: `PUSH` and `POP`. A `PUSH` instruction will put something on top of the stack whereas a `POP` instruction will remove an item from the top of the stack.

These operations implicitly modify the `ESP` register. When a `PUSH` is executed, `ESP` is decremented and data is written at the memory address `ESP` is pointed at. In a `POP` operation, data is removed from the stack and stored in some other location; `ESP` is incremented.

The stack is used by x86 architectures as extra storage since the amount of data that can be stored in  registers is severely limited. To better understand the role of the stack during execution we'll look at calling the MessageBox Win32 API function and take a close look at what happens when we do so. The following tells us that the MessageBox function is expecting 4 parameters:

```  int WINAPI MessageBox(
_In_opt_ HWND    hWnd,
_In_opt_ LPCTSTR lpText,
_In_opt_ LPCTSTR lpCaption,
_In_     UINT    uType
);
```

This is where the stack comes in. Win32 API functions use the `STDCALL` calling convention. This means that the CPU expects the parameters required by the function to be on the stack when the `CALL` instruction is encountered.

The stack has another very important role. Implicitly a `CALL` instruction performs 2 operations:
• The address right after the `CALL` instruction is pushed on the stack
• The execution flow (`EIP` register) is transferred to the function that is called. In the case of the MessageBox, this is somewhere inside User32.dll.
When the called function finishes executing, the saved address is popped back to `EIP` (`RET` instruction) and the execution of the main program continues.

To summarize, 2 main functions of the stack:
1. Used to pass parameters to the called functions (callees)
2. Used to save the state of the caller function before the execution is transferred to the callee.

## Calling the MessageBox Function

With all the theory behind us it's time to present some code:
```extern _MessageBoxA@16

global _main

section .data
msg  db "Vulnerable Space",0
ttl  db "Welcome",0

section .text
_main:
push  dword 0x00       ; MB_OK = 0
push  dword ttl        ; "Welcome"
push  dword msg        ; "Vulnerable Space"
push  dword 0          ; handle to owner window
call  _MessageBoxA@16  ; in user32.dll
```
It might look convoluted at first but let's take it line by line:
`extern _MessageBoxA@16`
Tells our program that the MessageBoxA function requires importing from an external library (User32.dll). Later we'll see how this is done using GoLink.
`global _main`
`global` is the other end of `extern`; for a function to be importable by external programs using `extern`, it has to be explicitly defined as being global. `_main` is the name of our main function.
`section .data`
This section contains initialized static variables (i.e. global variables) and static local variables.
`msg db "Vulnerable Space",0`
Declares a variable called `msg` (`db` : declare byte) and initializes it to "Vulnerable Space" with a NULL byte at the end. Strings in C have to end with 0x00.
`ttl db "Welcome",0`
Same as above : "Welcome/00".
`section .text`
Contains the actual machine instructions that make our program.
`_main`
The name of the main function our program will run.
`push dword ????`
The 4 `push` instructions place the 4 dword (double word = 4 bytes = 32 bits) parameters required by the message box function on the stack. As described earlier this is necessary when using the `STDCALL` calling convention. The first parameter specifies the type of message box (in this case it's an OK message box) while the last parameter tells the message box that it has no owner window . The other 2 are self-explanatory. More information can be found here.
`call _MessageBoxA@16`
This performs the call to the message box function. The "_" (underscore) prefix is used to call the ASM version of the MessageBoxA C++ function whereas the `@16` postfix signifies that 16 bytes of arguments are passed on the stack (4 dwords).

We now use NASM and GoLink to generate the executable, and run it:

Command Prompt
C:\>nasm -fwin32 msgboxA.asm C:\>GoLink /entry _main msgboxA.obj user32.dll GoLink.Exe Version 1.0.1.0 - Copyright Jeremy Gordon 2002-2014 - JG@JGnet.co.uk Output file: msgboxA.exe Format: Win32 Size: 2,048 bytes C:\>msgboxA.exe

The `-fwin32` switch is used to generate Win32 format Object Files. For GoLink we specify the entry point function with the `/entry` switch, the COFF file generated by NASM, and the libraries required.

Congrats on your first ASM program! The result is as expected, an OK message box with "Vulnerable Space" as the caption and a "Welcome" title:

A few seconds later we get an unexpected(?) crash:

Before we polish our program, it's interesting to look at the resultant PE under a debugger. Here we can see the exact same instructions we have specified in the `.text` section. Conveniently, Immunity debugger maps them to the strings we have initialized them to in the `.data` section.

This is the state of the stack after all the parameters have been pushed onto it and just before the MessageBox function starts executing:

The last thing we'll look into before ending the tutorial is how to get rid of the crash. This happens because, after the MessageBox function ends its execution, we do not specify what the program should do. The function that will solve this issue is another Win32 API function called ExitProcess and resides in Kernel32.dll. More information on this function can be found here.

The final code looks like this:

```;nasm -fwin32 msgboxA.asm
;GoLink /entry _main msgboxA.obj user32.dll kernel32.dll
;msgboxA.exe

extern _ExitProcess@4
extern _MessageBoxA@16

global _main

section .data
msg  db "Vulnerable Space",0
ttl  db "Welcome",0

section .text
_main:
push  dword 0x00       ; MB_OK = 0
push  dword ttl        ; "Welcome"
push  dword msg        ; "Vulnerable Space"
push  dword 0          ; handle to owner window
call  _MessageBoxA@16  ; in user32.dll

push  0                ; no error
call  _ExitProcess@4   ; in kernel32.dll
```

Because of `_ExitProcess@4`, we now need to add "kernel32.dll" when using GoLink. Also notice that we need not clean the stack between function calls. This is a very convenient advantage of the `STDCALL` calling convention (used by Win32 API functions) in which the callee is expected to clear the stack rather than the caller program.

## Conclusion

Hope you've all enjoyed it and feel proud of your first ASM program. You really should! Understanding ASM is no easy feat. As mentioned in the intro, I would love to hear your feedback; any improvements, clarifications, missing detail or suggestions.

In the next post we'll look at some more ASM instructions and try to build our own function .. またね

## Monday, 15 June 2015

### Welcome

Welcome to my space.

Finding time to write blog posts...