Skip to content

Basic Assembly tutorial

simon987 edited this page Nov 3, 2019 · 27 revisions

Tutorial under construction, join the Slack team for help

The very basics

Binary and hexadecimal number representation

Computers can only "understand" the binary numbers (or base-2) as an electrical signal in its circuits. It is represented with the symbols 0 and 1. In this tutorial, each digit of a binary number is called a bit. We say that a bit is 'set' when it is equal to 1.

For example, 0000 0101 is a 8-bit number (also called a byte) and 0000 0000 0000 0101 is a 16-bit binary number (also called a word).

A number can be represented in a lot of different ways: 0101 is base-2 is the same number as 5 in base-10 and 0x5 in base-16. You don't really convert a binary number to a decimal number, you just change the way it is represented.

Let's take a look at this number representation table for unsigned 4-bit numbers:

base-16 base-2 base-10
0x0 0000 0
0x1 0001 1
0x2 0010 2
0x3 0011 3
0x4 0100 4
0x5 0101 5
0x6 0110 6
0x7 0111 7
0x8 1000 8
0x9 1001 9
0xA 1010 10
0xB 1011 11
0xC 1100 12
0xD 1101 13
0xE 1110 14
0xF 1111 15

Note that a 4-bit binary number can be easily represented with a single hexadecimal (base-16) digit. Hexadecimal numbers are often a lot easier to read than a string of bits and this is why it is widely used by programmers to represent binary numbers. Hexadecimal numbers are usually prefixed with 0x.

//todo: convert bin to dec? or maybe it's irrelevant?

Data types in computers

In programming, there are various ways to use the data in the computer: integers, floating point numbers, Boolean, strings of characters, etc. A data type is just a way to interpret the binary data.

For example, let's say the data stored in a certain location in the computer is 0100 0001 0000 0101. We could interpret it as a single 16-bit number - 16645 - or two 8-bit numbers - 65 and 5 - or two ASCII characters - A and the Enquiry character. It is up to the programmer to decide how to interpret the data.

Here is a short list of common data types:

Names Size (bits) range
Boolean 1* 0 to 1
Byte, char, octet (Signed) 8 -128 to 127
Byte, char, octet (Unsigned) 8 0 to 255
Word, short (Signed) 16 −32768 to 32767
Word, short (Unsigned) 16 0 to 65535

* Programming languages usually don't use a single bit to represent boolean values.

Number representation - Negative numbers?

We have seen how positive numbers are represented in binary, now let's take a look at how negative numbers are represented.

The processor uses the most significant (leftmost, also called the sign bit) bit of a number to indicate the sign: 0 for positive numbers and 1 for negative numbers. To negate a number, we use the two's complement operation: we 'flip' each bit (0 becomes 1 and 1 becomes 0, this is what we call a logical NOT) and add 1 (See Binary arithmetic for more information).

Let's take the number 5 for example (16-bit):

    0000 0000 0000 0101   # The two's complement (signed) value is  5, the unsigned value is 5
    1111 1111 1111 1011   # The two's complement (signed) value is -5, the unsigned value is 65533
    ^
    The sign bit is set

We can see that the same data can be represented by a positive number and a negative number at the same time, it is up the programmer to decide to interpret it as a signed number or an unsigned number.

Binary arithmetic

Addition

It is quite simple to add two binary numbers together using the carry method:

1 + 0 = 1
0 + 1 = 1
1 + 1 = 0, carry 1

When the value of a bit exceeds 1, the excess needs to be carried out of this bit into the left bit (This is an important concept for the overflow and carry flags later on in the tutorial)

So a 16-bit addition in the CPU looks like this:

                 111    < Carried bits
  0000 0000 0000 0101
+ 0000 0000 0000 0011
  -------------------
                 1000

The first (rightmost) bit's excess is carried into the second bit, the second bit's excess is carried into the third bit and the third bit's excess is carried into the fourth bit (We can also say that there is a carry out of the first bit, the second bit and the third bit)

Subtraction

The computer does not implement a way to execute the subtraction operation, they simply negate one of the operands and do an addition.

For example:

5 - 3 = 5 + (-3)


 11111 1111 1111 1 1    < Carried bits
  0000 0000 0000 0101
+ 1111 1111 1111 1101
  -------------------
  0000000000000000010

Note that if there is a carry out of the leftmost bit, the result couldn't fit in the 16-bit destination.

Multiplication TODO Division and remainder TODO Left/Right bit shift TODO

Booleans and Boolean logic

A 'Boolean' is a data type with only two possible values: true (represented by 1) and false (represented by 0). It is very useful in programming for conditional logic (e.g. IF my glass is empty THEN order another beer) and in electric circuits. Here is a table of the boolean operators used in computers

Operator Names Representation Example
NOT Logical negation ! !0 = 1
AND Logical and & 0 & 1 = 0
OR Logical inclusive or | 0 | 1 = 1
XOR Logical exclusive or ^ 0 ^ 1 = 1

Logical negation (NOT) The logical negation simply inverts (or flips) the given binary value.

Truth table:

A !A
0 1
1 0

Logical and (AND) The result of a logical and is true if both of the operands are true.

Truth table:

A B A & B
0 0 0
0 1 0
1 0 0
1 1 1

Logical inclusive or (OR) The result of a logical or is true if one or both of the operands are true.

Truth table:

A B A | B
0 0 0
0 1 1
1 0 1
1 1 1

Logical exclusive or (XOR) The result of a logical and is true if one of the operands are true, but not both.

Truth table:

A B A ^ B
0 0 0
0 1 1
1 0 1
1 1 0

The same concepts apply for binary numbers:

     0000 1010 1111 1111
AND  1000 1011 0111 0101
     -------------------
     0000 1010 0111 0101

NOT  0011 1111 0000 1010
     -------------------
     1100 0000 1111 0101

Computers and code

Basic computer architecture

Here is a basic diagram of the system design of a modern computer We can see that all there is a lot of I/O (input and output) hardware connected to the CPU, but we don't really have to worry about the exact way to access these from within the CPU because it is taken care of by the operating system (Linux, Windows...). In the real world, programmers use system calls which are understood by the operating system to request access to certain files on the hard drive or to display text on the screen. Only the CPU, memory and registers (in green) are relevant for the game.

The central processing unit (CPU) is the 'brain' of the computer. It is capable of reading and writing data to the memory and the registers.

The memory is a long array of locations accessible by the CPU. The registers are easily accessible locations accessible by the CPU. We will see in detail how the CPU interacts with the memory and the registers in the next sections.

Machine code and compilation

Today, almost all of computer software is written using programming languages such as C/C++ and Java. With the use of a compiler or an interpreter, programmers translate computer code (written in plain text, readable by humans) into machine code that the CPU can understand. More specifically, into a list of instructions.

Here's an example using C:

int main(){

    int a = 1;

    if(a < 12){
        return 1;
    }else{
        return 0;
    }

}

Using the GNU Compiler, the plain text C source file is translated into machine code:

...
4855 e589 45c7 01fc 0000 8300 fc7d 7f0b
b807 0001 0000 05eb 00b8 0000 5d00 00c3
...

Using the objdump command on Linux, the machine code is translated human-readable assembly language:

0000000000000000 <main>:
   0:  55                     push   rbp
   1:  48 89 e5               mov    rbp,rsp
   4:  c7 45 fc 01 00 00 00   mov    DWORD PTR [rbp-4],1
   b:  83 7d fc 0b            cmp    DWORD PTR [rbp-4],12
   f:  7f 07                  jg     18
  11:  b8 01 00 00 00         mov    eax,1
  16:  eb 05                  jmp    1d
  18:  b8 00 00 00 00         mov    eax,0
  1d:  5d                     pop    rbp
  1e:  c3                     ret

The machine code produced by the C compiler can be directly translated to assembly language: 00b8 0000 becomes mov eax, 0 and c3 becomes ret. We will see what these instructions do very soon. The compilation process varies greatly from language to language, but the end result must be understandable by the CPU. The instructions (machine code numbers) that the CPU can understand vary depending on the CPU's instruction set. Most modern consumer processors use the x86-64 (or amd64) instruction set. The game's CPU is based on the Intel 8086 processor (x86-16 instruction set)

So, what does a 32-bit processor mean? A 32-bit processor will handle data in 32-bit chunks, and do its calculations with 32-bit numbers. Intel x86 processors are backward compatible, meaning that a 32-bit processor can handle data and do its calculations with 16-bit and 8-bit numbers (That is why the DWORD PTR directive is required at offset 4 and b in the previous example: we are telling the CPU that we want to deal with 32-bit numbers (double word) for this instruction).

That is not the case with the game's CPU, which is a purely 16-bit processor.

Memory and registers

Random-access memory (RAM) is a piece of hardware capable of storing data and accessing it in no particular order (hence the name). You can think of the memory as a pile of blocks, each block having an address and storage for data. In the game, memory is only accessed in 16-bit chunks so each "block" holds 16 bits of data. Here's how it looks:

ram

Each 16-bit block has an address (left).

Registers are individual storage blocks that are located inside the CPU. They are used in some instructions and can be written to or accessed just like memory, but they are referred by their name (A, B, X...) instead of by an address. On modern CPUs, registers are much faster to access than memory.

regs

Each register holds 16 bits of data

Execution cycle

To execute a program, the CPU start by reading (fetching) the instruction located at the address stored in the instruction pointer (IP), then it will decode the instruction and increment the instruction pointer. Finally, it executes the instruction and the cycle starts over. The instruction pointer is a register that always holds the memory address of the next instruction. Let's pretend that we have a program that is compiled in machine code and copied into memory at the address 0x0400. To execute this program, IP is set to 0x400 (We say that the execution jumps to this address), the CPU fetches the instruction at this address in memory, then it is decoded and executed and the cycle starts over. The program can jump to other addresses depending on certain conditions using a jump instruction.

This is normal if this seems very abstract, but don't worry, there will be practical examples in the next section.

Assembly language

So we have seen how programmers can use programming languages to write computer programs, and we have seen how these programs are handled by the CPU, now let's take a look at how to write assembly code from scratch.

Instruction and operands

Let's start with some an example:

MOV A, 1
MOV B, 0x1337
HWI 9
BRK

You can try to execute it yourself by copying it directly into the game editor and uploading it. The code simply displays the number 0x1337 using the Hologram Projector hardware

The first instruction is a MOV (move) instruction. The syntax for the MOV instruction is MOV destination, source and it copies whatever is specified for the source operand in the destination operand, overwriting what was there before the instruction. The HWI instruction is explained more in detail here: Hardware

The first line can be read as "Move 1 into A"

General syntax for each line of the assembly code:

label:    mnemonic  destination, source    ; comment

You can prefix any line with a name and a colon, this will define a label which is a symbolic name for the memory address of the code or data following it. During the assembly process, each mention of a label is replaced by this address and therefore makes them immediate values. We will see labels in more detail in the next section. Each instruction has a mnemonic to help remember it, the move instruction's mnemonic is MOV. The instruction's mnemonic is followed by one or two operands. There are 3 types of operands in the game: memory operand, register operand and immediate value operand.

Immediate value operand: An immediate value operand is simple a base-10 or base-16 number (with 0x prefix). The following immediate operands are valid: 12, -3, 0x0002, 0x03, label_name, +4 and the following immediate operands are invalid: 12.3, -0x02, abcd

Register operand: A register operand is the name of a register. Only the following registers can be used in instructions: A, B, C, D, X, Y, SP, BP, the register FLAGS, IP can't be modified or accessed directly. Register names and mnemonics are not case-sensitive.

Memory Operand: A memory operand is a memory address in between square brackets. An immediate value or a register can be used to specify the address. The following memory operands are valid: [12], [0xFFFE], [-3], [A], [BP], [label_name]. You can also specify a register with a displacement value: [A + 3], [BP-4], [X + label_name]

Take a moment familiarize yourself with the various instructions available.

Let's quickly observe some more complex instructions before moving to the next section. MOV Transfer a word from the source operand to the destination operand. The source operand remains unchanged and the destination operand is overwritten. ADD The destination operand is overwritten by the sum of the two operands

MOV A, 0 ; A = 0
MOV B, 2 ; B = 2
ADD A, B ; A = A + B = 2

MUL Performs unsigned multiplication.

The A register and the source operand are multiplied and the 32-bit result is stored in Y:A.

MOV A, 6
MUL 4

The result of the multiplication (A * 4 = 24) is stored in Y:A. The most significant (leftmost, or higher) word is stored in the Y register, and the least significant (rightmost, or lower) word is stored in the A register. The Y register is not overwritten if the higher word is zero (if the result of the multiplication is lower than 65536).

Y = 0x???? # The value of Y is unchanged
A = 0x0018 # 6 * 4 = 24

One more example:

MOV A, 0x8001
MUL 0x0002
Y = 0x0001
A = 0x0002 # The complete result is Y:A = 0x00010002

If the result is greater than 65536 and Y is overwritten, the carry flag (CF) and the overflow flag (OF) are set, otherwise they are both reset to 0 (More information on the FLAGS register in the next section).

BRK
Sets the break flag, which stops the execution until the next game tick (see execution cycle).

Conditional execution

As we have seen, the CPU always execute each instruction in order, one after the other. But what if we want to execute the same code over and over again in a loop or execute some code only in particular conditions?

First, let's take a look at the JMP instruction (also called unconditional jump):

        MOV A, 0
        JMP myLabel
        ADD A, 1
myLabel:
        BRK

In the above example code, the CPU will execute the MOV instruction, then the JMP instruction: the JMP instruction overwrites the value of the instruction pointer - the register that points to the next instruction to execute - with the value of the label myLabel which is the address of the code following it. So the instruction pointer will point to the next instruction to execute: the BRK instruction.Therefore, the ADD instruction is never executed. We can say the execution jumps to myLabel. If we were to move the label just above the JMP instruction, the CPU would execute the JMP instruction over and over again in an infinite loop.

Conditional jumps Conditional jumps are instructions that only change the IP if a condition is met, here's a list of conditional jumps:

Mnemonic Summary Jump condition
JGE Jump if greater or equal SF = OF
JG Jump if greater SF = OF & ZF = 0
JLE Jump if less or equal SF != OF | ZF = 1
JL Jump if lower SF != OF
JNS Jump if not sign SF = 0
JS Jump if sign SF = 1
JNZ Jump if not zero ZF = 0
JZ Jump if zero ZF = 1

You can read the FLAGS section on the CPU page to know exactly in which condition the flags are set.

Let's say that we wanted to check if the value of the register A was equal to 5, we could do this:

SUB A, 5
JZ some_label

This above code can be read as "Subtract 5 to A, Jump to some_label if the result of the last instruction was 0". This would work, but whatever was stored in the A register would be modified no matter what, and we might to keep that value for some other calculations. Luckily for us, the CMP (Compare) instruction exists and is the same as the SUB instruction except that it doesn't store the result anywhere: it only modifies the FLAGS register.

Let's say that we want to check if the third bit of the A register was set:

AND A, 0x0004 ; = 0000 0000 0000 0100
JNZ some_label

If the result of the AND instruction different from 0, that means that the third bit of the A register is set. The TEST instruction is the same as the AND instruction except that it does not store the result anywhere. There is almost always a TEST or a CMP instruction preceding a conditional jump.

The stack, procedures and the stack frame

In programming languages, we use procedures or functions to organize the code. Functions are a very important concept to master when reading or writing assembly language. This part can be quite hard to understand at first but it is very rewarding to be able to understand how real-world programs works on the lowest level.

This is the basic structure of a function in most programming languages:

  • The function declaration
    • Declare a return type
    • Declare the parameters of the function
  • Local variable declaration
  • Statements
  • Return statement

And this is how it looks in C:

int doSomeMath(int a, int b){ //Function declaration
    int result; //Local variable declaration

    //Some statements
    result = 0;
    result = result + a;
    result = result - b;
    result = result + 1;
   
    //The return statement
    return result;
}

Let's go through this example line by line. The first line is the function declaration, we are telling the C compiler that the following block of code is our function called doSomeMath, and that it accepts two arguments of int type: a and b. We are also telling the compiler that the return type of the doSomeMath function is int.
In the second line (int result;) we are declaring one local variable of type int called result. The local variables are stored in memory, so the compiler needs to know how much space to reserve for each variable.
The following 4 lines are simple statements that can each be translated into single instructions. The last statement (return result) indicates that we want to return to the callee (The code that called this function) the value inside the result local variable.

Let's see how it translates in assembly language:

; Function doSomeMath - Does some math
; Usage:
; word doSomeMath(word a, word b);
; Modifies: A
doSomeMath:
    push BP             ; Save previous stack frame
    mov BP, SP

    sub SP, 1           ; Reserve space for one 16-bit block of memory
    
    mov [BP-1], 0       ; result = 0
    add [BP-1], [BP+2]  ; result = result + a
    sub [BP-1], [BP+3]  ; result = result - b
    add [BP-1], 1       ; result = result + 1

    mov A, [BP-1]       ; The return value is stored in the A register

    mov SP, BP          ; Undo the reservation of space
    pop BP              ; Restore the previous stack frame
    ret

To understand this example, we must first understand what is the stack. The stack is a data structure with a Last In First Out (LIFO) system: the last inserted item goes out first. We can imagine the stack as a stack of plates in a restaurant: plates are taken from the top of the stack and new plates are placed on the top of the stack. We can't insert or remove data that is under other pieces of data, but we can modify it.

We use the term push when we add a piece of data on top of the stack and pop when we remove a piece of data from the stack.

Visual representation of the stack:

Visual representation of the stack

We can notice that this implementation of the stack is actually upside-down compared to a real-world stack of items: the stack grows downwards in memory, towards lower addresses. The important thing to remember is that the bottom of the stack is at a fixed address and that the top of the stack changes when an item is pushed on the stack or popped out. In the game's CPU, the stack pointer (SP) register always points to the top of the stack and the base pointer always points to the base of the current stack frame.

Clone this wiki locally