An assembly language is essentially the language of a CPU. A CPU only understands a simple set of instructions, such as load some data, store some data, add two numbers and so on. When the CPU reads these instructions they are just binary numbers. This would not be convenient for humans, so assembly uses short words called mnemonics to represent the instructions. The assembler is the program that converts these mnemonics into the actual numerical instructions that the CPU understands. There is essentially a one-to-one correspondence between the mnemonic instructions of assembly and the instructions of the CPU. So, understanding assembly language is really just understanding how the CPU works.
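To make the idea of mnemonics concrete, here is a minimal sketch of a toy assembler in Python. The four opcodes in the table are real single-byte x86 encodings for instructions that take no operands, but everything else (the names `OPCODES` and `assemble`, the one-mnemonic-per-line format) is an assumption made for illustration. A real assembler also has to handle operands, labels and much more.

```python
# Real single-byte x86 opcodes for a few instructions that take no operands.
OPCODES = {
    "nop": 0x90,   # do nothing
    "ret": 0xC3,   # return from a procedure
    "hlt": 0xF4,   # halt the CPU
    "int3": 0xCC,  # breakpoint trap
}


def assemble(source):
    """Translate one mnemonic per line into machine-code bytes (toy version)."""
    machine_code = bytearray()
    for line in source.splitlines():
        # Drop anything after a semicolon (a comment) and surrounding spaces.
        mnemonic = line.split(";")[0].strip().lower()
        if not mnemonic:
            continue  # blank or comment-only line
        machine_code.append(OPCODES[mnemonic])
    return bytes(machine_code)


program = """
nop
nop   ; comments start with a semicolon
ret
"""
print(assemble(program).hex())  # 9090c3
```

Each mnemonic becomes exactly one number here, which is the correspondence described above in its simplest form.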
The architecture of a CPU is what defines the instructions it can understand. The two most common instruction set architectures are ARM and x86. The x86 instruction set architecture was developed by Intel in the 1970s. It went on to dominate the computer market as part of the ubiquitous “IBM compatible” standard. Today, most desktop computers, laptops and servers have x86 CPUs. It was only with the rise of tablets, smart phones and other similar devices that x86 got a real competitor: ARM. ARM is an architecture designed specifically for small, low power devices and so is an ideal architecture for a smartphone. The purpose of this tutorial is to learn x86 assembly, but I hope to include posts about ARM as well.
There are actually quite a few different instructions a typical x86 CPU can understand. These include arithmetic instructions such as add, subtract, multiply and divide, instructions for loading and storing data, instructions for comparing values and instructions that tell the CPU to jump to another instruction address. Jump instructions can be either conditional or unconditional. An unconditional jump always moves to the target address, while a conditional jump moves there only if a certain condition is met. Unlike higher level languages, assembly has no if statements, no loops and no variables. With a little imagination, though, you can construct all of these, and all kinds of fun programs, from the instructions above.
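As a rough sketch of how a loop falls out of a compare and a conditional jump, here is a toy machine written in Python. The instruction names are borrowed from x86 (`mov`, `add`, `dec`, `cmp`, `jne`, `jmp`), but the machine itself, its two registers `a` and `c`, and the `run` function are invented for this example. Jump targets are positions in the instruction list rather than real memory addresses.

```python
def run(program):
    """Execute a list of instruction tuples on a toy machine; return its registers."""
    regs = {"a": 0, "c": 0}
    last_cmp_equal = False  # remembers the outcome of the most recent cmp
    pc = 0                  # position of the next instruction to execute
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "mov":    # mov reg, constant
            regs[args[0]] = args[1]
        elif op == "add":  # add reg, reg
            regs[args[0]] += regs[args[1]]
        elif op == "dec":  # dec reg
            regs[args[0]] -= 1
        elif op == "cmp":  # cmp reg, constant
            last_cmp_equal = regs[args[0]] == args[1]
        elif op == "jne":  # conditional: jump only if the last cmp was "not equal"
            if not last_cmp_equal:
                pc = args[0]
        elif op == "jmp":  # unconditional jump
            pc = args[0]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Sum the numbers 5 + 4 + 3 + 2 + 1 without any loop statement in the program.
program = [
    ("mov", "a", 0),    # 0: a = 0 (running total)
    ("mov", "c", 5),    # 1: c = 5 (counter)
    ("add", "a", "c"),  # 2: a += c        <- start of the loop body
    ("dec", "c"),       # 3: c -= 1
    ("cmp", "c", 0),    # 4: is c equal to 0?
    ("jne", 2),         # 5: if not, jump back to instruction 2
]
print(run(program)["a"])  # 15
```

The "loop" is nothing more than instruction 5 sending the machine back to instruction 2 until the counter reaches zero.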
Loading and storing data to main memory is a slow process. To minimise the number of loads and stores, a CPU has registers. Registers are very fast memory inside the CPU itself. If the CPU were an industrious little craftsman, main memory would be his store room, and the registers would be his work station. In well written code, the CPU will do as few loads and stores as possible. It will instead keep the data it needs in its registers as much as possible.
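The difference is easy to count in a small simulation. The sketch below is only a model: `CountingMemory`, `sum_via_memory` and `sum_via_register` are hypothetical names, and a plain Python variable stands in for a register. Both functions add up the same four numbers, but one keeps the running total in memory and the other keeps it in the "register".

```python
class CountingMemory:
    """A pretend main memory that counts every load and store."""

    def __init__(self, values):
        self.cells = list(values)
        self.loads = 0
        self.stores = 0

    def load(self, addr):
        self.loads += 1
        return self.cells[addr]

    def store(self, addr, value):
        self.stores += 1
        self.cells[addr] = value


def sum_via_memory(mem, n, total_addr):
    """Keep the running total in memory: reload and rewrite it on every step."""
    mem.store(total_addr, 0)
    for i in range(n):
        total = mem.load(total_addr)
        value = mem.load(i)
        mem.store(total_addr, total + value)
    return mem.load(total_addr)


def sum_via_register(mem, n, total_addr):
    """Keep the running total in a 'register' and write it out once at the end."""
    total = 0  # stands in for a CPU register
    for i in range(n):
        total += mem.load(i)
    mem.store(total_addr, total)
    return total


slow = CountingMemory([1, 2, 3, 4, 0])  # cell 4 is reserved for the total
fast = CountingMemory([1, 2, 3, 4, 0])
print(sum_via_memory(slow, 4, 4), slow.loads, slow.stores)    # 10 9 5
print(sum_via_register(fast, 4, 4), fast.loads, fast.stores)  # 10 4 1
```

Both versions compute the same answer, but the register version touches main memory less than half as often.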
The CPU also maintains a set of flags. Many instructions, arithmetic ones in particular, update these flags as a side effect of the operation they perform. For example, if we perform an addition and the result overflows, the overflow flag is set, or if we perform a subtraction and the result is zero, the zero flag is set.
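Here is a small sketch of how flags could be computed for 8-bit addition and subtraction. It models three flags in the way x86 defines them (zero, carry and overflow), but the function names `add8` and `sub8` and the dictionary of flags are assumptions for this example. A real CPU tracks more flags and does all of this in hardware.

```python
def add8(a, b):
    """Add two 8-bit values (0..255); return the 8-bit result and the flags."""
    raw = a + b
    result = raw & 0xFF  # keep only the low 8 bits, as an 8-bit register would
    flags = {
        "zero": result == 0,
        "carry": raw > 0xFF,  # the unsigned result did not fit in 8 bits
        # Signed overflow: both operands share a sign that the result lacks.
        "overflow": ((a ^ result) & (b ^ result) & 0x80) != 0,
    }
    return result, flags


def sub8(a, b):
    """Subtract b from a (both 0..255); return the 8-bit result and the flags."""
    result = (a - b) & 0xFF
    flags = {
        "zero": result == 0,
        "carry": a < b,  # a borrow was needed
        # Signed overflow: operands differ in sign and the result's sign
        # differs from that of a.
        "overflow": ((a ^ b) & (a ^ result) & 0x80) != 0,
    }
    return result, flags


# 127 + 1 does not fit in a signed 8-bit value, so the overflow flag is set.
print(add8(0x7F, 0x01))
# 5 - 5 is zero, so the zero flag is set.
print(sub8(5, 5))
```

Conditional jumps typically test flags like these, which is how a comparison followed by a jump ends up behaving like an if statement.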
This is basically all we need to know before we start writing assembly!