Programmable devices have long fascinated humans, even before the advent of computers. As long as two centuries ago we had music boxes, tiny mechanisms that produced music encoded as pins on a cylinder. It wasn’t long before it was possible to switch out cylinders and have the same mechanism play new melodies. These aren’t even the oldest programmable devices. Bits may have replaced pins, hard drives may have replaced cylinders, and transistors may have replaced cogs and gears, but the principle of programmable machines remains the same. Essentially it was a device created once, but capable of following instructions that were possibly not even conceived when it was created.
A music box with interchangeable cylinders for different melodies.
You may wonder what this drivel has to do with Assembly, so here goes: Programming Assembly is like sticking individual pins on the cylinder that will later turn the gears that produce music. Before there was assembly, computers needed to be programmed by speaking their language directly with individual 0s and 1s. For instance the sequence 10110111 00001011 might instruct the computer to store the value 11 in a particular part of its memory. For instance, in the above example, the first 4 bits of the instruction (1011) might stand for “move value”, the second set of 4 bits (0111) might be the specific location in memory where the the value is to be moved, and finally the last 8 bits (00001011) are just the binary encoding of the decimal number 11. The first part of the instruction is called the opcode, or operation code. It’s like telling the computer which note to play. What Assembly does is simply give this machine-speak a human face by translating what the computer is doing into instructions that are somewhat more understandable.
An Assembly representation of this instruction might be:
Here ‘ax’ is the name of a memory register in your processor. In a higher-level programming language you might write this as:
var ax = 11
At some level though, the above assembly instruction is the kind of thing that the CPU is finally executing.If you want to, you can take a piece of assembly code, and look up its equivalent binary machine instruction (although usually represented in hex) and you will see exactly what the computer sees when that instruction is run. Of course, the examples we have given above are mostly made up. Those aren’t the exact opcodes and binary representations for these instructions. In fact these instructions aren’t universal, and neither are their opcodes. There is no single Assembly language syntax, since Assembly is closely tied to the architecture of the machine on which it is intended to be run. As we mentioned before, it is essentially machine language put into English letters and symbols.
A knowledge of Assembly therefore requires a knowledge of the machine on which it will run, unlike C or Python code which just needs to be compiled on different platforms to get it running there. This is part of the reason why higher-level languages have almost entirely replaced Assembly code. Assembly language is daunting and there is no getting around that fact.
No matter which programming language you use, the end result is machine code, or assembly language. Knowing something about the end result of your code is of obvious benefit. Think of the difference between knowing how to drive a car, and knowing how an engine, and other components of the car work. It’s possible to know how to drive a car without knowing a thing about spark plugs and brake oil, but understanding how a car works puts you in a better position, especially when the car breaks down. In our case, while debugging software you can now peek at the instructions behind your instructions when you learn assembly. Given how low-level assembly language is, it is also possible to improve performance when coding directly in assembly. However with the complexity of software today it’s infeasible to develop in assembly, especially since such software would need to be rewritten for every platform it is intended to run on. What is commonly done is that, performance-sensitive portions of higher-level code is written in assembly. These are usually sections of the code that are called thousands or even millions of times.
For example, a game might need to constantly evaluate the distance between two points. This piece of code might be called thousands of times in rendering a single frame. It makes sense to write this in assembly and gain a huge boost in performance as a result. This code will need to be rewritten for each platform though. Speaking of platforms and their differences, there are two major schools of CPU design that have a large impact on their assembly language as well. These are Reduced Instruction Set Computing (RISC) and Complex Instruction Set Computing (CISC). The RISC design strategy is to have fewer possible instructions, but have them execute faster. With CISC the aim is to have fewer instructions that are more powerful. For example, a CISC processor might have an instruction to add data in one memory location to data in another memory location. A RISC processor on the other hand would require multiple steps, first instructing the CPU to load data from both memory locations onto registers one by one, and then instruction to add them, and finally an instruction to place this data back in memory. It might seem like CISC is the better bet here, as it seems faster doesn’t it?
You can still buy an 8085 microprocessor training kit that lets you program the CPU by sending it single instructions.
But the fact is, the CISC CPU still has to do all the things the RISC CPU would do. The fact that it exposes them as a single instruction doesn’t change the amount of work behind that instruction. The CISC add operation might end up taking as much time, or even more than the equivalent RISC instructions. The most famous examples of a CISC CPUs are those by Intel and AMD, x86 in other words. RISC is well represented by ARM, MIPS, AVR, PowerPC and most other CPU designs. It is more an accident of history that most desktop and laptop computers use Intel / AMD CISC CPUs while most mobile platforms use ARM or MIPS RISC CPUs. In fact a number of TOP500 supercomputers in the world are of RISC design. The PlayStation 3s Cell processor too was a RISC processor. Over time as the x86 RISC processors have advanced, there has been an increase in the number of instructions, and their capabilities. Commonly performed sequences of instructions have been converted into optimised single instructions. Think of buzz words like MMX, 3DNow!, SSE, SSE2 etc. They are all extensions to the instruction set of x86 CPUs. Many of these are powerful instructions that can efficiently perform repeated operations on larger data sets.
Knowledge of assembly also means you can take advantage new developments before compiler designers have taken them into account, and optimised code output to use them. In fact some special CPU instructions are only available in Assembly! A programmer well versed in assembly is also one with good knowledge of the underlying hardware, and this pays off even when using higher level languages. The way hardware is designed means that often minor changes in code (such as switching the inner and outer loops in a nested loop) can have a huge impact on the speed. Why? How? When? These are the question that can only be answered by learning more about machine architectures. Despite the proliferation of higher-level languages that hide all the complexity we just talked about, there is still great value in Assembly language, and knowing it can be very beneficial even if you don’t ever need to code in it.
The IBM Sequoia is the 3rd ranking supercomputer in the TOP500 list, and is made up of over 1.5 million RISC PoperPC A2 processor cores
fmtStr: db ‘hello, world’,0xA,0
Allocate space on the stack for one 4 byte parameter
; Arg1: pointer to format string
(const char *format, ...);
Pop stack once
Above is the code for a Hello World program written for NASM, and targeting Linux.
Writing something as simple as a Hello World program can be quite a challenge in assembly. The above code cheats slightly by simply calling the C printf function for the bit that actually prints to screen. The rest of the code simply prepares to call printf and pass on the message ‘hello, world’. The ‘section .data’ segment of code we see above instructs the assembler to store ‘hello, world’ in memory and put its location in ‘fmtStr’. After that the sub (subtract) statement, the lea (load effective address) statement, and the mov (move) statements put data and pointers in the right place so the printf function knows where to find them.
This is by no means the only assembly version of Hello World, not even for NASM.
%define a [ebp + 8]
%define n [ebp + 12]
enter 0, 0
mov ecx, 1
mov ebx, ecx
imul ebx, 4
add ebx, a
mov ebx, [ebx]
mov edx, ecx
cmp edx, 0
mov eax, edx
imul eax, 4
add eax, a
cmp ebx, [eax]
mov esi, [eax]
mov dword [eax + 4], esi
mov [eax], ebx
cmp ecx, n
The above code is insertion sort implemented in the NASM dialect of assembly. The same code implemented for a different architecture, such as ARM or MIPS would look completely different.
Tools and Learning Resources
As we have made it abundantly clear, assembly language is highly influenced by the target platform on which it will run. So the best way to get information about assembly language straight from the source, the CPU designers themselves. Whatever hardware platform you are trying to target, just search for the CPU architecture followed by ‘instruction set manual’, and you’re likely to find detailed PDF manuals with everything you’d need to know. For example, search for ‘ARM instruction set manual’ or ‘Intel instruction set manual’. While these manuals are great, they are mostly targeted towards people who already have some knowledge of assembly. Getting into Assembly language is tough, and require a different approach than other languages. Firstly you will need to learn about machine architectures, and then the actual code. It can be hard to figure all this out on your own. Luckily there are many websites available nowadays that offer college course lecture videos, projects, and assignments online and for free. One such great course is titled “The Hardware/Software Interface” and is available on Coursera.org.
At 1500 pages Intel’s Software Developer’s Manuals are the deepest look you can get at their processor architecture and instruction set.
Another great resource of free full courses on Assembly at available here. They have courses on x86, x86-64, and ARM at an introductory, intermediate and advanced level. Instead of jumping straight into Core i7, you should start by looking into programming the 8-bit Intel 8085 processor. It might be nearly 40 years old, but it has a design that you can conceivably hold in your head all at once. From there you can build up to 16, 32, and finally today’s 64bit processors. Once you know a bit of Assembly language, you can start playing around with your own code. For this you will need an Assembler, a piece of software that converts assembly code to machine code. A popular assembler is the Netwise Assembler (NASM) that can run on Windows, OSX or Linux.
The popular Gnu Compiler Collection (GCC) includes support for assembly language, and even even compile higher-level code to Assembly, so you can see what your code looks like in assembly. Visual Studio too ships with an assembler called MASM (Microsoft Assembler). Another way to interact with Assembly code is by disassembling existing code! Decompiling code into its source code is hard to impossible, but getting assembly code for compiled software is easy. On POSIX platforms like OSX / Linux / Unices you can simply type objdump -d /path/to/compiled/binary to get the Assembly representation of compiled code. Finally, if you want to play with running code and see as it executes in Assembly, you have gdb possibly the most powerful debugging tool in existence. To end we’d like to mention that Assembly is a good place to start at a conceptual level, it should be the last place you come to to actually get your hands dirty with code. Come back to this article again once you get familiar with higher level languages that follow.