At a recent lunch troika of MSJ columnists (Paul DiLascia, John Robbins, and me), we were commenting on how few of today's programmers are skilled in what was essential knowledge just a few years ago. For instance, we all agreed that many programmers lack even a basic understanding of assembly language. In the idealized world presented by most language vendors, coding is so easy that there are no bugs to speak of. And if there ever was a bug, you'd certainly be able to find it easily. No need to resort to messy instruction-by-instruction code slogging, no sir.

Contrast that utopian vision with your own experience. How many times have you been in your debugger stepping through somebody else's code in assembly language because there's no source available? This is especially annoying when some third-party component blows up and you're assigned to track down the problem. Even when debugging your own code, knowing a little assembly language can help you figure out why your high-level language code isn't working the way you think it should. Just put the debugger into mixed source/assembly mode and observe how the compiler translated your code into machine instructions.

Paul DiLascia observed that there's a big difference between programming in assembler and knowing just enough to get by in a pinch while debugging. He jokingly suggested an "assembly language survival guide" that would cover just enough to debug the most common situations. Sounds like a darn good idea to me, so this column presents "Matt's Just Enough Assembly Language to Get By." Think of it as a cram course in Intel x86 assembly language, with all of the esoteric stuff omitted. Afterward, I'll show the assembler code for a typical procedure, and show how its operations can be inferred from the instructions I've covered.

Before jumping into the various instructions and instruction sequences, let me add a couple of prefaces and warnings. First, I'm going to describe only 32-bit Intel code. If you're still stuck programming in 16-bit land, my sympathies. Second, different compilers from different vendors generate different code. However, what I describe here should apply to all compilers (including Visual Basic® 5.0 when generating native code). Third, don't be surprised if you encounter instructions and instruction sequences that aren't mentioned below. Most compilers use only a small fraction of the instruction set available to them (at least on the Intel platform). But many compilers support inlining of raw assembly language. This allows assembly language gurus to use CPU instructions that the compiler isn't aware of. Inline assembly may be used to optimize a particular sequence, or it may be used to get at CPU-specific instructions such as the timers available on Pentium-class CPUs. In addition to inline assembly code, don't forget that programmers sometimes write entire source modules in assembly language (hard to believe, isn't it?).

Just as most 32-bit compilers use only a small fraction of the available instructions, they also use only a subset of the registers of the CPU. Since so much of what I'll describe depends on the registers, a quick review of the commonly used Intel x86 register set is in order. In Figure 1, all registers are 32 bits except where noted. "Multipurpose" means the register can hold any arbitrary 32-bit value (for example, literal values, addresses, and bit flags).

In addition to being familiar with the registers, it's essential to understand how instruction arguments are used. With the exception of a few obscure cases, all instructions take zero, one, or two arguments. Instructions that take zero or one arguments don't require explanation. For instructions that take two arguments, the first argument is usually the destination, while the second is the source. For example, the "ADD EAX,ESI" instruction adds the contents of ESI (the source) to EAX. The result is stored in EAX (the destination). Put another way, the first argument is the one that's modified as a result of the instruction.

A basic knowledge of how instructions reference memory is also vital. Some instructions implicitly reference memory. For example, PUSH EAX pushes the current value of the EAX register onto the stack. Where's the stack? It's whatever the ESP register is currently pointing to. Likewise, string instructions such as MOVSB and SCASB require that the ESI and/or EDI registers contain the address of the memory location you want to use.

Other instructions use arguments to explicitly state the address to be used. You can usually tell this by the presence of square brackets in the instruction. For example, "MOV EBX,[00401234]" reads from the address 0x00401234. Another form of addressing uses registers and possibly offsets. For example, in "MOV EBX,[ECX]", the ECX register contains an address (also known as a pointer by C++ users). The instruction "MOV EBX,[EBP+8]" reads from the address calculated by adding 8 to the contents of the EBP register.

Intel CPUs have a very formal definition for allowable forms of instruction addresses. It's complex enough to make most people's heads swim. If you know what a ModR/M byte is, or know how SIB addressing works, then you already know more than this column can teach you. In the "Just Enough to Get By" guide, the preceding paragraph should be enough.

With the theory part over with, let's now look at the most common instructions and instruction sequences. I've grouped them into several categories rather than sorting them alphabetically. As you'll see, some instructions are used in multiple categories.
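For readers who think in C++ rather than in registers, the bracketed addressing forms described above can be pictured with a small sketch. Everything here is an assumption made for illustration: Memory is a hypothetical flat byte array standing in for the process address space, and the three load_ helpers are invented names, not anything a compiler or debugger provides. The point is only that the three forms differ in how the address is computed, not in what the read does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat memory model, for illustration only.
struct Memory {
    std::vector<std::uint8_t> bytes;

    explicit Memory(std::size_t size) : bytes(size, 0) {}

    // Read the little-endian DWORD at 'address', as a bracketed operand would.
    std::uint32_t read32(std::uint32_t address) const {
        assert(static_cast<std::size_t>(address) + 4 <= bytes.size());
        std::uint32_t value = 0;
        for (int i = 0; i < 4; ++i)
            value |= static_cast<std::uint32_t>(bytes[address + i]) << (8 * i);
        return value;
    }

    // Store a little-endian DWORD at 'address'.
    void write32(std::uint32_t address, std::uint32_t value) {
        assert(static_cast<std::size_t>(address) + 4 <= bytes.size());
        for (int i = 0; i < 4; ++i)
            bytes[address + i] = static_cast<std::uint8_t>(value >> (8 * i));
    }
};

// MOV EBX,[00000100]  -> the address is a constant inside the instruction.
inline std::uint32_t load_direct(const Memory& m, std::uint32_t address) {
    return m.read32(address);
}

// MOV EBX,[ECX]       -> the address is whatever ECX currently holds.
inline std::uint32_t load_register(const Memory& m, std::uint32_t ecx) {
    return m.read32(ecx);
}

// MOV EBX,[EBP+8]     -> the address is EBP plus a signed displacement.
inline std::uint32_t load_based(const Memory& m, std::uint32_t ebp,
                                std::int32_t disp) {
    return m.read32(ebp + static_cast<std::uint32_t>(disp));
}
```

With a DWORD written at address 0x100, load_direct(m, 0x100), load_register(m, 0x100), and load_based(m, 0xF8, 8) all return the same value, because all three resolve to the same address.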
Procedure Entry and Exit
These instructions are automatically inserted by the compiler to create a standard method for accessing parameters and local variables. This method is called a stack frame, as in "frame of reference." In fact, the Intel CPU dedicates the EBP register to maintaining a stack frame. For this group of instructions, it's especially important to note that not every procedure will use exactly the same sequence, and that certain things may be omitted entirely.

Sequence: PUSH EBP / MOV EBP,ESP / SUB ESP,XX
Purpose: Sets up the EBP stack frame for a new procedure
Examples:
    PUSH EBP
    MOV EBP, ESP
    SUB ESP, 24
Description: "PUSH EBP" saves the previous frame pointer on the stack. "MOV EBP,ESP" sets the EBP register to the same value as the stack pointer (ESP). "SUB ESP,XX" creates space for local variables below the EBP frame. In optimized code, you may see this sequence interspersed with other instructions (for example, "PUSH ESI"). Since "PUSH EBP" and "MOV EBP,ESP" both use the EBP register, a processor with multiple pipelines would ordinarily need to stall one of the pipelines. By interspersing other instructions that don't use the EBP register, the processor can do more work in the same amount of time.

Instruction: ENTER
Purpose: Sets up the EBP stack frame for a new procedure
Examples:
    ENTER 8, 0    ; Sets up stack frame with 8 bytes of local variables
Description: The ENTER instruction first became available on the 80186 and 80286 processors. It was intended to replace the "PUSH EBP / MOV EBP,ESP / SUB ESP,XX" sequence with a single, smaller instruction. On current processors the ENTER instruction is slower than the three-instruction sequence, so ENTER is rarely used.

Sequence: MOV ESP,EBP / POP EBP
Purpose: Removes the EBP stack frame before leaving a procedure
Description: The "MOV ESP,EBP" instruction bumps up the stack pointer past any space allocated for local variables on the stack. "POP EBP" restores the stack frame pointer to point at the previous EBP frame. This sequence is normally followed by a return instruction to return control to the calling procedure.

Instruction: LEAVE
Purpose: Removes the EBP stack frame before leaving
Description: The LEAVE instruction is the inverse of the ENTER instruction. It can also be used to remove a frame set up by the "PUSH EBP / MOV EBP,ESP" sequence. The LEAVE instruction is only 1 byte long, which is smaller than the "MOV ESP,EBP / POP EBP" sequence. Unlike the ENTER instruction, there's no performance penalty for using it, so some compilers use LEAVE.

Instruction: PUSH register
Purpose: Saves the previous values of register variables
Examples:
    PUSH EBX
    PUSH ESI
    PUSH EDI
Description: Sometimes compilers use a general-purpose register to hold the value of parameters or local variables. This can be more efficient than storing the same value in memory. These are commonly known as register variables. The EBX, ESI, and EDI registers are most often used as register variables. The convention most compilers use is that register variable values are preserved across procedure calls. If the compiler decides to use register variables in a procedure, it is responsible for preserving the value of the registers that it alters (typically EBX, ESI, and EDI). Compilers usually preserve these register values on the stack as part of setting up the procedure's stack frame. If the compiler uses only one or two of the aforementioned registers, it needs to preserve only those registers.

Instruction: POP register
Purpose: Restores the previous values of register variables
Examples:
    POP EDI
    POP ESI
    POP EBX
Description: In preparing to return from a procedure, the register variable registers need to be restored to their previous values. These instructions remove a value from the stack and place it into the designated register. Note that the POPs appear in the reverse order of the corresponding PUSHes.
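To make the entry and exit sequences concrete, here is a minimal C++ sketch that mimics what they do to ESP and EBP. It is a toy model, not how any debugger or compiler works: MiniCpu, prologue, and epilogue are hypothetical names, the stack is a small array of DWORDs, and the "addresses" are just byte offsets into that array.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical miniature machine: a small stack plus the two frame registers.
struct MiniCpu {
    std::vector<std::uint32_t> stack;  // stack[i] models the DWORD at byte address i * 4
    std::uint32_t esp;                 // stack pointer (byte address)
    std::uint32_t ebp;                 // frame pointer (byte address)

    explicit MiniCpu(std::uint32_t dwords)
        : stack(dwords, 0), esp(dwords * 4), ebp(0) {}

    // Access the DWORD at a byte address, as [address] would.
    std::uint32_t& at(std::uint32_t address) {
        assert(address % 4 == 0 && address / 4 < stack.size());
        return stack[address / 4];
    }

    // PUSH value: the stack grows toward lower addresses.
    void push(std::uint32_t value) {
        esp -= 4;
        at(esp) = value;
    }

    // POP: read the DWORD at [ESP], then move ESP back up.
    std::uint32_t pop() {
        std::uint32_t value = at(esp);
        esp += 4;
        return value;
    }

    // PUSH EBP / MOV EBP,ESP / SUB ESP,localBytes
    void prologue(std::uint32_t localBytes) {
        push(ebp);
        ebp = esp;
        esp -= localBytes;
    }

    // MOV ESP,EBP / POP EBP (the same effect as LEAVE)
    void epilogue() {
        esp = ebp;
        ebp = pop();
    }
};
```

If a caller pushes one parameter and then a return address (as CALL would) before prologue(8) runs, the saved EBP ends up at [EBP], the return address at [EBP+4], the parameter at [EBP+8], and the locals at [EBP-4] and [EBP-8]. After epilogue(), ESP and EBP are back to the values they had on entry.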
Accessing Variables
The Intel CPU has many instructions that work with variables, which are just locations in memory. For example, you can add to or subtract from a variable representing a counter. Likewise, a variable may contain a pointer to something. There are just too many instructions to describe here, and in most cases the instruction name gives a good clue about what the instruction is doing. However, I will show how variables of different storage classes appear in assembly language.

Instruction: instruction [global]
Purpose: Global/static variables
Examples:
    MOV EAX,[00401234]
    MOV [00401238],ESI
    PUSH [77852432]
    ADD [00620428],00001000
Description: When you see an instruction that includes an actual machine address inside the square brackets, it's accessing memory that was declared as either a global or static variable. These addresses are known at program load time, so the instruction contains the actual memory address to read or write.

Instruction: instruction [parameter]
Purpose: Procedure parameters and this pointers
Examples:
    MOV ESI,[EBP+14]
    MOV [ESP+30],EAX
    ADD [EBP+0C],2
    OR [ESP+20],00000010
Description: Parameters to procedures are usually passed on the thread's stack. Since these values are pushed before the procedure call and before the called procedure sets up its stack frame, the parameters appear at positive offsets from the stack frame base pointer (EBP). Just about any instruction that makes reference to memory above EBP (for example, "[EBP+8]") is making use of a procedure parameter. The advantage of using EBP for accessing parameters is that EBP doesn't change throughout the lifetime of a procedure. This makes it easier to keep track of the procedure's parameters.

Prior to the 80386, the only effective way to access parameters was with the base pointer register. The 386 added the ability to access memory just as easily with displacements from the stack pointer (ESP) register. Thus, optimized code can dispense with setting up an EBP frame and still reference parameters by using positive offsets from ESP. For example, "ADD [ESP+20],4" adds four to whatever DWORD is at [ESP+20]. From a debugging standpoint, using ESP to access parameters is inconvenient. Since ESP can change during a procedure, a given parameter may be at different offsets from ESP at different points in a procedure's code.

One last word on parameters. In C++, the this pointer of a member function is really a hidden parameter. Usually the this pointer is the last parameter pushed on the stack before the call. In Visual Basic, the self-referential Me is the same thing as the C++ this pointer.

Instruction: instruction [local]
Purpose: Local variables
Examples:
    MOV ESI,[EBP-14]
    MOV [EBP-30],EAX
    SUB [ESP],2
    AND [ESP+4],00000010
Description: From the vantage point of an assembly instruction, local variables aren't much different from parameters when an EBP frame is used. The only distinction is that local variables are at negative offsets from the EBP stack frame. You can get an idea of how big the sandbox for local variables will be by examining the "SUB ESP,XX" instruction near the beginning of the procedure.

Things do get messy when the compiler decides to omit an EBP frame. When this happens, the compiler addresses both local variables and parameters as positive offsets from the ESP register. There's no good way to tell a local apart from a parameter in this situation except to find out how much space the procedure has allocated for locals (see above). If the offset is less than the space allocated, it's a local. Otherwise, it's probably a parameter.

Instruction: LEA variable
Purpose: Load Effective Address
Examples:
    LEA EAX,[ESP+14]
    LEA EDX,[EBP-24]
Description: Despite the square brackets, LEA doesn't actually read memory or dereference a pointer. Instead, it loads the first operand with the address specified by the second operand. For example, "LEA EAX,[ESP+14]" takes the current value of the ESP register, adds 14 to it, and puts the result in EAX. LEA's primary use is to obtain the address of local variables and parameters. For example, in C++, if you use the & operator on a local variable or parameter, the compiler will likely generate an LEA instruction. As another example, "LEA EAX,[EBP-8]" loads EAX with the address of the local variable at EBP-8.

A less obvious use of LEA is as a fast multiplication. For example, multiplying a value by 5 is relatively expensive. Using "LEA EAX,[EAX*4+EAX]" turns out to be faster than the MUL instruction. The LEA instruction uses hardwired address generation tables that make multiplying by a select set of numbers very fast (for example, multiplying by 3, 5, and 9). Twisted, but true.
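The arithmetic behind both uses of LEA can be sketched in a few lines of C++. The lea helper below is a hypothetical stand-in that models only the address calculation (base + index*scale + displacement) with 32-bit wraparound; the times3, times5, times9, and address_of_local names are likewise invented for this illustration. The restriction to scales of 1, 2, 4, and 8 reflects what the x86 addressing forms can encode.

```cpp
#include <cassert>
#include <cstdint>

// Models "LEA dest,[base + index*scale + disp]".
// No memory is read; the result is just the computed number.
inline std::uint32_t lea(std::uint32_t base, std::uint32_t index,
                         std::uint32_t scale, std::uint32_t disp) {
    // x86 addressing only allows these four scale factors.
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale + disp;
}

// LEA EAX,[EAX*2+EAX]  -> EAX * 3
inline std::uint32_t times3(std::uint32_t eax) { return lea(eax, eax, 2, 0); }

// LEA EAX,[EAX*4+EAX]  -> EAX * 5
inline std::uint32_t times5(std::uint32_t eax) { return lea(eax, eax, 4, 0); }

// LEA EAX,[EAX*8+EAX]  -> EAX * 9
inline std::uint32_t times9(std::uint32_t eax) { return lea(eax, eax, 8, 0); }

// LEA EAX,[EBP-n]      -> address of the local that sits n bytes below EBP.
inline std::uint32_t address_of_local(std::uint32_t ebp,
                                      std::uint32_t bytesBelowFrame) {
    // Subtracting is the same as adding the two's-complement displacement.
    return lea(ebp, 0, 1, 0u - bytesBelowFrame);
}
```

This also shows why 3, 5, and 9 are the "select set" of multipliers: they are exactly the values of 1 + scale for the scales the addressing hardware supports beyond 1.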
Calling Procedures
Instruction: CALL location
Purpose: Transfer control to another procedure
Examples:
    CALL 00682568
    CALL [00401234]
    CALL ESI
    CALL [EAX+24]
Description: The CALL instruction doesn't need much explanation in itself. It pushes the address of the instruction following it onto the stack, then transfers control to the address given by the argument. The various ways of specifying a target address are worth mentioning, however.

The simplest form of the CALL instruction is when the argument contains the destination address as an immediate value (for example, "CALL 00682568"). This type of call is almost always to another location within the same module (EXE or DLL). Slightly more complicated is when the CALL instruction indirects through an address (for example, "CALL [00401234]"). You'll see this form of CALL instruction when calling a function imported from another module. It's also seen when calling through a function pointer stored in a global variable.

Two other forms of CALL instruction use registers as part of their address. If just a register name is specified (for example, "CALL ESI"), the CPU transfers to whatever address is in the register. If a register is used within brackets, perhaps with an additional displacement ("CALL [EAX+24]"), the instruction is calling through a table of function addresses. Where would these come from? You may know these tables by the more familiar name of vtables. To find which member function is being called, divide the displacement by the size of a DWORD. Reading the 24 in the preceding example as a decimal value, 24 divided by 4 is 6, so the call goes through the table entry at index 6 (the seventh entry, since the first entry is at offset 0). Keep in mind that most disassemblers display displacements in hexadecimal, in which case 24 means 36 decimal and the index is 9.

Instruction: PUSH value
Purpose: Places a parameter onto the stack in preparation for calling a procedure
Examples:
    PUSH [00405234]    ; Push a global variable
    PUSH [EBP+C]       ; Push a parameter
    PUSH [EBP-14]      ; Push a local variable
    PUSH EAX           ; Push whatever is in EAX
    PUSH 12345678      ; Push an immediate value
Description: When it comes to passing parameters, all variations of the PUSH instruction are used by the compiler. Global variables, local variables, parameters, the results of a calculation, and immediate values can all be passed with a single instruction. When you see a sequence of PUSH instructions prior to a CALL instruction, the odds are good that the PUSHes are putting the parameters onto the stack. As mentioned earlier, if a member function or method is being called, the this or Me pointer is usually passed last. In some cases, the this pointer is passed in the ECX register instead. You can identify when this occurs by looking for code that initializes the ECX register and then does nothing with it before the CALL instruction.

Instruction: RET
Purpose: Return from a procedure call
Examples:
    RET
    RET 8
Description: The RET instruction returns from a procedure call. It simply pops whatever value is currently at [ESP] into the EIP (instruction pointer) register. The "RET XX" form does the same thing, and then adds XX to the ESP value. This is how __stdcall procedures clear parameters off the stack before returning to their caller. (Most Win32® APIs are __stdcall based.) By dividing the number of cleared bytes by four (the size of a DWORD), you can usually figure out how many parameters a procedure takes. For instance, a procedure that returns with a "RET 8" instruction takes two parameters.

Functions that return an integer or pointer value usually return the value in the EAX register. By examining what's in EAX before executing the RET instruction, you can see the function's return value.

Instruction: ADD ESP, value
Purpose: Removes parameters from the stack
Examples:
    ADD ESP,24
Description: When calling procedures that don't remove parameters before returning, it's up to the calling function to remove its parameters. This is the case with cdecl functions, which is the default for C and C++ code. The "ADD ESP,XX" instruction bumps up the stack pointer so that any passed parameters are below the resulting ESP. If the function doesn't take a variable number of parameters, the "ADD ESP,XX" instruction gives insight into how many parameters the called procedure accepts. (See the description above for "RET XX".) If the called procedure takes a variable number of parameters (like printf and wsprintf do), the "ADD ESP,XX" instruction tells you how many parameters were passed for that particular CALL.
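The difference between callee cleanup ("RET XX") and caller cleanup ("ADD ESP,XX") is easy to lose track of, so here is a rough C++ sketch that tracks only ESP through each style of call. MiniStack and dwordParamCount are hypothetical names made up for this example, the stack is a small DWORD array, and the sketch assumes every parameter occupies exactly one DWORD, which is usually but not always true.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy stack that only tracks ESP and the pushed DWORDs.
struct MiniStack {
    std::vector<std::uint32_t> slots;  // slots[i] models the DWORD at byte address i * 4
    std::uint32_t esp;                 // stack pointer (byte address)

    explicit MiniStack(std::uint32_t dwords)
        : slots(dwords, 0), esp(dwords * 4) {}

    // PUSH value (also what CALL does with the return address).
    void push(std::uint32_t value) {
        assert(esp >= 4);
        esp -= 4;
        slots[esp / 4] = value;
    }

    // RET n: pop the return address into EIP, then add n to ESP.
    // A plain RET is the n == 0 case.
    std::uint32_t ret(std::uint32_t bytesToClear = 0) {
        assert(esp / 4 < slots.size());
        std::uint32_t eip = slots[esp / 4];
        esp += 4 + bytesToClear;
        return eip;
    }

    // ADD ESP,n: the caller discards the parameters it pushed.
    void addEsp(std::uint32_t bytes) { esp += bytes; }
};

// Bytes cleared divided by the size of a DWORD, assuming one DWORD per parameter.
inline std::uint32_t dwordParamCount(std::uint32_t bytesCleared) {
    assert(bytesCleared % 4 == 0);
    return bytesCleared / 4;
}
```

Pushing two parameters and a return address and then calling ret(8) leaves ESP where it was before the first PUSH, which is the __stdcall pattern. Doing the same with a plain ret() leaves ESP eight bytes low until the caller runs addEsp(8), which is the cdecl pattern. In both cases dwordParamCount(8) gives two parameters.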
Flow Control
In the context of this column, flow control means code that affects which portions of a program's code are subsequently executed. At the simplest level, this means conditional execution (colloquially known as if statements). More complex flow control sequences such as while loops and for statements are usually built from the lower-level if statement constructs. In one case though (the LOOP instruction), the processor has built-in knowledge of these higher-level language constructs.

Before I get to these instruction sequences, let me highlight two things that can easily trip you up. For starters, the term "Jcc" is used as a stand-in for any of the 16 conditional jump instructions. The cc means condition code. More insidiously, there are several sets of Jcc instructions that are aliases for one another. For example, JZ (Jump if Zero flag set) is the same instruction as JE (Jump if Equal). Likewise, JNZ (Jump if Zero flag NOT set) is the same instruction as JNE (Jump if Not Equal). Unfortunately, some disassemblers use the JZ/JNZ form, while others use the JE/JNE form. Is this confusing? Yes! The moral of the story: be prepared to mentally substitute an aliased form of the instruction if it makes the code easier to understand.

Sequence: CMP value, value / Jcc location
Purpose: Compare two values, and branch accordingly
Examples:
    CMP EAX,2
    JE 10036728
    CMP [EBP+20],1000
    JNE 00427824
Description: The CMP instruction is used when two values are to be compared. The CMP instruction sets or clears a variety of flags, including the Zero, Sign, and Overflow flags. From this, a variety of Jcc instructions can then be used to branch accordingly. Most often, the JE and JNE instructions follow a CMP instruction. The following C++ code sequence would be implemented with a CMP / JNE sequence: