Assembly Language
- Machine Code
- When a C program is compiled, eventually it is translated into machine code
- Currently x86-64 machine code, named for historical reasons
- Originally developed by Advanced Micro Devices (AMD) and named AMD64
- x86-64 machine code has evolved from 16-bit processors to the current 64-bit ones; backward compatibility has been maintained
- Assembly Language
- In essence, a (somewhat) more readable version of machine code
- Translation to machine code is almost one-to-one (i.e., each assembly language instruction often translates to one machine code instruction)
- Instructions' names are used; no need to memorize the 8-bit encodings
- Likewise for some operands (although hexadecimal is used to indicate addresses)
- x64 Assembly
- x86-64 Architecture
- What your program sees
- Only 2 types of memory: registers and main memory
- Program counter register (called %rip in assembly language) keeps track of execution location in main memory (a 64-bit address)
- Instruction register contains the current instruction (1-15 bytes on x86-64)
- Machine code instructions are composed of an operation code (opcode) and its operands.
- The first 1-3 bytes encode the instruction (the opcode), sometimes preceded by prefix bytes
- Instructions are very low-level; e.g., 0x89 is the opcode for an instruction that copies data from one location to another (the 0x48 byte that often precedes it is a prefix selecting 64-bit operands)
- Each opcode has a fixed number of operands
- Each operand specifies a register or a (virtual) memory address
- Registers
- General-purpose integer register names are left over from when particular registers had specialized purposes
- On 32-bit machines, there were only 8
- Each register can hold an integer or an address (8 bytes = 64 bits)
- Example instruction:
    Assembly            Machine code (hex)
    movq %rsp, %rbp     48 89 e5
- Register Usage
- Example of assembly language and machine code
    int main() {
        printf("Hello\n");
        printf("Goodbye\n");
    }

    memory address   machine code at      corresponding instruction
    (hex)            that address (hex)   in assembly language
    4004d0:          48 83 ec 08          sub    $0x8,%rsp
    4004d4:          bf e8 05 40 00       mov    $0x4005e8,%edi
    4004d9:          e8 da fe ff ff       callq  4003b8 <puts@plt>
    4004de:          bf ee 05 40 00       mov    $0x4005ee,%edi
    4004e3:          48 83 c4 08          add    $0x8,%rsp
    4004e7:          e9 cc fe ff ff       jmpq   4003b8 <puts@plt>

Addresses are 64 bits; e.g., 0x00000000004004d0 is the address of the first instruction.
Using the GNU debugger (gdb), we can inspect memory further:

    (gdb) x/s 0x4005e8
    0x4005e8 <__dso_handle+8>:  "Hello"
    (gdb) x/s 0x4005ee
    0x4005ee <__dso_handle+14>: "Goodbye"
- How it runs
./hello
- A process is created (see slytinen406.cdm.depaul.edu)
- An address space is created; always 2^64 bytes
- The program now thinks it has 2^64 bytes of memory
- In reality, part of this address space may be on disk, in main memory, and/or in cache memory. Much of it will not be anywhere.
- The starting address of the program in the address space is loaded into the CPU's program counter (%rip)
- In the example above,
0x00000000004004d0
- Based on the contents of the PC, the first (next) instruction is copied from memory into the instruction register
- The program counter is incremented by the appropriate number (1-15, depending on the length of the instruction)
- The instruction in the instruction register is executed. Some instructions modify the PC (to make loops, call other functions, etc.)
- Repeat the previous three steps (fetch, increment, execute)
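The fetch, increment, execute cycle described above can be sketched in C. This is a toy machine, not x86-64: the three opcodes (OP_HALT, OP_ADD, OP_JMP), their encodings, and the name run_toy are all invented for illustration. What it shares with the real thing is that instructions have different lengths, so the program counter advances by a different amount for each one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; these are NOT real x86-64 encodings. */
enum { OP_HALT = 0x00, OP_ADD = 0x01, OP_JMP = 0x02 };

/* Runs a toy program stored in mem[0..size) and returns the accumulator.
 *   OP_HALT        (1 byte)   stop
 *   OP_ADD imm8    (2 bytes)  acc += imm8
 *   OP_JMP addr8   (2 bytes)  pc = addr8
 */
long run_toy(const uint8_t *mem, size_t size)
{
    size_t pc = 0;   /* program counter: address of the next instruction */
    long acc = 0;

    for (;;) {
        assert(pc < size);
        uint8_t opcode = mem[pc];                      /* fetch */
        size_t length = (opcode == OP_HALT) ? 1 : 2;
        assert(pc + length <= size);
        uint8_t operand = (length == 2) ? mem[pc + 1] : 0;
        pc += length;                                  /* increment past this instruction */

        switch (opcode) {                              /* execute */
        case OP_HALT: return acc;
        case OP_ADD:  acc += operand; break;
        case OP_JMP:  pc = operand; break;             /* some instructions modify the PC */
        default:      assert(!"unknown opcode"); return acc;
        }
    }
}
```

Note that the PC is incremented before the instruction executes, so a jump simply overwrites the already-advanced value.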
- Data formats
- Assembly language has:
- Bytes (8 bits)
- A word is 2 bytes
- A double word is 4 bytes
- A quad word is 8 bytes
- Most instructions can operate on bytes, words, double words, or quad words
    C declaration   Data type     AMD64 suffix   Size (bytes)
    char            Byte          b              1
    short           Word          w              2
    int             Double word   l              4
    long            Quad word     q              8
    All pointers    Quad word     q              8
- Note suffixes for different data types
- Example: the mov instruction has variants, depending on the number of bytes
- mov copies data (the name is a misnomer; nothing is moved) from one location (memory or register) to another
- movb, movw, movl, and movq
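The size table can be checked against a compiler with a short C sketch. The helper names suffix_for_size and check_table are made up here, and the check assumes x86-64 Linux, where long and pointers are 8 bytes; other platforms may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the AMD64 instruction suffix for an operand of the given size
 * in bytes, or '?' for a size that has no suffix. */
char suffix_for_size(size_t bytes)
{
    switch (bytes) {
    case 1:  return 'b';   /* byte */
    case 2:  return 'w';   /* word */
    case 4:  return 'l';   /* double word */
    case 8:  return 'q';   /* quad word */
    default: return '?';
    }
}

/* Checks the table against this compiler (assumes x86-64 Linux). */
void check_table(void)
{
    assert(suffix_for_size(sizeof(char))   == 'b');
    assert(suffix_for_size(sizeof(short))  == 'w');
    assert(suffix_for_size(sizeof(int))    == 'l');
    assert(suffix_for_size(sizeof(long))   == 'q');
    assert(suffix_for_size(sizeof(void *)) == 'q');
}
```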
- Assembly Language Example
- C Code: Add two signed integers
long add(long x, long y) { return x + y; }
- Assembly
        .p2align 4,,15
        .globl add
        .type add, @function
    add:
    .LFB0:
        .cfi_startproc
        movq %rsi, %rax
        addq %rdi, %rax
        ret
        .cfi_endproc
    .LFE0:
        .size add, .-add
        .ident "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-16)"
        .section .note.GNU-stack,"",@progbits
Sort of like

    %rax = %rsi
    %rax += %rdi
    return %rax
- Inspecting the executable with gdb

    (gdb) disas add
    Dump of assembler code for function add:
    0x00000000004005d0 <+0>   mov %rsi,%rax
    0x00000000004005d3 <+3>   add %rdi,%rax
    0x00000000004005d6 <+6>   retq
    End of assembler dump.
    (gdb) disas main
    Dump of assembler code for function main:
    0x0000000000400580 <+0>   sub $0x18,%rsp
    0x0000000000400584 <+4>   mov $0x4006d8,%edi
    0x0000000000400589 <+9>   callq 0x400460 <puts@plt>
    0x000000000040058e <+14>  lea 0x8(%rsp),%rsi
    0x0000000000400593 <+19>  mov %rsp,%rdx
    0x0000000000400596 <+22>  mov $0x4006e8,%edi
    0x000000000040059b <+27>  xor %eax,%eax
    0x000000000040059d <+29>  callq 0x400480 <__isoc99_scanf@plt>
    0x00000000004005a2 <+34>  mov (%rsp),%rsi
    0x00000000004005a6 <+38>  mov 0x8(%rsp),%rdi
    0x00000000004005ab <+43>  callq 0x4005d0 <add>
    0x00000000004005b0 <+48>  mov $0x4006ef,%edi
    0x00000000004005b5 <+53>  mov %rax,%rsi
    0x00000000004005b8 <+56>  xor %eax,%eax
    0x00000000004005ba <+58>  callq 0x400450 <printf@plt>
    0x00000000004005bf <+63>  add $0x18,%rsp
    0x00000000004005c3 <+67>  retq
- Compilation
- gcc can take a combination of .c, .s, and .o files

    linux> gcc -o add.s add.c -S -O2
    linux> gcc -o add addmain.c add.s -O2
    linux> ./add
    Type 2 integers
    10 20
    30
- Special registers
- Generally not used for general purposes
- %rip: instruction pointer (the program counter). Almost never seen in assembly language code.
- %rsp: stack pointer. Almost always used in one particular way (more later)
- %rbp: base pointer. Almost always used in one particular way (more later)
- Argument registers
- Used for parameter passing, but may also be used for general purposes
- Up to 6 parameters
- More than that, remaining parameters are passed on the program stack
- Return register: %rax
- Common instructions
In 2-operand instructions, the 1st operand is the source and the 2nd operand is the destination. In 1-operand instructions, the operand is the destination.

    Instruction   # operands   Meaning
    mov           2            Copy data from src to dest
    add           2            Add 2 items; sum placed in dest
    sub           2            Subtract src from dest; diff placed in dest
    imul          2            Multiply dest and src; product placed in dest
    cmp           2            Compare dest and src; condition code registers are set
    inc           1            Increment dest
    dec           1            Decrement dest
    neg           1            Negate
    not           1            Bitwise not
    and           2            Bitwise and
    or            2            Bitwise or
    xor           2            Bitwise xor
    shl           2            Left shift
    sar           2            Arithmetic right shift
    shr           2            Logical right shift
    lea           2            Load effective address (like &)
- Examples
    int same(int x) { return x; }

    same:
        movl %edi, %eax
        ret

    int add(int x, int y) { int s = x + y; return s; }

    add:
        movl %esi, %eax
        addl %edi, %eax
        ret

    long quadruple(long x) { return x * 4; }

    quadruple:
    .LFB0:
        leaq (,%rdi,4), %rax    # %rax = x * 4
        ret

    int negative(int x) { return -x; }

    negative:
        movl %edi, %eax
        negl %eax
        ret

    long absolute_value(long x) {
        if (x >= 0) return x;
        else return -x;
    }

    absolute_value:
    .LFB0:
        movq %rdi, %rdx     # %rdx = x
        sarq $63, %rdx      # shift %rdx right arithmetic -- why?
        movq %rdx, %rax     # %rax = %rdx
        xorq %rdi, %rax     # exclusive or : x ^ %rax
        subq %rdx, %rax     # %rax -= %rdx
        ret
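One way to read the absolute_value listing is to write the same three steps back in C. The sketch below is an interpretation, not compiler output; the name absolute_value_branchless is made up, and it assumes that >> on a negative long is an arithmetic shift (true for gcc on x86-64, but implementation-defined in standard C).

```c
#include <assert.h>

/* Mirrors the sarq / xorq / subq sequence, one statement per instruction. */
long absolute_value_branchless(long x)
{
    assert(sizeof(long) == 8);   /* the shift count 63 assumes a 64-bit long */

    long mask = x >> 63;         /* sarq $63: 0 if x >= 0, -1 (all ones) if x < 0 */
    long result = x ^ mask;      /* xorq: x unchanged if mask is 0, ~x if mask is -1 */
    result -= mask;              /* subq: subtract 0, or subtract -1 (i.e., add 1) */
    return result;               /* ~x + 1 is -x in two's complement */
}
```

The arithmetic shift copies the sign bit into every position, which is what lets a single value act as both the xor mask and the correction term.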
- set instructions
- Usually occur immediately following cmp
- Sets dest based on condition code registers (set by cmp)

    sete    equal
    setne   not equal
    setg    greater
    setge   greater or equal
    setl    less
    setle   less or equal
- Examples
    int gt(int x, int y) { return x > y; }

    gt:
        xorl %eax, %eax     # %eax = 0
        cmpl %esi, %edi     # arg1 > arg2? %edi > %esi?
        setg %al            # store the result of the comparison in %al
        ret

    int istwice(int x, int y) { return x * 2 == y; }

    istwice:
    .LFB0:
        .cfi_startproc
        addl %edi, %edi     # x *= 2
        xorl %eax, %eax     # %eax = 0
        cmpl %esi, %edi     # compare y and x
        sete %al            # If equal, set %al to 1
        ret
- Exercises
What do these functions do?
    f1:
        movl %edi, %eax
        xorl $1, %eax
        andl $1, %eax
        ret

    f2:
        cmpl %edi, %esi
        movl %edi, %eax
        cmovle %esi, %eax
        ret

    f3:
        subl $97, %edi
        xorl %eax, %eax
        cmpb $25, %dil
        setbe %al
        ret
- Memory addressing
- Finite number of registers
- Eventually, "main memory" must be used to store working data
- Virtual address space
- Difficult to demonstrate with simple C programs, unless compilation is not optimized
    int add(int x, int y) { int s = x + y; return s; }

        pushq %rbp              # next week
        movq %rsp, %rbp         # next week
        movl %edi, -4(%rbp)     # local variable a, stored in memory location -4(%rbp)
        movl %esi, -8(%rbp)     # local variable b, stored in memory location -8(%rbp)
        movl -4(%rbp), %eax     # compute a + b
        addl -8(%rbp), %eax     # in %eax
        leave
        ret
Code is more like
int add(int x, int y) { int a = x; int b = y; return a + b; }
- Operand types
- Immediate: constant value in decimal or hex; number preceded by $
- Register: starts with %
- Memory reference
- several different ways to specify an operand's memory address
- Absolute: give memory location (rarely used)
- Indirect: specify a register; it contains the memory location
- Base + displacement: specify a register, add a constant to the address it holds (like pointer arithmetic)
- Indexed: specify 2 registers, or 2 registers + a constant
- Scaled indexed: multiply the index register by the scale s (1, 2, 4, or 8)
- Some examples with
movq
- Indexed Addressing Modes
- Memory operands can take many forms
- Most General Form
    D(Rb,Ri,S)      Mem[Reg[Rb] + S*Reg[Ri] + D]
- Usually not all 3 items in parentheses
- Examples:
    (%rbp)            Item at memory address stored in register %rbp
    -4(%rbp)          Item at (memory address stored in register %rbp) - 4
    (%rbp,%rdx)       Item at memory address computed by adding contents of %rbp and %rdx
                      (%rdx contents are not an address)
    8(%rbp,%rdx)      Item at memory address computed by adding contents of %rbp and %rdx + 8
    8(%rbp,%rdx,4)    Item at memory address computed by adding contents of %rbp and (%rdx * 4) + 8
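The general form D(Rb,Ri,S) is just an arithmetic formula, which the following C sketch evaluates. The function name effective_address and its parameter names are invented for illustration; a missing index register is modeled by passing ri = 0 with s = 1.

```c
#include <assert.h>
#include <stdint.h>

/* Computes the address named by the operand D(Rb,Ri,S):
 * Reg[Rb] + S * Reg[Ri] + D. */
uint64_t effective_address(int64_t d, uint64_t rb, uint64_t ri, unsigned s)
{
    assert(s == 1 || s == 2 || s == 4 || s == 8);   /* the only scales x86-64 allows */
    return rb + (uint64_t)s * ri + (uint64_t)d;      /* wraps modulo 2^64 */
}
```

For example, if %rbp holds 0x1000 and %rdx holds 3, then -4(%rbp) is effective_address(-4, 0x1000, 0, 1), which is 0xffc, and 8(%rbp,%rdx,4) is effective_address(8, 0x1000, 3, 4), which is 0x1014.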
- C examples
- Assuming relevant data is in memory, not a register
    Data type   C example     Instruction example
    int         int x = 3;    movl $3, -4(%rbp)
    int []      y[i] = 0;     movl $0, -32(%rbp,%rdx,4)
    int *       p = &x;       leaq -4(%rbp), %rdx
- Scale is dependent on datatype size
- lea: Load Effective Address
- Loads a memory address but does not retrieve/store data from that address

    leaq -4(%rbp), %rdx     Store in %rdx the address computed by subtracting 4 from the contents of %rbp

- Can also be used for arithmetic computations

    leaq (,%rdi,4), %rax    # %rax = x * 4

  This does not access memory.
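In C terms, lea corresponds to computing an address or an arithmetic expression without dereferencing anything. The sketch below shows both uses; the function names are hypothetical, and the instruction shown in the comment is what a compiler may emit, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* &a[i] only computes an address (what lea does); a[i] would read memory. */
int *element_address(int *a, size_t i)
{
    return &a[i];        /* address = a + i * sizeof(int); nothing is loaded */
}

/* Arithmetic of the form x + k*x, with k in {1, 2, 4, 8}, fits one lea. */
long times5(long x)
{
    return x + 4 * x;    /* a compiler may emit: leaq (%rdi,%rdi,4), %rax */
}
```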
- Exercise
- Assume the following values are stored at the indicated memory addresses and registers
    Address   Value
    0x100     0x000000FF
    0x104     0x000000AB
    0x108     0x00000013
    0x10C     0x00000011

    Register  Value
    %rax      0x0000000000000100
    %rcx      0x0000000000000001
    %rdx      0x0000000000000003
- Fill in the following table showing the values for the indicated operands:

    Operand             Value
    %rax
    $0x108
    (%rax)
    4(%rax)
    9(%rax, %rdx)
    256(%rcx, %rdx)
    0xFC(, %rcx, 4)
    (%rax, %rdx, 4)
- Fill in the following table showing the effect of the instructions below. Assume values in memory and registers as specified above. Assume these instructions are not sequential.

    Instruction                       Destination   New value in destination
    movq %rax, (%rax)
    addl 4(%rax), %ecx
    subq %rdx, (%rax, %rcx, 4)
    movq $-1, 4(%rax)
    movzbq $0x61, 4(%rax,%rcx,4)
    movsbq $-1, %rcx
- jmp instruction and related
- Usually based on immediately preceding cmp

    jmp   Unconditional jump
    je    Jump if equal
    jne   Jump if not equal
    jg    Jump if greater (dest > src)
    jge   Jump if greater or equal (dest ≥ src)
    jl    Jump if less
    jle   Jump if less or equal
- Flow of control
- No if...else or looping constructs in assembly language
- Instead, it uses set and test for the effect of simple if...else, and jmp for more complex control structures
- Similar to the goto statement in C
- Example of goto in C

    #include <stdio.h>

    void print_equal(int x, int y) {
        if (x != y) {
            printf("Not equal\n");
            goto done;
        }
        printf("Equal\n");
    done:
        return;
    }

    .LC0:
        .string "Not equal"
    .LC1:
        .string "Equal"
        .text
        .p2align 4,,15
        .globl print_equal
        .type print_equal, @function
    print_equal:
    .LFB11:
        .cfi_startproc
        cmpl %esi, %edi
        je .L2
        movl $.LC0, %edi
        jmp puts
        .p2align 4,,10
        .p2align 3
    .L2:
        movl $.LC1, %edi
        jmp puts
- Example: scale.c
    void scale(char s1, char *s2) {
        if (s1 == 'F')
            *s2 = 'C';
        else if (s1 == 'C')
            *s2 = 'F';
    }

    scale:
        cmpb $70, %dil      # s1 == 'F'
        je .L6              # Yes? goto L6
        cmpb $67, %dil      # s1 == 'C'
        je .L7              # Yes? goto L7
        ret
    .L7:
        movb $70, (%rsi)    # *s2 = 'F'
        ret
    .L6:
        movb $67, (%rsi)    # *s2 = 'C'
        ret
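To see how closely the assembly for scale follows the goto style, the function can be rewritten in C using only if ... goto. This is a hand-written sketch (the name scale_goto and the labels are made up), arranged so that each line matches one or two instructions of the listing.

```c
/* scale() rewritten with goto; comments show the matching instructions. */
void scale_goto(char s1, char *s2)
{
    if (s1 == 'F') goto is_f;    /* cmpb $70, %dil ; je .L6 */
    if (s1 == 'C') goto is_c;    /* cmpb $67, %dil ; je .L7 */
    return;                      /* ret */
is_c:
    *s2 = 'F';                   /* movb $70, (%rsi) */
    return;
is_f:
    *s2 = 'C';                   /* movb $67, (%rsi) */
    return;
}
```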
- Loop example
    int strlen406(char s[ ]) {
        int i = 0;
        while (s[i] != '\0')
            i++;
        return i;
    }

    strlen406:
        xorl %eax, %eax     # %eax = 0
        cmpb $0, (%rdi)     # *s == 0?
        je .L3              # Yes? Skip the loop
    .L6:
        addl $1, %eax       # %eax += 1
        addq $1, %rdi       # s += 1
        cmpb $0, (%rdi)     # *s == 0?
        jne .L6             # If not, goto L6
    .L3:
        ret
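A while loop becomes a conditional jump out of the loop plus an unconditional jump back to the top. The following C sketch writes strlen406 in that form; strlen406_goto and its labels are illustrative names, and a compiler is free to arrange the test differently (for example, at the bottom of the loop).

```c
/* strlen406() with the loop expressed as compare-and-jump. */
int strlen406_goto(const char s[])
{
    int i = 0;
loop:
    if (s[i] == '\0') goto done;   /* compare the current character, jump if zero */
    i++;
    goto loop;                     /* unconditional jump back to the top */
done:
    return i;
}
```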
- Another loop example
    f:
        testl %edi, %edi
        movl $1, %edx
        movl $1, %eax
        jle .L3
    .L6:
        imull %edx, %eax
        addl $1, %edx
        cmpl %edx, %edi
        jge .L6
    .L3:
        ret
- Some reverse engineering examples
See this assembly language. What does each function do?
- Example with pointers
- Program stack
- Function environments are stored on a portion of the address space called the program stack
Text, p.190: "By convention, we draw stacks upside down, so that the "top" of the stack is shown at the bottom." My diagrams do not follow this convention. The top of the stack in my diagrams is at the top.
- When a function is called, it creates a stack frame when necessary
- The stack grows and shrinks as functions are called and return.
- Stack frame organization
- Compiler optimization eliminates the need for stack frames for some functions
- push and pop
- When a function is called, it forms its own stack frame
- It also needs to store enough information for the previous stack frame to be restored upon return
    f() { ... g() }

Start of g:

    push %rbp
    movq %rsp, %rbp

push %rbp is a "macro"; short for 2 instructions

    subq $8, %rsp
    movq %rbp, (%rsp)

pop %rbp is short for

    movq (%rsp), %rbp
    addq $8, %rsp
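The two-instruction expansions of push and pop can be modeled directly in C. This is a toy model with invented names (struct toy_cpu, toy_push, toy_pop): memory is a small byte array, and %rsp holds an index into it rather than a real address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy machine state: a small byte-addressed memory plus %rsp and %rbp. */
struct toy_cpu {
    uint8_t  mem[256];
    uint64_t rsp;      /* index into mem, standing in for an address */
    uint64_t rbp;
};

/* push src:  subq $8, %rsp ; movq src, (%rsp) */
void toy_push(struct toy_cpu *cpu, uint64_t src)
{
    assert(cpu->rsp >= 8);
    cpu->rsp -= 8;
    memcpy(&cpu->mem[cpu->rsp], &src, 8);
}

/* pop dest:  movq (%rsp), dest ; addq $8, %rsp.  Returns the popped value. */
uint64_t toy_pop(struct toy_cpu *cpu)
{
    uint64_t dest;
    assert(cpu->rsp + 8 <= sizeof cpu->mem);
    memcpy(&dest, &cpu->mem[cpu->rsp], 8);
    cpu->rsp += 8;
    return dest;
}
```

With this model, the start of g is toy_push(&cpu, cpu.rbp) followed by cpu.rbp = cpu.rsp, and the matching cpu.rbp = toy_pop(&cpu) restores the caller's frame. The stack grows toward lower addresses, which is why push subtracts.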
- Example
    int f(int x) { return g(x); }
    int g(int y) { return y * 2; }

Assembly language for g

    Compiler generates          Equivalent
    g:
    pushq %rbp                  subq $8, %rsp
                                movq %rbp, (%rsp)
    movq %rsp, %rbp             movq %rsp, %rbp
    subq $16, %rsp              subq $16, %rsp
    movl %edi, -4(%rbp)         movl %edi, -4(%rbp)
    ...                         ...
    movl -4(%rbp), %eax         movl -4(%rbp), %eax
    addl %eax, %eax             addl %eax, %eax
    leave                       addq $16, %rsp
                                movq (%rsp), %rbp
                                addq $8, %rsp
    ret                         ret
- Creation of new stack frame
- popq
- Pops the program stack

    popq %rbp

is equivalent to

    movq (%rsp), %rbp
    addq $8, %rsp

    Compiler generates          Equivalent
    g:
    pushq %rbp
    movq %rsp, %rbp
    subq $4, %rsp
    movl %edi, -4(%rbp)
    movl -4(%rbp), %eax
    addl %eax, %eax
    addq $4, %rsp               addq $4, %rsp
    leave                       movq (%rsp), %rbp
                                addq $8, %rsp
    ret                         ret
- call and ret
- Also macros;

    callq 0x400f01

means

    pushq %rip
    jmpq 0x400f01

and

    ret

means

    popq %rip
    jmpq *%rip      # this is an indirect jump
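The call and ret expansions can be sketched in C as operations on a saved-address stack. The struct and function names below are invented. The sketch relies on one fact from the execution cycle: by the time call executes, %rip has already been advanced past the call instruction, so the value pushed is the return address.

```c
#include <assert.h>
#include <stdint.h>

struct toy_machine {
    uint64_t stack[16];   /* each slot stands for one 8-byte stack entry */
    int      top;         /* number of entries in use */
    uint64_t rip;         /* address of the next instruction */
};

/* call target: push the return address (already in rip), then jump. */
void toy_call(struct toy_machine *m, uint64_t target)
{
    assert(m->top < 16);
    m->stack[m->top++] = m->rip;   /* pushq %rip */
    m->rip = target;               /* jmpq target */
}

/* ret: pop the saved return address back into rip. */
void toy_ret(struct toy_machine *m)
{
    assert(m->top > 0);
    m->rip = m->stack[--m->top];   /* popq %rip */
}
```

Nested calls unwind in reverse order because return addresses are stored last-in, first-out.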
- Function parameters and the program stack
- In x86-64, parameters are usually passed through argument registers
- %rdi, %rsi, %rdx, %rcx, %r8, %r9
- In the unlikely event that more than 6 parameters are passed to a function, they are passed on the program stack
- To illustrate, we will switch to IA-32
    int g(int,int);

    int f() {
        int a = 1, b = 2;
        return g(a,b);
    }

    int g(int x, int y) { return x + y; }

    int main() { f(); }

    f:
        pushl %ebp
        movl %esp, %ebp
        subl $24, %esp
        movl $1, -8(%ebp)       # local var a
        movl $2, -4(%ebp)       # local var b
        movl -4(%ebp), %eax
        movl %eax, 4(%esp)      # pass b as parameter
        movl -8(%ebp), %eax     # pass a as parameter
        movl %eax, (%esp)
        call g
        leave
        ret
    g:
        pushl %ebp
        movl %esp, %ebp
        movl 12(%ebp), %eax     # parameter y
        movl 8(%ebp), %edx      # parameter x
        addl %edx, %eax
        popl %ebp
        ret
- Note that positive offsets/displacements refer to parameters, because of the organization of the program stack
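As a quick reference for the x86-64 rule stated at the start of this section (six argument registers, then the stack), here is a small lookup sketch in C. The function names are hypothetical, and the mapping covers integer and pointer parameters only.

```c
#include <assert.h>
#include <string.h>

/* Returns where the n-th integer or pointer argument (1-based) is passed. */
const char *argument_location(int n)
{
    static const char *const regs[] = {
        "%rdi", "%rsi", "%rdx", "%rcx", "%r8", "%r9"
    };
    assert(n >= 1);
    return (n <= 6) ? regs[n - 1] : "stack";
}

/* Returns 1 if the n-th argument does not fit in the argument registers. */
int passed_on_stack(int n)
{
    return strcmp(argument_location(n), "stack") == 0;
}
```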
- x86-64 Example
    int main() {
        long x;
        printf("Value\n");
        scanf("%ld", &x);
        x = times4(x);
        printf("4 * x = %ld\n", x);
    }

    long times4(long x) {
        long total = x;
        total += times3(x);
        return total;
    }

    long times3(long x) {
        long total = x;
        total += times2(x);
        return total;
    }

    long times2(long x) { return x + x; }
Debugging session
C code:

    int main() { long x; printf("Value\n"); scanf("%ld", &x); x = times4(x); printf("4 * x = %ld\n", x); }
    long times4(long x) { long total = x; total += times3(x); return total; }

gdb inspection:

    (gdb) disas main
    ...
    0x4005ab <+55>: callq 0x4005c8 <times4>
    0x4005b0 <+60>: mov %eax,%edx
    ...
    (gdb) disas times4
    0x4005c8 <+0>:  push %rbp
    0x4005c9 <+1>:  mov %rsp,%rbp
    0x4005cc <+4>:  sub $0x8,%rsp
    0x4005d0 <+8>:  mov %rdi,-0x8(%rbp)
    0x4005d4 <+12>: callq 0x4005e4 <times3>
    0x4005d9 <+17>: add %rax,-0x8(%rbp)
    0x4005dd <+21>: mov -0x8(%rbp),%rax
    0x4005e1 <+25>: leaveq
    0x4005e2 <+26>: retq
    (gdb) break *times4
    (gdb) run
    Value
    10
C code:

    long times4(long x) { long total = x; total += times3(x); return total; }

gdb inspection:

    Breakpoint 1, 0x4005c8 in times4 ()
    (gdb) print/x $rbp
    $5 = 0x7fffffffea00     <-- main bottom of stack
    (gdb) print/x $rsp
    $6 = 0x7fffffffe9e8     <-- top of stack
    (gdb) x/x $rsp
    0x004005b0              <-- main ret addr
    (gdb) print/d $rdi
    $8 = 10                 <-- parameter
    (gdb) cont
C code:

    long times4(long x) { long total = x; total += times3(x); return total; }

gdb inspection:

    Breakpoint 5, 0x4005d4 in times4 ()
    (gdb) disas
    ...
       0x4005c8 <+0>:  push %rbp
       0x4005c9 <+1>:  mov %rsp,%rbp
       0x4005cc <+4>:  sub $0x8,%rsp
       0x4005d0 <+8>:  mov %rdi,-0x8(%rbp)
    => 0x4005d4 <+12>: callq 0x4005e4 <times3>
       0x4005d9 <+17>: add %rax,-0x8(%rbp)
    ...
    (gdb) print/x $rbp
    $9 = 0x7fffffffe9e0     <--- times4 bottom
    (gdb) print/x $rsp
    $10 = 0x7fffffffe9d8    <--- top
    (gdb) break *times4
    (gdb) break *times3
    Breakpoint 1 at 0x4005e4
    (gdb) break *times3+12
    Breakpoint 2 at 0x4005f0
    (gdb) run
    Starting program:
    Value for x
    10
C code:

    long times3(long x) { long total = x; total += times2(x); return total; }

gdb inspection:

    Breakpoint 1, 0x00000000004005e4 in times3 ()
    (gdb) print/x $rbp
    $1 = 0x7fffffffe9e0
    (gdb) print/x $rsp
    $2 = 0x7fffffffe9d0
    (gdb) x/x $rsp
    0x7fffffffe9d0: 0x004005d9
    Breakpoint 4 at 0x4005f0
    (gdb) cont
    Continuing.
C code:

    long times3(long x) { long total = x; total += times2(x); return total; }

gdb inspection:

    Breakpoint 2, 0x00000000004005f0 in times3 ()
    (gdb) print/x $ebp
    $3 = 0xffffe9c8
    (gdb) print/x $rbp
    $4 = 0x7fffffffe9c8
    (gdb) print/x $rsp
    $5 = 0x7fffffffe9c0
    (gdb) x/x $rsp
    0x0000000a
    (gdb) print/x $rdi
    $8 = 0xa
    (gdb) disas times2
    0x400600 <+0>: lea (%rdi,%rdi,1),%rax
    0x400604 <+4>: retq
C code:

    long times2(long x) { return x + x; }

gdb inspection:

    (gdb) break *times2
    Breakpoint 5 at 0x400600
    (gdb) break *times3 + 17
    Breakpoint 6 at 0x4005f5
    (gdb) cont
    Continuing.
    Breakpoint 5, 0x0000000000400600 in times2 ()
    (gdb) print/x $rbp
    $10 = 0x7fffffffe9c8
    (gdb) print/x $rsp
    $12 = 0x7fffffffe9b8
    (gdb) x/x $rsp
    0x7fffffffe9b8: 0x004005f5
    (gdb) x/i 0x004005f5
    0x4005f5 : add %rax,-0x8(%rbp)
C code:

    long times3(long x) { long total = x; total += times2(x); return total; }

gdb inspection:

    (gdb) cont
    Continuing.
    Breakpoint 6, 0x00000000004005f5 in times3 ()
    (gdb) print/x $rsp
    $13 = 0x7fffffffe9c0
    (gdb) print/x $rbp
    $14 = 0x7fffffffe9c8
    (gdb) print/d $rax
    $16 = 20
    (gdb) x/x $rbp
    0x7fffffffe9c8: 0xffffe9e0
    (gdb) x/x $rbp+8
    0x7fffffffe9d0: 0x004005d9
    (gdb) x/i 0x004005d9
    0x4005d9 : add %rax,-0x8(%rbp)
C code:

    long times4(long x) { long total = x; total += times3(x); return total; }

gdb inspection:

    Breakpoint 3, 0x00000000004005d9 in times4 ()
    (gdb) print/x $rbp
    $18 = 0x7fffffffe9e0
    (gdb) print/x $rsp
    $19 = 0x7fffffffe9d8
    (gdb) print/d $rax
    $20 = 30
    (gdb) cont
    Continuing.
    4 * x = 40
- Recursion
- A recursive function calls itself
- C example: compute n!
    int f(int n) {
        if (n == 1)
            return 1;
        else
            return n * f(n-1);
    }
- Assembly
    f:
        pushq %rbp
        movq %rsp, %rbp
        subq $4, %rsp
        movl %edi, -4(%rbp)
        cmpl $1, -4(%rbp)
        jne .L2
        movl $1, %eax
        jmp .L3
    .L2:
        movl -4(%rbp), %eax
        subl $1, %eax
        movl %eax, %edi
        call f
        imull -4(%rbp), %eax
    .L3:
        addq $4, %rsp
        popq %rbp
        ret
- A new stack frame is created each time this function is called (including recursive calls)
- For each call to f, the Argument 1 register (%rdi) must contain the parameter n
- But n changes, so the stack frame for each instance of f has to allocate 4 bytes in order to store n
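The claim that every active call keeps its own copy of n can be observed from C by recording the address of n in each call. The sketch below uses invented names (factorial_traced, n_address, frames_are_distinct); it only checks that the addresses differ and makes no assumption about how far apart they are.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEPTH 32

/* Address of n in each active call, recorded for inspection (sketch only). */
static uintptr_t n_address[MAX_DEPTH];
static int calls_recorded;

int factorial_traced(int n)
{
    assert(n >= 1 && calls_recorded < MAX_DEPTH);
    n_address[calls_recorded++] = (uintptr_t)&n;   /* n needs a memory location */
    if (n == 1)
        return 1;
    return n * factorial_traced(n - 1);            /* n must survive this call */
}

/* Returns 1 if every recorded call stored n at a different address. */
int frames_are_distinct(void)
{
    for (int i = 0; i < calls_recorded; i++)
        for (int j = i + 1; j < calls_recorded; j++)
            if (n_address[i] == n_address[j])
                return 0;
    return 1;
}
```

Calling factorial_traced(4) records four addresses, one per stack frame, because all four calls are still active when the innermost one runs.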
- Step-by-step
Call f(4)

    f:
        pushq %rbp
        movq %rsp, %rbp
        subq $4, %rsp
        movl %edi, -4(%rbp)
        cmpl $1, -4(%rbp)
        jne .L2
        movl $1, %eax
        jmp .L3
    .L2:
        movl -4(%rbp), %eax
        subl $1, %eax
        movl %eax, %edi
        call f
        imull -4(%rbp), %eax
    .L3:
        leave
        ret