Assembly Language

 


 

  • Machine Code

     

    • When a C program is compiled, it is eventually translated into machine code
    • On current machines this is x86-64 machine code; the name is historical
    • The 64-bit extension was originally developed by Advanced Micro Devices (AMD) and was named AMD64
    • x86-64 machine code has evolved from 16-bit processors to the current 64-bit ones; backward compatibility has been maintained

     

  • Assembly Language

     

    • In essence, a (somewhat) more readable version of machine code
    • Translation to machine code is almost one-to-one (i.e., each assembly language instruction often translates to one machine code instruction)
    • Instructions' names are used; no need to memorize the 8-bit encodings
    • Likewise for some operands (although hexadecimal is used to indicate addresses)
    • x64 Assembly

     

  • x86-64 Architecture

    hardware.jpg

     

  • What your program sees

    program.png

     

    • Only 2 types of memory: registers and main memory
    • Program counter register (called %rip in assembly language) keeps track of execution location in main memory (a 64-bit address)
    • Instruction register contains the current instruction (1-15 bytes on x86-64)
    • Machine code instructions are composed of an operation code (opcode) and its operands.
    • The first 1-3 bytes encode the instruction (the opcode), sometimes preceded by prefix bytes
      • Instructions are very low-level; e.g., 0x89 is the opcode for an instruction that copies data from a register to another location (in the bytes 48 89 e5, the leading 0x48 is a prefix that selects 64-bit operands)
      • Each opcode has a fixed number of operands
      • Each operand specifies a register or a (virtual) memory address
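To make the opcode/operand split concrete, here is a minimal sketch that pulls apart the three bytes 48 89 e5 (the encoding of movq %rsp, %rbp). The struct and function names are made up for illustration, and the sketch only handles this simple prefix + opcode + one operand byte layout; it ignores the prefix bits that select registers %r8-%r15.

```c
#include <stdint.h>

/* Fields of a 3-byte register-to-register instruction such as 48 89 e5.
   Names are illustrative only, not part of any real decoder. */
struct decoded {
    int rex_w;       /* 1 if the prefix byte selects 64-bit operands */
    uint8_t opcode;  /* the opcode byte proper */
    int mod;         /* operand byte bits 7-6: 3 means "both operands are registers" */
    int reg;         /* operand byte bits 5-3: register number (the source for opcode 0x89) */
    int rm;          /* operand byte bits 2-0: register number (the destination for 0x89) */
};

struct decoded decode3(const uint8_t bytes[3]) {
    struct decoded d;
    d.rex_w  = (bytes[0] & 0xF8) == 0x48;  /* prefix pattern 0100 1xxx */
    d.opcode = bytes[1];
    d.mod    = (bytes[2] >> 6) & 3;
    d.reg    = (bytes[2] >> 3) & 7;
    d.rm     = bytes[2] & 7;
    return d;
}
```

For 48 89 e5 this yields register numbers 4 and 5, which are %rsp and %rbp in the hardware's numbering (%rax is 0, %rcx is 1, and so on).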

     

  • Registers

    registers.png

     

    • General-purpose integer register names are left over from when particular registers had specialized purposes
    • On 32-bit machines, there were only 8
    • Each register can hold an integer or an address (8 bytes = 64 bits)
    • Example instruction:

       

      Assembly             Machine code (hex)
      movq %rsp, %rbp      48 89 e5

     

  • Register Usage

    registers_usage.jpg

     

     

  • Example of assembly language and machine code

     

    #include <stdio.h>

    int main() {
      printf("Hello\n");
      printf("Goodbye\n");
    }
    
    memory     machine code of     corresponding
    address    the instruction     assembly language
    (hex)      at that address
               (hex)
    
    4004d0:    48 83 ec 08         sub    $0x8,%rsp
    4004d4:    bf e8 05 40 00      mov    $0x4005e8,%edi
    4004d9:    e8 da fe ff ff      callq  4003b8 <puts@plt>
    4004de:    bf ee 05 40 00      mov    $0x4005ee,%edi
    4004e3:    48 83 c4 08         add    $0x8,%rsp
    4004e7:    e9 cc fe ff ff      jmpq   4003b8 <puts@plt>
    
    Addresses are 64 bits (8 bytes); e.g.,
    
    0x00000000004004d0 is the address of the first instruction
    
    Using the GNU debugger (gdb), we can inspect memory further:
    (gdb) x/s 0x4005e8
    0x4005e8 <__dso_handle+8>:       "Hello"
    (gdb) x/s 0x4005ee
    0x4005ee <__dso_handle+14>:      "Goodbye"

     

  • How it runs

     

    ./hello

     

    1. A process is created (see slytinen406.cdm.depaul.edu)

       

    2. An address space is created; always 2^64 bytes

       

    3. The program now thinks it has 2^64 bytes of memory
      • In reality, part of this address space may be on disk, in main memory, and/or in cache memory. Much of it will not be anywhere.

       

    4. The starting address of the program in the address space is loaded into the CPU's program counter (%rip)
      • In the example above, 0x00000000004004d0
      • Based on the contents of the PC, the first (next) instruction is copied from memory into the instruction register
      • The program counter is incremented by the length of that instruction (1-15 bytes)
      • The instruction in the instruction register is executed. Some instructions modify the PC (to make loops, call other functions, etc.)
      • Repeat the fetch, increment, and execute steps for the next instruction
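The fetch / increment / execute cycle above can be sketched as a loop in C. This is a toy machine, not x86-64: the four opcodes, their encodings, and the function name are invented for illustration. What it does share with the real thing is that instructions have different lengths, the PC advances by the length of the instruction just fetched, and a jump works by overwriting the PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes (not real x86-64 encodings). */
enum { OP_HALT = 0x00, OP_INC = 0x01, OP_ADDI = 0x02, OP_DJNZ = 0x03 };

/* Runs the program stored in mem and returns the accumulator at HALT.
   counter is a loop counter decremented by OP_DJNZ. */
long run(const uint8_t *mem, long counter) {
    size_t pc = 0;   /* program counter: index of the next instruction */
    long acc = 0;
    for (;;) {
        uint8_t opcode = mem[pc];                          /* fetch */
        size_t length = (opcode == OP_ADDI || opcode == OP_DJNZ) ? 2 : 1;
        uint8_t operand = (length == 2) ? mem[pc + 1] : 0;
        pc += length;                                      /* increment PC by instruction length */
        switch (opcode) {                                  /* execute */
        case OP_HALT: return acc;
        case OP_INC:  acc += 1; break;
        case OP_ADDI: acc += operand; break;
        case OP_DJNZ: if (--counter != 0) pc = operand; break;  /* a jump modifies the PC */
        default:      return -1;                           /* unknown opcode */
        }
    }
}
```

The program bytes 02 05 01 03 00 00 mean "add 5; increment; decrement the counter and jump back to address 0 if it is not zero; halt", so with a counter of 3 the body runs three times.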

     

  • Data formats

     

    • Assembly language has:

       

      • Bytes (8 bits)
      • word is 2 bytes
      • double word is 4 bytes
      • quad word is 8 bytes

       

    • Most instructions can operate on bytes, words, double words, or quad words

       

      C declaration    Data type      AMD64 suffix    Size (bytes)
      char             Byte           b               1
      short            Word           w               2
      int              Double word    l               4
      long             Quad word      q               8
      All pointers     Quad word      q               8

       

    • Note suffixes for different data types
    • Example: the mov instruction has variants, depending on the number of bytes
      • mov copies data from one location (memory or register) to another; "move" is a misnomer, since the source is unchanged
      • movb, movw, movl, and movq
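A small sketch of what the four suffixes do when the destination is a 64-bit register. The function name is hypothetical; the behavior modeled is that movb and movw replace only the low 1 or 2 bytes, movl writes the low 4 bytes and zeroes the upper 4, and movq replaces all 8.

```c
#include <stdint.h>

/* Models writing `value` into a 64-bit register with a given operand size.
   size is 1, 2, 4, or 8 (the b, w, l, q suffixes). Illustrative only. */
uint64_t write_reg(uint64_t reg, uint64_t value, int size) {
    switch (size) {
    case 1:  return (reg & ~0xFFULL)   | (value & 0xFFULL);    /* movb: upper 7 bytes kept */
    case 2:  return (reg & ~0xFFFFULL) | (value & 0xFFFFULL);  /* movw: upper 6 bytes kept */
    case 4:  return value & 0xFFFFFFFFULL;                     /* movl: upper 4 bytes zeroed */
    default: return value;                                     /* movq: all 8 bytes replaced */
    }
}
```

The movl case is the one that surprises people: it is why a 32-bit result in %eax can be used directly as a 64-bit value in %rax when the value is non-negative.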

     

  • Assembly Language Example

     

     

    • C Code: Add two signed integers

       

      long add(long x, long y) {
        return x + y;
      }

       

    • Assembly

       

      .p2align 4,,15
      .globl add
              .type   add, @function
      add:
      .LFB0:
              .cfi_startproc
              movq    %rsi, %rax
              addq    %rdi, %rax
              ret
              .cfi_endproc
      .LFE0:
              .size   add, .-add
              .ident  "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-16)"
              .section        .note.GNU-stack,"",@progbits

      Sort of like

       

      %rax = %rsi
      %rax += %rdi
      return %rax

       

    • Inspecting the executable with gdb

       

      (gdb) disas add
      Dump of assembler code for function add:
         0x00000000004005d0 <+0>     mov    %rsi,%rax
         0x00000000004005d3 <+3>     add    %rdi,%rax
         0x00000000004005d6 <+6>     retq
      End of assembler dump.
      (gdb) disas main
      Dump of assembler code for function main:
         0x0000000000400580 <+0>     sub    $0x18,%rsp
         0x0000000000400584 <+4>     mov    $0x4006d8,%edi
         0x0000000000400589 <+9>     callq  0x400460 <puts@plt>
         0x000000000040058e <+14>    lea    0x8(%rsp),%rsi
         0x0000000000400593 <+19>    mov    %rsp,%rdx
         0x0000000000400596 <+22>    mov    $0x4006e8,%edi
         0x000000000040059b <+27>    xor    %eax,%eax
         0x000000000040059d <+29>    callq  0x400480 <__isoc99_scanf@plt>
         0x00000000004005a2 <+34>    mov    (%rsp),%rsi
         0x00000000004005a6 <+38>    mov    0x8(%rsp),%rdi
         0x00000000004005ab <+43>    callq  0x4005d0 <add>
         0x00000000004005b0 <+48>    mov    $0x4006ef,%edi
         0x00000000004005b5 <+53>    mov    %rax,%rsi
         0x00000000004005b8 <+56>    xor    %eax,%eax
         0x00000000004005ba <+58>    callq  0x400450 <printf@plt>
         0x00000000004005bf <+63>    add    $0x18,%rsp
         0x00000000004005c3 <+67>    retq

     

  • Compilation

    gcc can take a combination of .c, .s, and .o files

     

    linux> gcc -o add.s add.c -S -O2
    linux> gcc -o add addmain.c add.s -O2
    linux> ./add
    Type 2 integers
    10 20
    30

     

  • Special registers

     

    • Generally not used for general purposes
    • %rip: instruction pointer (the program counter). Almost never seen in assembly language code.
    • %rsp: stack pointer. Almost always used in one particular way (more later)
    • %rbp: base pointer. Almost always used in one particular way (more later)

     

  • Argument registers

     

    • Used for parameter passing, but may also be used for general purposes
    • Up to 6 parameters
    • More than that, remaining parameters are passed on the program stack
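A function with more than six parameters is an easy way to see both mechanisms at once. The sketch below is only a test subject (the name sum8 is made up); under the convention these notes describe, a1 through a6 should arrive in registers and a7 and a8 on the program stack.

```c
/* Eight integer parameters: the first six are expected in the argument
   registers, the last two on the program stack (assuming the Linux
   x86-64 calling convention used throughout these notes). */
long sum8(long a1, long a2, long a3, long a4,
          long a5, long a6, long a7, long a8) {
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8;
}
```

Compiling this with gcc -S -O2, as in the Compilation section, and reading the resulting .s file should show the last two operands being read through %rsp-relative memory references rather than from registers.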

     

  • Return register

    %rax

     

  • Common instructions

    In 2-operand instructions, 1st operand is source, 2nd operand is destination. In 1-operand instructions, operand is destination.

     

    Instruction    # operands    Meaning
    mov            2             Copy data from src to dest
    add            2             Add 2 items; sum placed in dest
    sub            2             Subtract src from dest; diff placed in dest
    imul           2             Multiply dest and src; product placed in dest
    cmp            2             Compare dest and src; condition code registers are set
    inc            1             Increment dest
    dec            1             Decrement dest
    neg            1             Negate
    not            1             Bitwise not
    and            2             Bitwise and
    or             2             Bitwise or
    xor            2             Bitwise xor
    shl            2             Left shift
    sar            2             Arithmetic right shift
    shr            2             Logical right shift
    lea            2             Load effective address (like &)
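The difference between sar and shr shows up in C as the difference between shifting a signed and an unsigned value. A minimal sketch, with hypothetical function names; note that C leaves the result of right-shifting a negative signed value up to the implementation, and this assumes gcc's behavior (an arithmetic shift).

```c
#include <stdint.h>

/* Signed right shift: gcc typically emits sar, which copies the sign bit in from the left. */
int32_t shift_arith(int32_t x, int n) {
    return x >> n;
}

/* Unsigned right shift: compiles to shr, which shifts zeros in from the left. */
uint32_t shift_logical(uint32_t x, int n) {
    return x >> n;
}
```

The same bit pattern, 0xFFFFFFF8, gives -4 when shifted arithmetically (it is -8 as a signed value) and 0x7FFFFFFC when shifted logically.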

     

  • Examples

     

     

    int same(int x) {
      return x;
    }
    
    
    same:
            movl    %edi, %eax
            ret

     

     

    int add(int x, int y) {
      int s = x + y;
      return s;
    }
    
    
    add:
            movl    %esi, %eax
            addl    %edi, %eax
            ret

     

     

    long quadruple(long x) {
      return x * 4;
    }
    
    
    quadruple:
    .LFB0:
            leaq    (,%rdi,4), %rax    # %rax = x * 4
            ret

     

     

    int negative(int x) {
      return -x;
    }
    
    
    negative:
            movl    %edi, %eax
            negl    %eax
            ret

     

     

    long absolute_value(long x) {
      if (x >= 0)
        return x;
      else
        return -x;
    }
    
    
    absolute_value:
    .LFB0:
            movq    %rdi, %rdx     # %rdx = x
            sarq    $63, %rdx      # shift %rdx right arithmetic -- why?
            movq    %rdx, %rax     # %rax = %rdx 
            xorq    %rdi, %rax     # exclusive or :  x ^ %rax
            subq    %rdx, %rax     # %rax -= %rdx
            ret
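The assembly above avoids a branch. Written back in C it looks roughly like the sketch below (the function name is made up). It assumes long is 64 bits and that >> on a negative long is an arithmetic shift, as with gcc on x86-64; the most negative long is not handled, since its absolute value does not fit.

```c
/* Branch-free absolute value, following the sarq / xorq / subq sequence. */
long absolute_value_branchless(long x) {
    long mask = x >> 63;        /* sarq $63: 0 if x >= 0, -1 (all 1 bits) if x < 0 */
    return (x ^ mask) - mask;   /* xorq flips every bit when x < 0; subq then adds 1 */
}
```

When x is negative, flipping all the bits and adding 1 is exactly two's complement negation; when x is non-negative, the xor and the subtraction both use 0 and change nothing.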

     

  • set instructions

     

    • Usually occur immediately following cmp
    • Sets dest based on condition code registers (set by cmp)

       

      sete     equal
      setne    not equal
      setg     greater
      setge    greater or equal
      setl     less
      setle    less or equal

     

  • Examples

     

     

    int gt(int x, int y) {
      return x > y;
    }
    
    
    gt:
            xorl    %eax, %eax     # %eax = 0
            cmpl    %esi, %edi     # arg1 > arg2?   %edi > %esi?
            setg    %al            # store the result of the comparison in %al
            ret

     

     

    int istwice(int x, int y) {
      return x * 2 == y;
    }
    
    
    istwice:
    .LFB0:
            .cfi_startproc
            addl    %edi, %edi    # x *= 2
            xorl    %eax, %eax    # %eax = 0
            cmpl    %esi, %edi    # compare y and x
            sete    %al           # If equal, set %al to 1
            ret

     

     

  • Exercises

    What do these functions do?

     

     

    f1:
            movl    %edi, %eax
            xorl    $1, %eax
            andl    $1, %eax
            ret

     

     

    f2:
            cmpl    %edi, %esi
            movl    %edi, %eax
            cmovle  %esi, %eax
            ret

     

     

    f3:
            subl    $97, %edi
            xorl    %eax, %eax
            cmpb    $25, %dil
            setbe   %al
            ret

     

     

  • Memory addressing

     

    • Finite number of registers
    • Eventually, "main memory" must be used to store working data
      • Virtual address space
    • Difficult to demonstrate with simple C programs, unless compilation is not optimized

       

      int add(int x, int y) {
        int s = x + y;
        return s;
      }
      
              pushq   %rbp               # next week
              movq    %rsp, %rbp         # next week
              movl    %edi, -4(%rbp)     # local variable a, stored in memory location -4(%rbp)
              movl    %esi, -8(%rbp)     # local variable b, stored in memory location -8(%rbp)
              movl    -4(%rbp), %eax     # compute a + b 
              addl    -8(%rbp), %eax     #     in %eax
              leave
              ret

      Code is more like

       

      int add(int x, int y) {
        int a = x;
        int b = y;
        return a + b;
      }

     

  • Operand types

     

    • Immediate: constant value in decimal or hex; the number is preceded by $
    • Register: starts with %
    • Memory reference
      • several different ways to specify an operand's memory address
      • Absolute: give memory location (rarely used)
      • Indirect: specify a register; it contains the memory location
      • Base + displacement: specify a register; a constant is added to the address it contains (like pointer arithmetic)
      • Indexed: specify 2 registers, or 2 registers + a constant
      • Scaled indexed: the index register is multiplied by the scale s (1, 2, 4, or 8)

     

  • Some examples with movq

    move_examples.png

     

  • Indexed Addressing Modes

     

    • Memory operands can take many forms

       

    • Most General Form
      D(Rb,Ri,S)    Mem[Reg[Rb]+S*Reg[Ri]+ D]

       

    • Usually not all 3 items in parentheses

       

    • Examples:
       

      (%rbp)            Item at memory address stored in register %rbp
      -4(%rbp)          Item at (memory address stored in register %rbp) - 4
      (%rbp,%rdx)       Item at memory address computed by adding contents of %rbp and %rdx
                        (%rdx contents are not an address)
      8(%rbp,%rdx)      Item at memory address computed by adding contents of %rbp and %rdx + 8
      8(%rbp,%rdx,4)    Item at memory address computed by adding contents of %rbp and (%rdx * 4) + 8
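The general form D(Rb,Ri,S) is just arithmetic on register contents, which a one-line C function can express. The function name is hypothetical; a missing base or index register is modeled as 0 and a missing scale as 1.

```c
#include <stdint.h>

/* Address computed by the operand D(Rb,Ri,S): Reg[Rb] + S*Reg[Ri] + D.
   rb and ri are the register contents, not register names. */
uint64_t effective_address(int64_t d, uint64_t rb, uint64_t ri, unsigned s) {
    return rb + (uint64_t)s * ri + (uint64_t)d;
}
```

For example, with 0x1000 in %rbp and 3 in %rdx, the operand 8(%rbp,%rdx,4) refers to address 0x1000 + 4*3 + 8 = 0x1014, and -4(%rbp) refers to 0xFFC.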

     

  • C examples

     

    • Assuming relevant data is in memory, not a register

       

       

      Data type    C example     Instruction example
      int          int x = 3;    movl $3,-4(%rbp)
      int []       y[i] = 0;     movl $0,-32(%rbp,%rdx,4)
      int *        p = &x;       leaq -4(%rbp),%rdx

       

    • Scale is dependent on datatype size
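The scale in an operand like -32(%rbp,%rdx,4) is the element size, and C's array indexing does the same multiplication behind the scenes. A short sketch with made-up function names, assuming int is 4 bytes as in the data format table:

```c
#include <stddef.h>

/* Byte distance from the start of the array to element i, measured with pointers. */
ptrdiff_t offset_by_pointer(const int *y, size_t i) {
    return (const char *)&y[i] - (const char *)y;
}

/* The same distance computed the way the scaled index does: i * element size. */
size_t offset_by_scale(size_t i) {
    return i * sizeof(int);
}
```

Both give 12 for i = 3. For an array of long or of pointers the scale would be 8, and for char it would be 1.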

     

  • lea: Load Effective Address

     

     

  • Loads a memory address but does not retrieve/store data from that address

     

    leaq -4(%rbp), %rdx    Store in %rdx the address computed by
                           subtracting 4 from the contents of %rbp

     

  • Can also be used for arithmetic computations

     

    leaq    (,%rdi,4), %rax    # %rax = x * 4

    does not access memory

 

  • Exercise

     

    • Assume the following values are stored at the indicated memory addresses and registers

       

      Address     Value
      0x100       0x000000FF
      0x104       0x000000AB
      0x108       0x00000013
      0x10C       0x00000011

      Register    Value
      %rax        0x0000000000000100
      %rcx        0x0000000000000001
      %rdx        0x0000000000000003
    • Fill in the following table showing the values for the indicated operands: 
       
      Operand              Value
      %rax
      $0x108
      (%rax)
      4(%rax)
      9(%rax, %rdx)
      256(%rcx, %rdx)
      0xFC(, %rcx, 4)
      (%rax, %rdx, 4)

       

    • Fill in the following table showing the effect of the instructions below. Assume values in memory and registers as specified above. Assume these instructions are not sequential.

       

      Instruction                      Destination    New value in destination
      movq %rax, (%rax)
      addl 4(%rax), %ecx
      subq %rdx, (%rax, %rcx, 4)
      movq $-1, 4(%rax)
      movzbq $0x61, 4(%rax,%rcx,4)
      movsbq $-1, %rcx

       

       

    • jmp instruction and related

       

      • Usually based on immediately preceding cmp

         

        jmp    Unconditional jump
        je     Jump if equal
        jne    Jump if not equal
        jg     Jump if greater (dest > src)
        jge    Jump if greater or equal (dest ≥ src)
        jl     Jump if less
        jle    Jump if less or equal

         

       

    • Flow of control

       

      • No if...else or looping constructs in Assembly language
      • Instead, it uses set and test for effect of simple if...else, and jmp for more complex control structures
      • Similar to goto statement in C

       

    • Example of goto in C

       

      #include <stdio.h>
      
      void print_equal(int x, int y) {
        if (x != y) {
          printf("Not equal\n");
          goto done;
        }
        printf("Equal\n");
       done: return;
      }
      
      
      .LC0:
              .string "Not equal"
      .LC1:
              .string "Equal"
              .text
              .p2align 4,,15
      .globl print_equal
              .type   print_equal, @function
      print_equal:
      .LFB11:
              .cfi_startproc
              cmpl    %esi, %edi
              je      .L2
              movl    $.LC0, %edi
              jmp     puts
              .p2align 4,,10
              .p2align 3
      .L2:
              movl    $.LC1, %edi
              jmp     puts

       

    • Example: scale.c

       

      void scale(char s1, char *s2) {
        if (s1 == 'F')
          *s2 = 'C';
        else if (s1 == 'C')
          *s2 = 'F';
      }
      
      
      scale:
              cmpb    $70, %dil        # s1 == 'F'
              je      .L6              # Yes? goto L6
              cmpb    $67, %dil        # s1 == 'C'
              je      .L7              # Yes? goto L7
              ret
      .L7:
              movb    $70, (%rsi)      # *s2 = 'F'
              ret
      .L6:
              movb    $67, (%rsi)      # *s2 = 'C'
              ret

       

    • Loop example

       

      int strlen406(char s[ ]) {
        int i=0;
        while (s[i] != '\0')
          i++;
        return i;
      }
      
      
      strlen406:
              movl    $-1, %eax           # %eax = -1 (it is incremented before each test)
      .L6:
              movzbl  (%rdi), %edx        # %edx = *s
              addl    $1, %eax            # %eax += 1
              addq    $1, %rdi            # s += 1
              testb   %dl, %dl            # Is the byte just read 0?
              jne     .L6                 # If not, goto .L6
              ret
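Since assembly has only compares and jumps, it can help to rewrite the while loop in C using nothing but if and goto. The sketch below is one such rewrite of strlen406 (the name strlen_goto is made up); it tests at the top of the loop and is meant to show the jump structure, not to match the assembly listing line for line.

```c
/* strlen406 written with goto only: one conditional jump out of the loop,
   one unconditional jump back to the top. */
int strlen_goto(const char *s) {
    int i = 0;
loop:
    if (*s == '\0')
        goto done;      /* like a conditional jump past the loop */
    i++;
    s++;
    goto loop;          /* like jmp back to the loop label */
done:
    return i;
}
```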

       

    • Another loop example

       

      f:
              testl   %edi, %edi
              movl    $1, %edx
              movl    $1, %eax
              jle     .L3
      .L6:
              imull   %edx, %eax
              addl    $1, %edx
              cmpl    %edx, %edi
              jge     .L6
      .L3:
              ret

       

    • Some reverse engineering examples

      See this assembly language. What does each function do?

       

    • Example with pointers

      swap.png

      swap_diagram.png

       

    • Program stack

       

      • Function environments are stored on a portion of the address space called the program stack

        programstack.png

        Text, p.190: "By convention, we draw stacks upside down, so that the "top" of the stack is shown at the bottom." My diagrams do not follow this convention. The top of the stack in my diagrams is at the top.

         

      • When a function is called, it creates a stack frame when necessary

        functioncalls.png

         

      • The stack grows and shrinks as functions are called and return.

        returnh.png

       

    • Stack frame organization

      stackframe.png

       

      • Compiler optimization eliminates the need for stack frames for some functions

       

    • push and pop

       

      • When a function is called, it forms its own stack frame
      • It also needs to store enough information for the previous stack frame to be restored upon return
        f() {
          ...
          g()
        }

        Start of g:

         

        push %rbp
        movq %rsp, %rbp

         

      • push %rbp is a "macro"; short for 2 instructions

         

        subq $8,%rsp
        movq %rbp,(%rsp)

         

      • pop %rbp is short for

         

        movq (%rsp), %rbp
        addq $8, %rsp
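The two-instruction expansions of push and pop can be acted out on a toy stack in C. Everything here (the struct, its size, the function names) is invented for illustration; rsp is kept as a byte offset into a small array instead of a real address, and there are no overflow checks.

```c
#include <stdint.h>

enum { STACK_WORDS = 16 };

struct machine {
    uint64_t mem[STACK_WORDS];  /* toy stack memory, one 8-byte word per slot */
    uint64_t rsp;               /* byte offset of the top of the stack */
    uint64_t rbp;
};

void machine_init(struct machine *m) {
    m->rsp = STACK_WORDS * 8;   /* empty stack: rsp is just past the highest slot */
    m->rbp = 0;
}

void push_rbp(struct machine *m) {
    m->rsp -= 8;                    /* subq $8, %rsp    */
    m->mem[m->rsp / 8] = m->rbp;    /* movq %rbp, (%rsp) */
}

void pop_rbp(struct machine *m) {
    m->rbp = m->mem[m->rsp / 8];    /* movq (%rsp), %rbp */
    m->rsp += 8;                    /* addq $8, %rsp    */
}
```

Pushing, overwriting rbp (as movq %rsp, %rbp does at the start of g), and then popping restores the caller's rbp and leaves rsp where it started, which is the whole point of the pair.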

       

       

       

    • Example

       

      int f(int x) {
        return g(x);
      }
      
      int g(int y) {
        return y * 2;
      }

      Assembly language for g

       

      Compiler generates          Equivalent
      
      g:                                  
              pushq   %rbp                subq $8, %rsp
                                          movq %rbp, (%rsp)
              movq    %rsp, %rbp          movq %rsp, %rbp
              subq    $16, %rsp           
              movl    %edi, -4(%rbp)      ...
              movl    -4(%rbp), %eax
              addl    %eax, %eax
              addq    $16, %rsp
              leave                       movq %rbp, %rsp
                                          movq (%rsp), %rbp
                                          addq $8, %rsp
              ret

       

       

    • Creation of new stack frame

      g_stackframe.png

       

    • popq

       

      • Pops the program stack

         

        popq %rbp

        is equivalent to

         

        movq (%rsp), %rbp
        addq $8, %rsp

         

        Compiler generates          Equivalent
        
        g:                             
                pushq   %rbp           
                movq    %rsp, %rbp     
                subq    $4, %rsp       
                movl    %edi, -4(%rbp) 
                movl    -4(%rbp), %eax
                addl    %eax, %eax
                addq    $4, %rsp            addq $4, %rsp
                leave                       movq %rbp, %rsp
                                            movq (%rsp), %rbp
                                            addq $8, %rsp
                ret                         ret

       

    • call and ret

       

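      • Also macros; callq 0x400f01 conceptually means

         

        pushq %rip        # push the return address (the address of the instruction after the call)
        jmpq   0x400f01
        and ret conceptually means

         

        popq %rip
        jmpq *%rip   # this is an indirect jump

        (%rip cannot actually be named as an operand of push or pop; this is only a way to picture what happens)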

        return.png
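A toy model of call and ret in C, in the same spirit as the macro expansions above. The struct and function names are hypothetical, and rip here holds the address of the instruction currently being executed, so the return address is that address plus the length of the call instruction.

```c
#include <stdint.h>

struct cpu {
    uint64_t stack[16];  /* toy stack; grows toward lower indices */
    int sp;              /* index of the top slot */
    uint64_t rip;        /* address of the instruction being executed */
};

void cpu_init(struct cpu *c, uint64_t start) {
    c->sp = 16;          /* empty stack */
    c->rip = start;
}

/* callq target, where the call instruction itself is `length` bytes long. */
void do_call(struct cpu *c, uint64_t target, uint64_t length) {
    uint64_t return_address = c->rip + length;  /* the instruction after the call */
    c->stack[--c->sp] = return_address;         /* push it */
    c->rip = target;                            /* jump */
}

/* ret: pop the return address into the program counter. */
void do_ret(struct cpu *c) {
    c->rip = c->stack[c->sp++];
}
```

Using addresses from the debugging session in these notes: a 5-byte callq at 0x4005ab that targets 0x4005c8 pushes 0x4005b0, and the matching ret resumes there.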

       

    • Function parameters and the program stack

       

      • In x86-64, parameters are usually passed through argument registers
      • %rdi, %rsi, %rdx, %rcx, %r8, %r9
      • In the unlikely event that more than 6 parameters are passed to a function, they are passed on the program stack
      • To illustrate, we will switch to IA-32

         

        int g(int,int);
        
        int f() {
          int a = 1 , b = 2;
          return g(a,b);
        }
        
        int g(int x, int y) {
          return x + y;
        }
        
        int main() { 
            f();
        }
        
        f: 
           pushl   %ebp
           movl    %esp, %ebp
           subl    $24, %esp
           movl    $1, -8(%ebp)     # local var a
           movl    $2, -4(%ebp)     # local var b
           movl    -4(%ebp), %eax
           movl    %eax, 4(%esp)    # pass b as parameter
           movl    -8(%ebp), %eax   # pass a as parameter 
           movl    %eax, (%esp)
           call    g
           leave
           ret
        
        g:
           pushl   %ebp
           movl    %esp, %ebp
           movl    12(%ebp), %eax   # parameter y
           movl    8(%ebp), %edx    # parameter x
           addl    %edx, %eax
           popl    %ebp
           ret
        parameters1.png

        parameters2.png

         

      • Note that positive offsets/displacements refer to parameters, because of the organization of the program stack

       

    • x86-64 Example

       

      #include <stdio.h>

      long times4(long x);
      long times3(long x);
      long times2(long x);

      int main() {
        long x;
        printf("Value\n");  
        scanf("%ld", &x);
        printf("4 * x = %ld\n", 
               times4(x));  
        return 0;
      }
      
      long times4(long x) {
        long total = x;
        total += times3(x);
        return total;
      }
      
      long times3(long x) {
        long total = x;
        total += times2(x);
        return total;
      }
      
      long times2(long x) {
        return x + x; 
      }

       

      Debugging session

       

      gdb inspection

       

      (gdb) disas main
      
      ...
      0x4005ab <+55>:    callq  0x4005c8 <times4>
      0x4005b0 <+60>:    mov    %eax,%edx
      ...
      
      (gdb) disas times4
      
      0x4005c8 <+0>:     push   %rbp
      0x4005c9 <+1>:     mov    %rsp,%rbp
      0x4005cc <+4>:     sub    $0x8,%rsp
      0x4005d0 <+8>:     mov    %rdi,-0x8(%rbp)
      0x4005d4 <+12>:    callq  0x4005e4 <times3>
      0x4005d9 <+17>:    add    %rax,-0x8(%rbp)
      0x4005dd <+21>:    mov    -0x8(%rbp),%rax
      0x4005e1 <+25>:    leaveq
      0x4005e2 <+26>:    retq
      
      (gdb) break *times4
      (gdb) run
      
      Value
      10
      Breakpoint 1, 0x4005c8 in times4 ()
      
      (gdb) print/x $rbp
      $5 = 0x7fffffffea00  <-- main bottom of stack
      (gdb) print/x $rsp
      $6 = 0x7fffffffe9e8  <-- top of stack
      (gdb) x/x $rsp
      0x004005b0          <-- main ret addr
      (gdb) print/d $rdi
      $8 = 10              <-- parameter
      (gdb) cont
      Breakpoint 5, 0x4005d4 in times4 ()
      (gdb) disas
      ...
         0x4005c8 <+0>:     push   %rbp
         0x4005c9 <+1>:     mov    %rsp,%rbp
         0x4005cc <+4>:     sub    $0x8,%rsp
         0x4005d0 <+8>:     mov    %rdi,-0x8(%rbp)
      => 0x4005d4 <+12>:    callq  0x4005e4 <times3>
         0x4005d9 <+17>:    add    %rax,-0x8(%rbp)
      ...
      (gdb) print/x $rbp
      $9 = 0x7fffffffe9e0       <--- times4 bottom
      (gdb) print/x $rsp       
      $10 = 0x7fffffffe9d8      <--- top
      (gdb) break *times4
      (gdb) break *times3
      Breakpoint 1 at 0x4005e4
      (gdb) break *times3+12
      Breakpoint 2 at 0x4005f0
      (gdb) run
      Starting program:
      Value for x
      10
      long times3(long x) {
        long total = x;
        total += times2(x);
        return total;
      Breakpoint 1, 0x00000000004005e4 in times3 ()
      (gdb) print/x $rbp
      $1 = 0x7fffffffe9e0
      (gdb) print/x $rsp
      $2 = 0x7fffffffe9d0
      (gdb) x/x $rsp
      0x7fffffffe9d0: 0x004005d9
      Breakpoint 4 at 0x4005f0
      (gdb) cont
      Continuing.

       

      Breakpoint 2, 0x00000000004005f0 in times3 ()
      (gdb) print/x $ebp
      $3 = 0xffffe9c8
      (gdb) print/x $rbp
      $4 = 0x7fffffffe9c8
      (gdb) print/x $rsp
      $5 = 0x7fffffffe9c0
      (gdb) x/x $rsp
       0x0000000a
      (gdb) print/x $rdi
      $8 = 0xa
      (gdb) disas times2
      0x400600 <+0>:     lea    (%rdi,%rdi,1),%rax
      0x400604 <+4>:     retq
      long times2(long x) {
        return x + x;
      }
      (gdb) break *times2
      Breakpoint 5 at 0x400600
      (gdb) break *times3 + 17
      Breakpoint 6 at 0x4005f5
      (gdb) cont
      Continuing.
      
      Breakpoint 5, 0x0000000000400600 in times2 ()
      (gdb) print/x $rbp
      $10 = 0x7fffffffe9c8
      (gdb) print/x $rsp
      $12 = 0x7fffffffe9b8
      (gdb) x/x $rsp
      0x7fffffffe9b8: 0x004005f5
      (gdb) x/i 0x004005f5
      0x4005f5 :        add    %rax,-0x8(%rbp)
      int times3(int x) {
        int total = x;
        total += times2(x);
        return total;
      }
      (gdb) cont
      Continuing.
      
      Breakpoint 6, 0x00000000004005f5 in times3 ()
      (gdb) print/x $rsp
      $13 = 0x7fffffffe9c0
      (gdb) print/x $rbp
      $14 = 0x7fffffffe9c8
      (gdb) print/d $rax
      $16 = 20
      (gdb) x/x $rbp
      0x7fffffffe9c8: 0xffffe9e0
      (gdb) x/x $rbp+8
      0x7fffffffe9d0: 0x004005d9
      (gdb) x/i 0x004005d9
         0x4005d9 :        add    %rax,-0x8(%rbp)
      Breakpoint 3, 0x00000000004005d9 in times4 ()
      (gdb) print/x $rbp
      $18 = 0x7fffffffe9e0
      (gdb) print/x $rsp
      $19 = 0x7fffffffe9d8
      (gdb) print/d $rax
      $20 = 30
      (gdb) cont
      Continuing.
      4 * x = 40

       


       

    • Recursion

       

      • recursive function calls itself

         

        • C example: compute n!

           

          int f(int n) {
            if (n == 1) return 1;
            else return n * f(n-1);
          }

           

        • Assembly

           

          f:
              pushq   %rbp                .L2:
              movq    %rsp, %rbp                movl    -4(%rbp), %eax 
              subq    $4, %rsp                  subl    $1, %eax
              movl    %edi, -4(%rbp)            movl    %eax, %edi
              cmpl    $1, -4(%rbp)              call    f
              jne     .L2                       imull   -4(%rbp), %eax
              movl    $1, %eax            .L3:
              jmp     .L3                       addq $4, %rsp 
                                                popq %rbp
                                                ret

           

        • A new stack frame is created each time this function is called (including recursive calls)
        • For each call to f, the Argument 1 register (%rdi) must contain the parameter n
        • But n changes with each call, so the stack frame for each instance of f has to allocate 4 bytes in order to store its own n
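To see why each instance needs its own saved copy of n, the recursion can be unrolled by hand with an explicit array standing in for the stack frames. This is a sketch, not what the compiler generates: the function name, the array, and the range check are all additions for illustration (the check also keeps the result within an int).

```c
/* Factorial with an explicit "stack" of saved n values.
   saved_n[k] plays the role of the 4-byte slot -4(%rbp) in the k-th frame. */
int factorial_explicit_stack(int n) {
    int saved_n[16];
    int depth = 0;
    int result;

    if (n < 1 || n > 12)
        return -1;               /* added guard: outside this range is not handled */

    /* "Call" phase: each call saves its own n, then calls f(n-1). */
    while (n != 1) {
        saved_n[depth++] = n;
        n = n - 1;
    }

    result = 1;                  /* base case: f(1) returns 1 */

    /* "Return" phase: each frame multiplies the returned value by its saved n. */
    while (depth > 0)
        result *= saved_n[--depth];

    return result;
}
```

For f(4) the saved values are 4, 3, 2, and the returns multiply them back in the opposite order: 1 * 2 * 3 * 4 = 24.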

         

      • Step-by-step

        Call f(4)

        f:
                pushq   %rbp                .L2:
                movq    %rsp, %rbp                movl    -4(%rbp), %eax 
                subq    $4, %rsp                  subl    $1, %eax
                movl    %edi, -4(%rbp)            movl    %eax, %edi
                cmpl    $1, -4(%rbp)              call    f
                jne     .L2                       imull   -4(%rbp), %eax
                movl    $1, %eax            .L3:
                jmp     .L3                        leave
                                                   ret
        recursion.png










       

Reposted from: https://my.oschina.net/tsh/blog/1613687
