NASM Examples
Getting Started
Here is a very short NASM program that displays "Hello, World" on a line then exits. Like most programs on this page, you link it with a C library:
; ----------------------------------------------------------------------------
; helloworld.asm
;
; This is a Win32 console program that writes "Hello, World" on one line and
; then exits.  It needs to be linked with a C library.
; ----------------------------------------------------------------------------

        global  _main
        extern  _printf

        section .text
_main:
        push    message
        call    _printf
        add     esp, 4
        ret
message:
        db      'Hello, World', 10, 0
To assemble, link and run this program under Windows:
    nasm -fwin32 helloworld.asm
    gcc helloworld.obj
    a
Under Linux, you'll need to remove the leading underscores from function names, and then run:
    nasm -felf helloworld.asm
    gcc helloworld.o
    ./a.out
Understanding Calling Conventions
If you are writing assembly language functions that will link with C, and you're using gcc, you must obey the gcc calling conventions. These are:
- Parameters are pushed on the stack, right to left, and are removed by the caller after the call.
- After the parameters are pushed, the call instruction is executed, so when the called function gets control, the return address is at [esp], the first parameter is at [esp+4], etc.
- If you want to use any of the following registers: ebx , esi , edi , ebp , ds , es , ss , you must save and restore their values. In other words, these values must not change across function calls. When you make calls, you can assume these won't change (as long as everyone plays by the rules).
- A function that returns an integer value should return it in eax, a 64-bit integer in edx:eax, and a floating point value on top of the FPU stack.
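To make the stack layout in the second rule concrete, here is a small C model of the stack at the moment a callee gets control. It is only a sketch: the Stack type and the push32, peek32 and enter_call helpers are hypothetical names invented for this illustration, and the array merely imitates a 32-bit stack that grows downward.

```c
#include <stdint.h>

/* A toy model of a 32-bit stack; every name here is hypothetical. */
#define STACK_WORDS 64

typedef struct {
    uint32_t mem[STACK_WORDS]; /* one slot per 4-byte stack cell        */
    int esp;                   /* index of the top cell (grows downward) */
} Stack;

/* Mimics "push": move esp down one cell, then store the value there. */
void push32(Stack *s, uint32_t value) {
    s->esp -= 1;
    s->mem[s->esp] = value;
}

/* Mimics reading the dword at [esp + byte_offset]. */
uint32_t peek32(const Stack *s, int byte_offset) {
    return s->mem[s->esp + byte_offset / 4];
}

/* Mimics the caller's side of f(a, b, c): push right to left, then "call". */
void enter_call(Stack *s, uint32_t a, uint32_t b, uint32_t c,
                uint32_t return_address) {
    s->esp = STACK_WORDS;      /* start with an empty stack          */
    push32(s, c);              /* last parameter is pushed first     */
    push32(s, b);
    push32(s, a);              /* first parameter is pushed last     */
    push32(s, return_address); /* what the call instruction pushes   */
}
```

After enter_call, peek32(&s, 0) yields the return address and peek32(&s, 4), peek32(&s, 8) and peek32(&s, 12) yield a, b and c, which matches the [esp], [esp+4], ... layout described in the list.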
This program prints the first few Fibonacci numbers, illustrating how registers have to be saved and restored:
; ----------------------------------------------------------------------------
; fib.asm
;
; This is a Win32 console program that writes the first 40 Fibonacci numbers.
; It needs to be linked with a C library.
; ----------------------------------------------------------------------------

        global  _main
        extern  _printf

        section .text
_main:
        push    ebx                     ; we have to save this since we use it

        mov     ecx, 40                 ; ecx will count down from 40 to 0
        xor     eax, eax                ; eax will hold the current number
        xor     ebx, ebx                ; ebx will hold the next number
        inc     ebx                     ; ebx is originally 1
print:
        ; We need to call printf, but we are using eax, ebx, and ecx.  printf
        ; may destroy eax and ecx so we will save these before the call and
        ; restore them afterwards.

        push    eax
        push    ecx

        push    eax
        push    format
        call    _printf
        add     esp, 8

        pop     ecx
        pop     eax

        mov     edx, eax                ; save the current number
        mov     eax, ebx                ; next number is now current
        add     ebx, edx                ; get the new next number
        dec     ecx                     ; count down
        jnz     print                   ; if not done counting, do some more

        pop     ebx                     ; restore ebx before returning
        ret
format:
        db      '%10d', 0
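For comparison, here is a rough C rendering of the same loop. It is a sketch rather than a translation the page itself provides: fib_fill and fib_print are hypothetical names, and the local variables are named after the registers only to make the correspondence visible. In C the compiler takes care of saving and restoring registers around the printf call, which is exactly the work fib.asm has to do by hand.

```c
#include <stdio.h>

/* Fills out[0..count-1] with the Fibonacci numbers 0, 1, 1, 2, ...
   The variable names mirror the registers used in fib.asm. */
void fib_fill(unsigned int out[], int count) {
    unsigned int eax = 0;              /* current number */
    unsigned int ebx = 1;              /* next number    */
    for (int ecx = count; ecx > 0; ecx--) {
        out[count - ecx] = eax;
        unsigned int edx = eax;        /* save the current number    */
        eax = ebx;                     /* next number is now current */
        ebx += edx;                    /* get the new next number    */
    }
}

/* Prints up to 40 numbers using the same "%10d" format as fib.asm. */
void fib_print(int count) {
    unsigned int numbers[40];
    if (count > 40) {
        count = 40;                    /* stay within the local buffer */
    }
    fib_fill(numbers, count);
    for (int i = 0; i < count; i++) {
        printf("%10d", (int)numbers[i]);
    }
}
```

Calling fib_print(40) from a main of your own should produce the same sequence as the assembly program, assuming a 32-bit or wider int.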
Mixing C and Assembly Language
This program is just a simple function that takes in three integer parameters and returns the maximum value. It shows that the parameters will be at [esp+4] , [esp+8] and [esp+12] , and that the value gets returned in eax .
; ----------------------------------------------------------------------------
; maxofthree.asm
;
; NASM implementation of a function that returns the maximum value of its
; three integer parameters.  The function has prototype:
;
;   int maxofthree(int x, int y, int z)
;
; Note that only eax, ecx, and edx were used so no registers had to be saved
; and restored.
; ----------------------------------------------------------------------------

        global  _maxofthree

        section .text
_maxofthree:
        mov     eax, [esp+4]
        mov     ecx, [esp+8]
        mov     edx, [esp+12]
        cmp     eax, ecx
        cmovl   eax, ecx
        cmp     eax, edx
        cmovl   eax, edx
        ret
Here is a C program that calls the assembly language function.
/*
 * callmaxofthree.c
 *
 * Illustrates how to call the maxofthree function we wrote in assembly
 * language.
 */

#include <stdio.h>

int maxofthree(int, int, int);

int main() {
    printf("%d\n", maxofthree(1, -4, -7));
    printf("%d\n", maxofthree(2, -6, 1));
    printf("%d\n", maxofthree(2, 3, 1));
    printf("%d\n", maxofthree(-2, 4, 3));
    printf("%d\n", maxofthree(2, -6, 5));
    printf("%d\n", maxofthree(2, 4, 6));
    return 0;
}
To assemble, link and run this two-part program (on Windows):
    nasm -fwin32 maxofthree.asm
    gcc callmaxofthree.c maxofthree.obj
    a
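When checking the assembly function, it can help to have a plain C version to compare against. The following is a minimal sketch; maxofthree_ref is a hypothetical name, and the comments only indicate which instructions of maxofthree.asm each line roughly corresponds to.

```c
/* Reference version of maxofthree; the name maxofthree_ref is hypothetical. */
int maxofthree_ref(int x, int y, int z) {
    int result = x;          /* mov eax, [esp+4]              */
    if (result < y) {        /* cmp eax, ecx                  */
        result = y;          /* cmovl eax, ecx                */
    }
    if (result < z) {        /* cmp eax, edx                  */
        result = z;          /* cmovl eax, edx                */
    }
    return result;           /* the value is returned in eax  */
}
```

For the six calls in callmaxofthree.c this version returns 1, 2, 3, 4, 5 and 6 in that order, so the assembly version should print the same values.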
Command Line Arguments
You know that in C, main is just a plain old function, and it has a couple of parameters of its own:
int main(int argc, char** argv)
Here is a program that uses this fact to echo its command line arguments, one per line:
; ----------------------------------------------------------------------------
; echo.asm
;
; NASM implementation of a program that displays its command line arguments,
; one per line.
; ----------------------------------------------------------------------------

        global  _main
        extern  _printf

        section .text
_main:
        mov     ecx, [esp+4]            ; argc
        mov     edx, [esp+8]            ; argv
top:
        push    ecx                     ; save registers that printf wastes
        push    edx

        push    dword [edx]             ; the argument string to display
        push    format                  ; the format string
        call    _printf
        add     esp, 8                  ; remove the two parameters

        pop     edx                     ; restore registers printf used
        pop     ecx

        add     edx, 4                  ; point to next argument
        dec     ecx                     ; count down
        jnz     top                     ; if not done counting keep going

        ret
format:
        db      '%s', 10, 0
Note that as far as the C library is concerned, command line arguments are always strings. If you want to treat them as integers, call atoi. Here's a neat program to compute x^y:
; ----------------------------------------------------------------------------
; power.asm
;
; Command line application to compute x^y
; Syntax: power x y
; x and y are integers
; ----------------------------------------------------------------------------

        global  _main
        extern  _atoi
        extern  _printf

        section .text
_main:
        push    ebx                     ; save the registers that must be saved
        push    esi
        push    edi

        mov     eax, [esp+16]           ; argc (it's not at [esp+4] now :-))
        cmp     eax, 3                  ; must have exactly two arguments
        jne     error1

        mov     ebx, [esp+20]           ; argv
        push    dword [ebx+4]           ; argv[1]
        call    _atoi
        add     esp, 4
        mov     esi, eax                ; x in esi
        push    dword [ebx+8]           ; argv[2]
        call    _atoi
        add     esp, 4
        cmp     eax, 0
        jl      error2
        mov     edi, eax                ; y in edi

        mov     eax, 1                  ; start with answer = 1
check:
        test    edi, edi                ; we're counting y down to 0
        jz      gotit                   ; done
        imul    eax, esi                ; multiply in another x
        dec     edi
        jmp     check
gotit:                                  ; print report on success
        push    eax
        push    answer
        call    _printf
        add     esp, 8
        jmp     done
error1:                                 ; print error message
        push    badArgumentCount
        call    _printf
        add     esp, 4
        jmp     done
error2:                                 ; print error message
        push    negativeExponent
        call    _printf
        add     esp, 4
done:                                   ; restore saved registers
        pop     edi
        pop     esi
        pop     ebx
        ret

answer:
        db      '%d', 10, 0
badArgumentCount:
        db      'Requires exactly two arguments', 10, 0
negativeExponent:
        db      'The exponent may not be negative', 10, 0
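The core of power.asm, stripped of argument checking and printing, can be sketched in C as follows. The names power_ref and power_from_strings and the 0 / -1 return codes are assumptions made for this illustration; power.asm itself prints a message instead of returning an error code. The sketch also assumes the answer fits in an int.

```c
#include <stdlib.h>

/* Computes x^y by repeated multiplication, like the loop in power.asm.
   Stores the answer in *result and returns 0, or returns -1 (a hypothetical
   error code) when y is negative.  Assumes the answer fits in an int. */
int power_ref(int x, int y, int *result) {
    if (y < 0) {
        return -1;                 /* power.asm jumps to error2 here */
    }
    int answer = 1;                /* start with answer = 1          */
    while (y != 0) {               /* counting y down to 0           */
        answer *= x;               /* multiply in another x          */
        y--;
    }
    *result = answer;
    return 0;
}

/* Converts both arguments with atoi first, as power.asm does with
   argv[1] and argv[2]. */
int power_from_strings(const char *x_text, const char *y_text, int *result) {
    return power_ref(atoi(x_text), atoi(y_text), result);
}
```

For example, power_from_strings("2", "10", &r) stores 1024 in r, which is what running "power 2 10" should print.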
Floating Point Instructions
Here is an example that uses only two floating point instructions, fldz and fadd.
; ----------------------------------------------------------------------------
; sum.asm
;
; NASM implementation of a function that returns the sum of all the elements
; in a floating-point array.  The function has prototype:
;
;   double sum(double array[], int length)
; ----------------------------------------------------------------------------

        global  _sum

        section .text
_sum:
        mov     edx, [esp+4]            ; address of the array
        mov     ecx, [esp+8]            ; length of array
        fldz                            ; initialize the sum to 0
        cmp     ecx, 0                  ; guard against non-positive lengths!
        jle     done
next:
        fadd    qword [edx]             ; add in the current array element
        add     edx, 8                  ; move to next array element
        dec     ecx                     ; count down
        jnz     next                    ; if not done counting, continue
done:
        ret                             ; return value already in st0
Data Sections
The text section is read-only on most operating systems, so you might find the need for a data section. On most operating systems, the data section is only for initialized data, and you have a special .bss section for uninitialized data. Here is a program that averages the command line arguments, expected to be integers, and displays the result as a floating point number. Note that there is no instruction to push an 8-byte value, so we fake it by manipulating esp.
; ----------------------------------------------------------------------------
; average.asm
;
; NASM implementation of a program that treats all its command line arguments
; as integers, and displays their average as a floating point number.  This
; program uses a data section to store intermediate results, not that it has
; to, but only to illustrate how data sections are used.
; ----------------------------------------------------------------------------

        global  _main
        extern  _printf
        extern  _atoi

        section .text
_main:
        mov     ecx, [esp+4]            ; argc
        dec     ecx                     ; don't count program name
        jz      nothingToAverage
        mov     [count], ecx            ; save number of real arguments
        mov     edx, [esp+8]            ; argv
accumulate:
        push    ecx                     ; save values across call to atoi
        push    edx
        push    dword [edx+ecx*4]       ; argv[ecx]
        call    _atoi                   ; now eax has the int value of arg
        add     esp, 4
        pop     edx                     ; restore registers after atoi call
        pop     ecx
        add     [sum], eax              ; accumulate sum as we go
        dec     ecx
        jnz     accumulate              ; more arguments?
average:
        fild    dword [sum]
        fild    dword [count]
        fdivp   st1, st0                ; sum / count
        sub     esp, 8                  ; make room for quotient on stack
        fstp    qword [esp]             ; "push" quotient
        push    format                  ; push format string
        call    _printf
        add     esp, 12                 ; 4 bytes format, 8 bytes number
        ret

nothingToAverage:
        push    error
        call    _printf
        add     esp, 4
        ret

        section .data
count:  dd      0
sum:    dd      0
format: db      '%.15f', 10, 0
error:  db      'There are no command line arguments to average', 10, 0
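The computation in average.asm corresponds roughly to the C function below. This is a sketch under assumptions: average_ref is a hypothetical name, args stands for argv with the program name already skipped, and returning 0.0 for an empty list is a convention chosen here (average.asm prints an error message instead).

```c
#include <stdlib.h>

/* Averages `count` strings parsed with atoi.  Returns 0.0 when count <= 0,
   a convention chosen for this sketch only. */
double average_ref(int count, char *args[]) {
    if (count <= 0) {
        return 0.0;
    }
    int sum = 0;
    for (int i = count - 1; i >= 0; i--) {   /* average.asm also walks backwards */
        sum += atoi(args[i]);                /* accumulate sum as we go          */
    }
    return (double)sum / (double)count;      /* fild, fild, fdivp                */
}
```

Printing the result with printf("%.15f\n", average_ref(count, args)) passes an 8-byte double, which is why the assembly version has to reserve 8 bytes on the stack for the quotient.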
Recursion
Perhaps surprisingly, there's nothing out of the ordinary required to implement recursive functions. You push parameters on the stack, after all! Here's an example. In C:
int factorial(int n) { return (n <= 1) ? 1 : n * factorial(n-1); }
In assembly language:
; ----------------------------------------------------------------------------
; factorial.asm
;
; Illustration of a recursive function.
; ----------------------------------------------------------------------------

        global  _factorial

        section .text
_factorial:
        mov     eax, [esp+4]            ; n
        cmp     eax, 1                  ; n <= 1
        jnle    L1                      ; if not, go do a recursive call
        mov     eax, 1                  ; otherwise return 1
        jmp     L2
L1:
        dec     eax                     ; n-1
        push    eax                     ; push argument
        call    _factorial              ; do the call, result goes in eax
        add     esp, 4                  ; get rid of argument
        imul    eax, [esp+4]            ; n * factorial(n-1)
L2:
        ret
SIMD Parallelism
The 64-bit MMX registers can do eight byte operations in parallel, or four (16-bit) word operations in parallel, or two (32-bit) doubleword operations in parallel. The 128-bit XMMs can do 16 byte, 8 word, or 4 doubleword operations in parallel, and do parallel floating-point computations too (4 single precision or 2 double precision). Here is a simple function that sums two arrays of 16-bit short ints, four at a time:
; ----------------------------------------------------------------------------
; mmxarrayadd.asm
;
; NASM implementation of a function that adds two short arrays.
;
;   void add(short a[], short b[], int n)
; ----------------------------------------------------------------------------

        global  _add

        section .text
_add:
        push    ebx                     ; callee save register
        mov     eax, [esp+8]            ; eax points to a
        mov     edx, [esp+12]           ; edx points to b
        mov     ecx, [esp+16]           ; ecx <- number of items in each array
        or      ecx, ecx                ; guard against negative lengths
        js      L4
L1:
        cmp     ecx, 4                  ; Less than 4 items left?
        jl      L2                      ; if so, handle them individually
        movq    mm0, qword [eax]        ; Get four items from a
        paddw   mm0, qword [edx]        ; Add them with next four items from b
        movq    qword [eax], mm0        ; Write them back to a
        add     eax, 8                  ; Advance a to point to next 4 words
        add     edx, 8                  ; Advance b to point to next 4 words
        sub     ecx, 4                  ; We've just handled four
        jmp     L1
L2:
        jecxz   L4                      ; Are there zero items left?
L3:
        mov     bx, word [eax]          ; One word at a time addition
        add     bx, word [edx]
        mov     word [eax], bx
        inc     eax
        inc     eax
        inc     edx
        inc     edx
        dec     ecx
        jnz     L3
L4:
        pop     ebx
        emms
        ret
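The structure of this routine (blocks of four, then leftovers one at a time) can be modelled in plain C as a reference for testing. This is a sketch, not the page's own code: wrap_add16 and add_ref are hypothetical names, short is assumed to be 16 bits wide, and the final conversion back to short relies on the two's complement wraparound that gcc and clang provide.

```c
/* Adds two 16-bit values with wraparound, as paddw and "add bx, ..." do.
   Assumes a 16-bit short. */
short wrap_add16(short x, short y) {
    unsigned int sum = (unsigned int)(unsigned short)x + (unsigned short)y;
    return (short)(unsigned short)sum;      /* keep only the low 16 bits */
}

/* Reference for the MMX routine: a[i] += b[i] for i in 0..n-1. */
void add_ref(short a[], const short b[], int n) {
    int i = 0;
    for (; n - i >= 4; i += 4) {            /* the movq/paddw part: four words per step */
        for (int lane = 0; lane < 4; lane++) {
            a[i + lane] = wrap_add16(a[i + lane], b[i + lane]);
        }
    }
    for (; i < n; i++) {                    /* fewer than four left: one word at a time */
        a[i] = wrap_add16(a[i], b[i]);
    }
}
```

A negative n leaves both arrays untouched, matching the guard at the top of the assembly routine. Note that the addition wraps: 32767 + 1 gives -32768.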
Here's another example, this time using SSE instructions on the XMM registers:
; ----------------------------------------------------------------------------
; sseexample.asm
;
; This program demonstrates a few SSE instructions, for no particular reason
; other than to show them off.
; ----------------------------------------------------------------------------

        extern  _printf
        global  _main

        section .text
_main:
        push    esi                     ; callee save register

        ; Illustrate packed square root computations
        movups  xmm3, [x]
        sqrtps  xmm0, xmm3
        movups  [y], xmm0
        call    printall

        ; Illustrate packed maximums
        movups  xmm2, [x]
        movups  xmm5, [z]
        maxps   xmm2, xmm5
        movups  [y], xmm2
        call    printall

        ; Done
        pop     esi
        ret

printall:
        mov     esi, 4
printone:
        ; Note printf will NOT ACCEPT single precision floats.
        ; We have to convert them to double precision floats.  Sigh.
        fld     dword [y-4+esi*4]
        sub     esp, 8
        fstp    qword [esp]
        push    format
        call    _printf
        add     esp, 12
        dec     esi
        jnz     printone
        ret

        section .data
        align   16
x       dd      10.0
        dd      100.0
        dd      400.0
        dd      653.2664
y       dd      0.0
        dd      0.0
        dd      0.0
        dd      0.0
z       dd      5.0
        dd      900.0
        dd      316.20
        dd      111.0
format  db      '%15.7f', 10, 0
Saturated Arithmetic
This program illustrates saturated addition.
; ----------------------------------------------------------------------------
; satexample.asm
;
; This is a short example of parallel saturated addition using paddsw.
; It takes two 64-bit quantities
;
;     80008FFF0005FEF2
;     800020E07FFE99AA
;
; and performs saturated addition on the four 16-bit blocks in parallel,
; then writes the resulting value, in hex, to standard output.  The answer
; should be
;
;     8000B0DF7FFF989C
; ----------------------------------------------------------------------------

        extern  _printf
        global  _main

        section .text
_main:
        movq    mm0, [x]
        paddsw  mm0, [y]                ; Do 4 saturated additions in parallel
        movq    [x], mm0
        push    dword [x]               ; can't push 64 bits at once
        push    dword [x+4]             ; nor does printf handle 64-bit ints
        push    format
        call    _printf
        add     esp, 12
        ret

        section .data
x       dw      0fef2h, 0005h, 8fffh, 8000h
y       dw      099aah, 7ffeh, 20e0h, 8000h
format  db      '%0x%0x', 10, 0
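To see why the answer is 8000B0DF7FFF989C, it helps to model what paddsw does to each 16-bit lane. The C sketch below does that; sat_add16 and paddsw_model are hypothetical names made up for this illustration, and the casts between uint16_t and int16_t assume the usual two's complement behavior of gcc and clang.

```c
#include <stdint.h>

/* Adds two signed 16-bit values, clamping the result to [-32768, 32767]. */
int16_t sat_add16(int16_t x, int16_t y) {
    int32_t sum = (int32_t)x + (int32_t)y;
    if (sum > INT16_MAX) {
        return INT16_MAX;          /* positive overflow saturates to 7FFF */
    }
    if (sum < INT16_MIN) {
        return INT16_MIN;          /* negative overflow saturates to 8000 */
    }
    return (int16_t)sum;
}

/* Models paddsw on two 64-bit values: four independent saturated adds. */
uint64_t paddsw_model(uint64_t x, uint64_t y) {
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        int16_t xs = (int16_t)(uint16_t)(x >> (16 * lane));
        int16_t ys = (int16_t)(uint16_t)(y >> (16 * lane));
        uint16_t sum = (uint16_t)sat_add16(xs, ys);
        result |= (uint64_t)sum << (16 * lane);
    }
    return result;
}
```

Lane by lane, from least to most significant: FEF2 + 99AA gives 989C with no saturation, 0005 + 7FFE overflows and is clamped to 7FFF, 8FFF + 20E0 gives B0DF, and 8000 + 8000 underflows and is clamped to 8000.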
Graphics
You probably have the OpenGL graphics library already on your system, so why not call it from an assembly language program:
; ----------------------------------------------------------------------------
; triangle.asm
;
; A very simple *Windows* OpenGL application using the GLUT library.  It
; draws a nicely colored triangle in a top-level application window.  One
; interesting thing is that the Windows GL and GLUT functions do NOT use the
; C calling convention; instead they use the "stdcall" convention which is
; like C except that the callee pops the parameters.
; ----------------------------------------------------------------------------

        global  _main
        extern  _glClear@4
        extern  _glBegin@4
        extern  _glEnd@0
        extern  _glColor3f@12
        extern  _glVertex3f@12
        extern  _glFlush@0
        extern  _glutInit@8
        extern  _glutInitDisplayMode@4
        extern  _glutInitWindowPosition@8
        extern  _glutInitWindowSize@8
        extern  _glutCreateWindow@4
        extern  _glutDisplayFunc@4
        extern  _glutMainLoop@0

        section .text
title:  db      'A Simple Triangle', 0
zero:   dd      0.0
one:    dd      1.0
half:   dd      0.5
neghalf:dd      -0.5

display:
        push    dword 16384
        call    _glClear@4              ; glClear(GL_COLOR_BUFFER_BIT)
        push    dword 9
        call    _glBegin@4              ; glBegin(GL_POLYGON)
        push    dword 0
        push    dword 0
        push    dword [one]
        call    _glColor3f@12           ; glColor3f(1, 0, 0)
        push    dword 0
        push    dword [neghalf]
        push    dword [neghalf]
        call    _glVertex3f@12          ; glVertex(-.5, -.5, 0)
        push    dword 0
        push    dword [one]
        push    dword 0
        call    _glColor3f@12           ; glColor3f(0, 1, 0)
        push    dword 0
        push    dword [neghalf]
        push    dword [half]
        call    _glVertex3f@12          ; glVertex(.5, -.5, 0)
        push    dword [one]
        push    dword 0
        push    dword 0
        call    _glColor3f@12           ; glColor3f(0, 0, 1)
        push    dword 0
        push    dword [half]
        push    dword 0
        call    _glVertex3f@12          ; glVertex(0, .5, 0)
        call    _glEnd@0                ; glEnd()
        call    _glFlush@0              ; glFlush()
        ret

_main:
        push    dword [esp+8]           ; push argv
        lea     eax, [esp+8]            ; get addr of argc (offset changed :-)
        push    eax
        call    _glutInit@8             ; glutInit(&argc, argv)
        push    dword 0
        call    _glutInitDisplayMode@4
        push    dword 80
        push    dword 80
        call    _glutInitWindowPosition@8
        push    dword 300
        push    dword 400
        call    _glutInitWindowSize@8
        push    title
        call    _glutCreateWindow@4
        push    display
        call    _glutDisplayFunc@4
        call    _glutMainLoop@0
        ret
Local Variables
After entering a function, we can reserve space for local variables by decrementing the stack pointer. For example, the C function
int example(int x, int y) { int a, b, c; b = 7; return x * b + y; }
can be translated as follows:
_example:
        sub     esp, 12                 ; make room for 3 ints
        mov     dword [esp+4], 7        ; b = 7
        mov     eax, [esp+16]           ; x
        imul    eax, [esp+4]            ; x * b
        add     eax, [esp+20]           ; x * b + y
        ret
After "sub esp, 12" the stack looks like:
                +---------+
         esp    |    a    |
                +---------+
         esp+4  |    b    |
                +---------+
         esp+8  |    c    |
                +---------+
         esp+12 | retaddr |
                +---------+
         esp+16 |    x    |
                +---------+
         esp+20 |    y    |
                +---------+
Stack Frames
Sometimes it is a real pain to try to keep track of the offsets of your parameters and local variables because the stack pointer keeps changing. For example, in
int example(int x, int y) {
    int a, b, c;
    ...
    f(y, a, b, b, x);
    ...
}
you cannot translate the function call as
        push    dword [esp+16]
        push    dword [esp+4]           ; WRONG! b is really now at [esp+8]
        push    dword [esp+4]           ; WRONG! b is really now at [esp+12]
        push    dword [esp]             ; WRONG! a is really now at [esp+12]
        push    dword [esp+20]          ; WRONG! y is really now at [esp+36]
        call    f
For this reason, many functions use the ebp register to index the "stack frame" of local variables and parameters, like this:
        push    ebp                     ; must save old ebp
        mov     ebp, esp                ; point ebp to this frame
        sub     esp, ___                ; make space for locals
        ...
        mov     esp, ebp                ; clean up locals
        pop     ebp                     ; restore old ebp
        ret
As long as you never change ebp throughout the function, all your local variables and parameters will always be at the same offset from ebp . The stack frame for our example function is now:
                +---------+
         ebp-12 |    a    |
                +---------+
         ebp-8  |    b    |
                +---------+
         ebp-4  |    c    |
                +---------+
         ebp    | old ebp |
                +---------+
         ebp+4  | retaddr |
                +---------+
         ebp+8  |    x    |
                +---------+
         ebp+12 |    y    |
                +---------+
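The bookkeeping behind the two layouts can be written down as a tiny C model. It is only an illustration: the Item enumeration and the esp_relative and ebp_relative functions are hypothetical names, and the offsets are simply copied from the two stack diagrams for the example function.

```c
/* The items in the example function's frame; all names are hypothetical. */
typedef enum { ITEM_A, ITEM_B, ITEM_C, ITEM_X, ITEM_Y } Item;

/* Without a frame pointer: offset from esp after "sub esp, 12" and after
   `pushes` further 4-byte pushes.  Every push moves everything 4 bytes
   further away from esp. */
int esp_relative(Item item, int pushes) {
    static const int base[] = { 0, 4, 8, 16, 20 };    /* a, b, c, x, y */
    return base[item] + 4 * pushes;
}

/* With a frame pointer: the offset from ebp does not depend on how many
   pushes have happened since the frame was set up. */
int ebp_relative(Item item, int pushes) {
    static const int fixed[] = { -12, -8, -4, 8, 12 }; /* a, b, c, x, y */
    (void)pushes;                                      /* deliberately unused */
    return fixed[item];
}
```

With this model, esp_relative(ITEM_B, 1) is 8 and esp_relative(ITEM_Y, 4) is 36, the corrected offsets noted in the WRONG! comments, while ebp_relative(ITEM_Y, 4) is still 12.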
http://www.cs.lmu.edu/~ray/notes/nasmexamples/