NASM Examples

 

Getting Started

Here is a very short NASM program that displays "Hello, World" on a line then exits. Like most programs on this page, you link it with a C library:

helloworld.asm

    ; ----------------------------------------------------------------------------
    ; helloworld.asm
    ;
    ; This is a Win32 console program that writes "Hello, World" on one line and
    ; then exits.  It needs to be linked with a C library.
    ; ----------------------------------------------------------------------------

            global  _main
            extern  _printf

            section .text
    _main:
            push    message
            call    _printf
            add     esp, 4
            ret
    message:
            db      'Hello, World', 10, 0

To assemble, link and run this program under Windows:

    nasm -fwin32 helloworld.asm
    gcc helloworld.obj
    a

Under Linux, you'll need to remove the leading underscores from function names, and execute

    nasm -felf helloworld.asm
    gcc helloworld.o
    ./a.out

Understanding Calling Conventions

If you are writing assembly language functions that will link with C, and you're using gcc, you must obey the gcc calling conventions. These are:

  • Parameters are pushed on the stack, right to left, and are removed by the caller after the call.
  • After the parameters are pushed, the call instruction is executed, so when the called function gets control, the return address is at [esp], the first parameter is at [esp+4], etc.
  • If you want to use any of the following registers: ebx, esi, edi, ebp, ds, es, ss, you must save and restore their values. In other words, these values must not change across function calls. When you make calls, you can assume these won't change (as long as everyone plays by the rules).
  • A function that returns an integer value should return it in eax, a 64-bit integer in edx:eax, and a floating point value on top of the FPU stack (st0).
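
The first two rules can be pictured with a toy model written in C. This is only a sketch: the array standing in for the stack, the push and at_esp helpers, and the functions call_subtract and callee_subtract are all made up for illustration, and a real stack holds a genuine return address rather than a placeholder.

```c
#include <stdint.h>

/* A toy model of the 32-bit stack: it grows downward, 4 bytes per slot. */
enum { STACK_BYTES = 64 };
static uint32_t stack_mem[STACK_BYTES / 4];
static uint32_t esp = STACK_BYTES;          /* byte offset of the stack top */

static void push(uint32_t value) {
    esp -= 4;
    stack_mem[esp / 4] = value;
}

/* Reads the dword at [esp + offset], as the callee would. */
static uint32_t at_esp(uint32_t offset) {
    return stack_mem[(esp + offset) / 4];
}

/* Hypothetical callee: computes first - second from its stack arguments. */
static uint32_t callee_subtract(void) {
    return at_esp(4) - at_esp(8);           /* the result would go in eax */
}

/* Caller side of subtract(first, second) under the C convention. */
uint32_t call_subtract(uint32_t first, uint32_t second) {
    push(second);                           /* rightmost parameter first */
    push(first);
    push(0xDEADBEEFu);                      /* stand-in for the return address */
    uint32_t eax = callee_subtract();
    esp += 4;                               /* ret pops the return address */
    esp += 8;                               /* caller removes the parameters */
    return eax;
}
```

Because the parameters are pushed right to left, the first one always ends up closest to the return address, which is why the callee finds it at [esp+4] no matter how many parameters follow.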

This program prints the first few Fibonacci numbers, illustrating how registers have to be saved and restored:

fib.asm

    ; ----------------------------------------------------------------------------
    ; fib.asm
    ;
    ; This is a Win32 console program that writes the first 40 Fibonacci numbers.
    ; It needs to be linked with a C library.
    ; ----------------------------------------------------------------------------

            global  _main
            extern  _printf

            section .text
    _main:
            push    ebx                     ; we have to save this since we use it

            mov     ecx, 40                 ; ecx will countdown from 40 to 0
            xor     eax, eax                ; eax will hold the current number
            xor     ebx, ebx                ; ebx will hold the next number
            inc     ebx                     ; ebx is originally 1
    print:
            ; We need to call printf, but we are using eax, ebx, and ecx.  printf
            ; may destroy eax and ecx so we will save these before the call and
            ; restore them afterwards.

            push    eax
            push    ecx

            push    eax
            push    format
            call    _printf
            add     esp, 8

            pop     ecx
            pop     eax

            mov     edx, eax                ; save the current number
            mov     eax, ebx                ; next number is now current
            add     ebx, edx                ; get the new next number
            dec     ecx                     ; count down
            jnz     print                   ; if not done counting, do some more

            pop     ebx                     ; restore ebx before returning
            ret
    format:
            db      '%10d', 0
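
For comparison, the same loop could be written in C roughly as follows. The function name print_fibonacci and its return value are assumptions added here so the result can be checked; the variable names map onto the registers used in fib.asm. Unlike the assembly loop, this sketch simply prints nothing when count is zero or negative.

```c
#include <stdio.h>

/* Mirrors the register roles in fib.asm:
   current ~ eax, next ~ ebx, remaining ~ ecx, previous ~ edx.
   Returns the last number printed (0 if nothing was printed). */
int print_fibonacci(int count) {
    int current = 0;
    int next = 1;
    int last_printed = 0;
    for (int remaining = count; remaining > 0; remaining--) {
        printf("%10d", current);
        last_printed = current;
        int previous = current;     /* mov edx, eax */
        current = next;             /* mov eax, ebx */
        next += previous;           /* add ebx, edx */
    }
    return last_printed;
}
```

Calling print_fibonacci(40) prints the same 40 numbers and returns 63245986, the last one printed. In C the compiler takes care of preserving current and remaining across the printf call, which is exactly the work the push and pop pairs do by hand in the assembly version.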

Mixing C and Assembly Language

This program is just a simple function that takes in three integer parameters and returns the maximum value. It shows that the parameters will be at [esp+4], [esp+8] and [esp+12], and that the value gets returned in eax.

maxofthree.asm

    ; ----------------------------------------------------------------------------
    ; maxofthree.asm
    ;
    ; NASM implementation of a function that returns the maximum value of its
    ; three integer parameters.  The function has prototype:
    ;
    ;   int maxofthree(int x, int y, int z)
    ;
    ; Note that only eax, ecx, and edx were used so no registers had to be saved
    ; and restored.
    ; ----------------------------------------------------------------------------

            global  _maxofthree

            section .text
    _maxofthree:
            mov     eax, [esp+4]
            mov     ecx, [esp+8]
            mov     edx, [esp+12]
            cmp     eax, ecx
            cmovl   eax, ecx
            cmp     eax, edx
            cmovl   eax, edx
            ret

Here is a C program that calls the assembly language function.

callmaxofthree.c

    /*
     * callmaxofthree.c
     *
     * Illustrates how to call the maxofthree function we wrote in assembly
     * language.
     */

    #include <stdio.h>

    int maxofthree(int, int, int);

    int main() {
        printf("%d\n", maxofthree(1, -4, -7));
        printf("%d\n", maxofthree(2, -6, 1));
        printf("%d\n", maxofthree(2, 3, 1));
        printf("%d\n", maxofthree(-2, 4, 3));
        printf("%d\n", maxofthree(2, -6, 5));
        printf("%d\n", maxofthree(2, 4, 6));
        return 0;
    }

To assemble, link and run this two-part program (on Windows):

    nasm -fwin32 maxofthree.asm
    gcc callmaxofthree.c maxofthree.obj
    a
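
When testing an assembly routine like this one, it can help to have a plain C version to compare against. The function below is a hypothetical reference (the name maxofthree_reference is not part of the original program); each line is annotated with the instructions it is meant to correspond to.

```c
/* Hypothetical reference version of maxofthree, written to mirror the
   cmp/cmovl pairs in maxofthree.asm. */
int maxofthree_reference(int x, int y, int z) {
    int result = x;                 /* mov eax, [esp+4] */
    if (result < y) {               /* cmp eax, ecx */
        result = y;                 /* cmovl eax, ecx */
    }
    if (result < z) {               /* cmp eax, edx */
        result = z;                 /* cmovl eax, edx */
    }
    return result;                  /* the value is returned in eax */
}
```

Running both versions on the same inputs, such as the six triples in callmaxofthree.c, should give identical results.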

Command Line Arguments

You know that in C, main is just a plain old function, and it has a couple of parameters of its own:

    int main(int argc, char** argv)

Here is a program that uses this fact to simply echo the command line arguments to a program, one per line:

echo.asm

    ; ----------------------------------------------------------------------------
    ; echo.asm
    ;
    ; NASM implementation of a program that displays its commandline arguments,
    ; one per line.
    ; ----------------------------------------------------------------------------

            global  _main
            extern  _printf

            section .text
    _main:
            mov     ecx, [esp+4]            ; argc
            mov     edx, [esp+8]            ; argv
    top:
            push    ecx                     ; save registers that printf wastes
            push    edx

            push    dword [edx]             ; the argument string to display
            push    format                  ; the format string
            call    _printf
            add     esp, 8                  ; remove the two parameters

            pop     edx                     ; restore registers printf used
            pop     ecx

            add     edx, 4                  ; point to next argument
            dec     ecx                     ; count down
            jnz     top                     ; if not done counting keep going

            ret
    format:
            db      '%s', 10, 0

Note that as far as the C library is concerned, command line arguments are always strings. If you want to treat them as integers, call atoi. Here's a neat program to compute x^y.

power.asm

    ; ----------------------------------------------------------------------------
    ; power.asm
    ;
    ; Command line application to compute x^y
    ; Syntax: power x y
    ; x and y are integers
    ; ----------------------------------------------------------------------------

            global  _main
            extern  _atoi
            extern  _printf

            section .text
    _main:
            push    ebx                     ; save the registers that must be saved
            push    esi
            push    edi

            mov     eax, [esp+16]           ; argc (it's not at [esp+4] now :-))
            cmp     eax, 3                  ; must have exactly two arguments
            jne     error1

            mov     ebx, [esp+20]           ; argv
            push    dword [ebx+4]           ; argv[1]
            call    _atoi
            add     esp, 4
            mov     esi, eax                ; x in esi
            push    dword [ebx+8]
            call    _atoi                   ; argv[2]
            add     esp, 4
            cmp     eax, 0
            jl      error2
            mov     edi, eax                ; y in edi

            mov     eax, 1                  ; start with answer = 1
    check:
            test    edi, edi                ; we're counting y downto 0
            jz      gotit                   ; done
            imul    eax, esi                ; multiply in another x
            dec     edi
            jmp     check
    gotit:                                  ; print report on success
            push    eax
            push    answer
            call    _printf
            add     esp, 8
            jmp     done
    error1:                                 ; print error message
            push    badArgumentCount
            call    _printf
            add     esp, 4
            jmp     done
    error2:                                 ; print error message
            push    negativeExponent
            call    _printf
            add     esp, 4
    done:                                   ; restore saved registers
            pop     edi
            pop     esi
            pop     ebx
            ret

    answer:
            db      '%d', 10, 0
    badArgumentCount:
            db      'Requires exactly two arguments', 10, 0
    negativeExponent:
            db      'The exponent may not be negative', 10, 0
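
The control flow of power.asm may be easier to follow next to a C rendering. The sketch below is an approximation, not a translation the original provides: the function power_from_text and its answer out-parameter are hypothetical, it leaves the argument-count check to the caller, and it reports a negative exponent through its return value instead of printing a message.

```c
#include <stdlib.h>

/* Hypothetical C counterpart of the core logic in power.asm.
   Returns 1 and stores x^y in *answer, or returns 0 when y is negative. */
int power_from_text(const char *x_text, const char *y_text, int *answer) {
    int x = atoi(x_text);               /* call _atoi on argv[1] */
    int y = atoi(y_text);               /* call _atoi on argv[2] */
    if (y < 0) {
        return 0;                       /* corresponds to the error2 path */
    }
    /* Unsigned arithmetic is used so that overflow wraps around, as the
       low 32 bits of imul do, instead of being undefined behavior. */
    unsigned int product = 1;           /* mov eax, 1 */
    while (y != 0) {                    /* test edi, edi / jz gotit */
        product *= (unsigned int)x;     /* imul eax, esi */
        y--;                            /* dec edi */
    }
    *answer = (int)product;             /* assumes two's complement ints */
    return 1;
}
```

For example, power_from_text("2", "10", &result) stores 1024 in result and returns 1, while an exponent of "-1" makes it return 0 without touching result.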

Floating Point Instructions

Here is an example that uses only two floating point instructions, fldz and fadd.

sum.asm

    ; ----------------------------------------------------------------------------
    ; sum.asm
    ;
    ; NASM implementation of a function that returns the sum of all the elements
    ; in a floating-point array.  The function has prototype:
    ;
    ;   double sum(double[] array, int length)
    ; ----------------------------------------------------------------------------

            global  _sum

            section .text
    _sum:
            mov     edx, [esp+4]            ; address of argument
            mov     ecx, [esp+8]            ; length of array
            fldz                            ; initialize the sum to 0
            cmp     ecx, 0                  ; guard against non-positive lengths!
            jle     done
    next:
            fadd    qword [edx]             ; add in the current array element
            add     edx, 8                  ; move to next array element
            dec     ecx                     ; count down
            jnz     next                    ; if not done counting, continue
    done:
            ret                             ; return value already in st0
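
A C version of the same function makes a handy cross-check for the assembly. The name sum_reference is an assumption made for this sketch. Note also that the x87 unit may keep the running total at a higher precision than a C double, so the two versions are not guaranteed to agree in the last few bits for arbitrary inputs.

```c
/* Hypothetical reference version of sum, following sum.asm step by step. */
double sum_reference(const double array[], int length) {
    double total = 0.0;                     /* fldz */
    for (int i = 0; i < length; i++) {      /* skipped when length <= 0 */
        total += array[i];                  /* fadd qword [edx] */
    }
    return total;                           /* result is left in st0 */
}
```

As in the assembly, a zero or negative length simply yields 0.0 without reading the array.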

Data Sections

The text section is read-only on most operating systems, so you might find the need for a data section. On most operating systems, the data section is only for initialized data, and you have a special .bss section for uninitialized data. Here is a program that averages the command line arguments, expected to be integers, and displays the result as a floating point number. Note that there is no instruction to push an 8-byte value, so we fake it by manipulating esp.

average.asm

    ; ----------------------------------------------------------------------------
    ; average.asm
    ;
    ; NASM implementation of a program that treats all its command line arguments
    ; as integers, and displays their average as a floating point number.  This
    ; program uses a data section to store intermediate results, not that it has
    ; to, but only to illustrate how data sections are used.
    ; ----------------------------------------------------------------------------

            global  _main
            extern  _printf
            extern  _atoi

            section .text
    _main:
            mov     ecx, [esp+4]            ; argc
            dec     ecx                     ; don't count program name
            jz      nothingToAverage
            mov     [count], ecx            ; save number of real arguments
            mov     edx, [esp+8]            ; argv
    accumulate:
            push    ecx                     ; save values across call to atoi
            push    edx
            push    dword [edx+ecx*4]       ; argv[ecx]
            call    _atoi                   ; now eax has the int value of arg
            add     esp, 4
            pop     edx                     ; restore registers after atoi call
            pop     ecx
            add     [sum], eax              ; accumulate sum as we go
            dec     ecx
            jnz     accumulate              ; more arguments?
    average:
            fild    dword [sum]
            fild    dword [count]
            fdivp   st1, st0                ; sum / count
            sub     esp, 8                  ; make room for quotient on stack
            fstp    qword [esp]             ; "push" quotient
            push    format                  ; push format string
            call    _printf
            add     esp, 12                 ; 4 bytes format, 8 bytes number
            ret

    nothingToAverage:
            push    error
            call    _printf
            add     esp, 4
            ret

            section .data
    count:  dd      0
    sum:    dd      0
    format: db      '%.15f', 10, 0
    error:  db      'There are no command line arguments to average', 10, 0
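
The arithmetic in average.asm corresponds roughly to the C function below. It is a sketch under a few assumptions: the name average_of_arguments is made up, the function returns the average rather than printing it, and it returns 0.0 when there is nothing to average, where the assembly program prints an error message instead.

```c
#include <stdlib.h>

/* Hypothetical C counterpart of average.asm.  Walks argv from the last
   argument down to argv[1], just as the accumulate loop does. */
double average_of_arguments(int argc, char **argv) {
    int count = argc - 1;                   /* don't count the program name */
    if (count <= 0) {
        return 0.0;                         /* nothing to average */
    }
    int sum = 0;
    for (int i = count; i != 0; i--) {      /* dec ecx / jnz accumulate */
        sum += atoi(argv[i]);               /* add [sum], eax */
    }
    return (double)sum / (double)count;     /* fild, fild, fdivp */
}
```

The two casts play the role of the fild instructions: both integers are converted to floating point before the division, so the average of 1 and 2 comes out as 1.5 rather than 1.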

Recursion

Perhaps surprisingly, there's nothing out of the ordinary required to implement recursive functions. Every call pushes its own parameters on the stack, after all! Here's an example. In C:

    int factorial(int n) {
        return (n <= 1) ? 1 : n * factorial(n-1);
    }

In assembly language:

factorial.asm

    ; ----------------------------------------------------------------------------
    ; factorial.asm
    ;
    ; Illustration of a recursive function.
    ; ----------------------------------------------------------------------------

            global  _factorial

            section .text
    _factorial:
            mov     eax, [esp+4]            ; n
            cmp     eax, 1                  ; n <= 1
            jnle    L1                      ; if not, go do a recursive call
            mov     eax, 1                  ; otherwise return 1
            jmp     L2
    L1:
            dec     eax                     ; n-1
            push    eax                     ; push argument
            call    _factorial              ; do the call, result goes in eax
            add     esp, 4                  ; get rid of argument
            imul    eax, [esp+4]            ; n * factorial(n-1)
    L2:
            ret

SIMD Parallelism

The 64-bit MMX registers can do eight byte operations in parallel, or four (16-bit) word operations in parallel, or two (32-bit) doubleword operations in parallel. The 128-bit XMMs can do 16 byte, 8 word, or 4 doubleword operations in parallel, and do parallel floating-point computations too (4 single precision or 2 double precision). Here is a simple function that sums two arrays of 16-bit short ints, four at a time:

mmxarrayadd.asm

    ; ----------------------------------------------------------------------------
    ; mmxarrayadd.asm
    ;
    ; NASM implementation of a function that adds two short arrays.
    ;
    ;   void add(short a[], short b[], int n)
    ; ----------------------------------------------------------------------------

            global  _add

            section .text
    _add:
            push    ebx                     ; callee save register

            mov     eax, [esp+8]            ; eax points to a
            mov     edx, [esp+12]           ; edx points to b
            mov     ecx, [esp+16]           ; ecx <- number of items in each array
            or      ecx, ecx                ; guard against negative lengths
            js      L4
    L1:
            cmp     ecx, 4                  ; Less than 4 items left?
            jl      L2                      ; if so, handle them individually
            movq    mm0, qword [eax]        ; Get four items from a
            paddw   mm0, qword [edx]        ; Add them with next four items from b
            movq    qword [eax], mm0        ; Write them back to a
            add     eax, 8                  ; Advance a to point to next 4 words
            add     edx, 8                  ; Advance b to point to next 4 words
            sub     ecx, 4                  ; We've just handled four
            jmp     L1
    L2:
            jecxz   L4                      ; Are there zero items left?
    L3:
            mov     bx, word [eax]          ; One word at a time addition
            add     bx, word [edx]
            mov     word [eax], bx
            inc     eax
            inc     eax
            inc     edx
            inc     edx
            dec     ecx
            jnz     L3
    L4:
            pop     ebx
            emms
            ret
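
The structure of this routine, a main loop that handles four items per step followed by a cleanup loop for the leftovers, can be sketched in scalar C as follows. The names add_reference and wrapping_add16 are hypothetical, and the inner lane loop only imitates what a single paddw does in one instruction; both paddw and the 16-bit add wrap around on overflow, which the helper reproduces.

```c
#include <stdint.h>

/* Adds one pair of 16-bit values with wraparound, like paddw or add bx.
   The final conversion assumes two's complement, as on x86. */
static int16_t wrapping_add16(int16_t a, int16_t b) {
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* Hypothetical scalar reference for mmxarrayadd.asm: a[i] += b[i]. */
void add_reference(int16_t a[], const int16_t b[], int n) {
    int i = 0;
    while (n - i >= 4) {                    /* cmp ecx, 4 / jl L2 */
        for (int lane = 0; lane < 4; lane++) {
            a[i + lane] = wrapping_add16(a[i + lane], b[i + lane]);
        }
        i += 4;                             /* advance by four words */
    }
    while (i < n) {                         /* leftovers, one at a time (L3) */
        a[i] = wrapping_add16(a[i], b[i]);
        i++;
    }
}
```

With n = 6, the first loop handles items 0 to 3 in one step and the second loop handles items 4 and 5 individually. A negative n falls through both loops, matching the js L4 guard.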

Here's another one, this time using SSE instructions on the XMM registers:

sseexample.asm

    ; ----------------------------------------------------------------------------
    ; sseexample.asm
    ;
    ; This program demonstrates a few SSE instructions, for no particular reason
    ; other than to show them off.
    ; ----------------------------------------------------------------------------

            extern  _printf
            global  _main

            section .text
    _main:
            push    esi                     ; callee save register

            ; Illustrate packed square root computations
            movups  xmm3, [x]
            sqrtps  xmm0, xmm3
            movups  [y], xmm0
            call    printall

            ; Illustrate packed maximums
            movups  xmm2, [x]
            movups  xmm5, [z]
            maxps   xmm2, xmm5
            movups  [y], xmm2
            call    printall

            ; Done
            pop     esi
            ret

    printall:
            mov     esi, 4
    printone:
            ; Note printf will NOT ACCEPT single precision floats.
            ; We have to convert them to double precision floats.  Sigh.
            fld     dword [y-4+esi*4]
            sub     esp, 8
            fstp    qword [esp]
            push    format
            call    _printf
            add     esp, 12
            dec     esi
            jnz     printone
            ret

            section .data
            align   16
    x       dd      10.0
            dd      100.0
            dd      400.0
            dd      653.2664
    y       dd      0.0
            dd      0.0
            dd      0.0
            dd      0.0
    z       dd      5.0
            dd      900.0
            dd      316.20
            dd      111.0
    format  db      '%15.7f', 10, 0

Saturated Arithmetic

This program illustrates saturated addition.

satexample.asm

    ; ----------------------------------------------------------------------------
    ; satexample.asm
    ;
    ; This is a short example of parallel saturated addition using paddsw.
    ; It takes two 64-bit quantities
    ;
    ;   80008FFF0005FEF2
    ;   800020E07FFE99AA
    ;
    ; and performs saturated addition on the four 16-bit blocks in parallel,
    ; then writes the resulting value, in hex, to standard output.  The answer
    ; should be
    ;
    ;   8000B0DF7FFF989C
    ; ----------------------------------------------------------------------------

            extern  _printf
            global  _main

            section .text
    _main:
            movq    mm0, [x]
            paddsw  mm0, [y]                ; Do 4 saturated additions in parallel
            movq    [x], mm0

            push    dword [x]               ; can't push 64 bits at once
            push    dword [x+4]             ; nor does printf handle 64-bit ints
            push    format
            call    _printf
            add     esp, 12
            ret

            section .data
    x       dw      0fef2h, 0005h, 8fffh, 8000h
    y       dw      099aah, 7ffeh, 20e0h, 8000h
    format  db      '%0x%0x', 10, 0
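
To see where the expected answer comes from, the four lanes can be worked out with a small C model. This is a sketch, not how the hardware is implemented: the names saturating_add16 and paddsw_model are made up, and the conversions between signed and unsigned 16-bit values assume two's complement, as on x86.

```c
#include <stdint.h>

/* Adds two signed 16-bit values, clamping the result to [-32768, 32767]. */
static int16_t saturating_add16(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) {
        return INT16_MAX;
    }
    if (sum < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)sum;
}

/* Models paddsw on a 64-bit value: four independent signed 16-bit lanes. */
uint64_t paddsw_model(uint64_t x, uint64_t y) {
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        int shift = 16 * lane;
        int16_t a = (int16_t)(uint16_t)(x >> shift);
        int16_t b = (int16_t)(uint16_t)(y >> shift);
        uint16_t lane_sum = (uint16_t)saturating_add16(a, b);
        result |= (uint64_t)lane_sum << shift;
    }
    return result;
}
```

Applied to the two inputs from the listing, the lowest lane is an ordinary sum (FEF2 + 99AA gives 989C), the second lane overflows upward and clamps to 7FFF, the third is again an ordinary sum (B0DF), and the highest lane overflows downward and clamps to 8000, giving 8000B0DF7FFF989C.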

Graphics

You probably have the OpenGL graphics library already on your system, so why not call it from an assembly language program?

triangle.asm

    ; ----------------------------------------------------------------------------
    ; triangle.asm
    ;
    ; A very simple *Windows* OpenGL application using the GLUT library.  It
    ; draws a nicely colored triangle in a top-level application window.  One
    ; interesting thing is that the Windows GL and GLUT functions do NOT use the
    ; C calling convention; instead they use the "stdcall" convention which is
    ; like C except that the callee pops the parameters.
    ; ----------------------------------------------------------------------------

            global  _main
            extern  _glClear@4
            extern  _glBegin@4
            extern  _glEnd@0
            extern  _glColor3f@12
            extern  _glVertex3f@12
            extern  _glFlush@0
            extern  _glutInit@8
            extern  _glutInitDisplayMode@4
            extern  _glutInitWindowPosition@8
            extern  _glutInitWindowSize@8
            extern  _glutCreateWindow@4
            extern  _glutDisplayFunc@4
            extern  _glutMainLoop@0

            section .text
    title:   db     'A Simple Triangle', 0
    zero:    dd     0.0
    one:     dd     1.0
    half:    dd     0.5
    neghalf: dd     -0.5

    display:
            push    dword 16384
            call    _glClear@4              ; glClear(GL_COLOR_BUFFER_BIT)
            push    dword 9
            call    _glBegin@4              ; glBegin(GL_POLYGON)
            push    dword 0
            push    dword 0
            push    dword [one]
            call    _glColor3f@12           ; glColor3f(1, 0, 0)
            push    dword 0
            push    dword [neghalf]
            push    dword [neghalf]
            call    _glVertex3f@12          ; glVertex(-.5, -.5, 0)
            push    dword 0
            push    dword [one]
            push    dword 0
            call    _glColor3f@12           ; glColor3f(0, 1, 0)
            push    dword 0
            push    dword [neghalf]
            push    dword [half]
            call    _glVertex3f@12          ; glVertex(.5, -.5, 0)
            push    dword [one]
            push    dword 0
            push    dword 0
            call    _glColor3f@12           ; glColor3f(0, 0, 1)
            push    dword 0
            push    dword [half]
            push    dword 0
            call    _glVertex3f@12          ; glVertex(0, .5, 0)
            call    _glEnd@0                ; glEnd()
            call    _glFlush@0              ; glFlush()
            ret

    _main:
            push    dword [esp+8]           ; push argv
            lea     eax, [esp+8]            ; get addr of argc (offset changed :-)
            push    eax
            call    _glutInit@8             ; glutInit(&argc, argv)
            push    dword 0
            call    _glutInitDisplayMode@4
            push    dword 80
            push    dword 80
            call    _glutInitWindowPosition@8
            push    dword 300
            push    dword 400
            call    _glutInitWindowSize@8
            push    title
            call    _glutCreateWindow@4
            push    display
            call    _glutDisplayFunc@4
            call    _glutMainLoop@0
            ret

Local Variables

After entering a function, we can reserve space for local variables by decrementing the stack pointer. For example, the C function

int example(int x, int y) {
  int a, b, c;
  b = 7;
  return x * b + y;
}

can be translated as follows:

_example:
	sub	esp, 12			; make room for 3 ints
	mov	dword [esp+4], 7	; b = 7
	mov	eax, [esp+16]		; x
	imul	eax, [esp+4]		; x * b
	add	eax, [esp+20]		; x * b + y
	add	esp, 12			; release the locals so ret finds retaddr
	ret

After "sub esp, 12" the stack looks like:

                +---------+
         esp    |    a    |
                +---------+
         esp+4  |    b    |
                +---------+
         esp+8  |    c    |
                +---------+
         esp+12 | retaddr |
                +---------+
         esp+16 |    x    |
                +---------+
         esp+20 |    y    |
                +---------+

Stack Frames

Sometimes it is a real pain to try to keep track of the offsets of your parameters and local variables because the stack pointer keeps changing. For example, in

int example(int x, int y) {
  int a, b, c;
  ...
  f(y, a, b, b, x);
  ...
}

you cannot translate the function call as

	push	dword [esp+16]
	push	dword [esp+4]	; WRONG! b is really now at [esp+8]
	push	dword [esp+4]	; WRONG! b is really now at [esp+12]
	push	dword [esp]	; WRONG! a is really now at [esp+12]
	push	dword [esp+20]	; WRONG! y is really now at [esp+36]
	call	f

For this reason, many functions use the ebp register to index the "stack frame" of local variables and parameters, like this:

	push	ebp			; must save old ebp
	mov	ebp, esp		; point ebp to this frame
	sub	esp, ___		; make space for locals
	...
	mov	esp, ebp		; clean up locals
	pop	ebp			; restore old ebp
	ret

As long as you never change ebp throughout the function, all your local variables and parameters will always be at the same offset from ebp. The stack frame for our example function is now:

                +---------+
         ebp-12 |    a    |
                +---------+
         ebp-8  |    b    |
                +---------+
         ebp-4  |    c    |
                +---------+
         ebp    | old ebp |
                +---------+
         ebp+4  | retaddr |
                +---------+
         ebp+8  |    x    |
                +---------+
         ebp+12 |    y    |
                +---------+

http://www.cs.lmu.edu/~ray/notes/nasmexamples/