Introduction to X86-64 Assembly Programming


Project 3: Introduction to X86-64 Assembly Programming
Grading Form
Goal
In this project you will write programs in x86-64 assembly language. It is important that you learn x86-64 assembly language since it is the one you use every day on your PC, Mac, or on data.cs.purdue.edu. It is a 64-bit architecture that uses 8-byte addresses and variable-length instructions.
X86-64 Introduction
The X86-64 architecture was created by AMD and then adopted by Intel. It extends the 32-bit x86 architecture to 64 bits. X86-64 is a superset of x86-32, so it is backward compatible and provides an incremental path for migrating from 32-bit to 64-bit code.
Here is a good reference that can help you in programming with the X86-64. Section 3.2 on page 5 gives an example of a C program translated to X86-64, and Figure 2 on page 7 gives an explanation of each register (which is also shown below).
x86-64 tutorial
Also see the x86-64 assembly notes section in http://www.cs.purdue.edu/homes/cs250

The X86-64 architecture uses the following register assignment:

You will find many similarities with the ARM architecture. For instance, there are also 16 registers available to the user, though not all are typically used as general purpose registers. Instead of being called r0, r1, and so on, some registers keep their older names for backward compatibility: instead of receiving/passing arguments via r0, r1, r2, and r3, arguments are received and passed via %rdi, %rsi, %rdx, and %rcx. Some registers are callee saved and can be used as local variables. In addition, %rax is used to return values from functions, similar to how you would load r0 in ARM assembly before returning.
One of the main differences in the x86-64 architecture is that the order of the operands is different from ARM. For example, in the instruction:
movq $3, %rsi
The first operand, the numerical constant $3, is moved into the register %rsi, so the destination register is on the right. In ARM the destination comes first, so this would be equivalent to:
mov r2, #3
The x86-64 architecture is backward compatible with x86-32, and as such the 4 least significant bytes of registers %rax, %rbx, %rcx, and %rdx are compatible with the old 32-bit x86 registers %eax, %ebx, %ecx, and %edx. We will only write programs using the 64-bit registers, so most of the instructions will end with "q", which means that they work with 8-byte words.
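The relationship between a 64-bit register and its 32-bit name can be modeled in C. This is only an illustrative sketch: the function name low32 is hypothetical, and the parameter rax is an ordinary variable standing in for the register.

```c
#include <stdint.h>

/* Models what reading %eax gives you: the 4 least significant
   bytes of the 8-byte value held in %rax. */
uint32_t low32(uint64_t rax) {
    return (uint32_t)rax;   /* the conversion keeps only the low 32 bits */
}
```

For example, low32(0x1122334455667788) yields 0x55667788.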
The addressing modes in the x86-64 are the following:
Immediate Value
movq $0x501208,%rdi    # Put in register %rdi the constant 0x501208
Direct Register Reference
movq %rax,%rdi         # Move the contents of register %rax to %rdi
Indirect through a register
movq %rsi,(%rdi)       # Store the value in %rsi in the address contained in %rdi
Direct Memory Reference
movq 0x501208,%rdi     # Fetch the contents in memory at address 0x501208 and store it in %rdi
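As a rough guide, each addressing mode has a familiar counterpart in C. The sketch below is an analogy rather than compiler output: the function names are hypothetical, the parameters named after registers are ordinary variables, and the global cell stands in for the memory word at a fixed address.

```c
long cell = 42;   /* hypothetical global: a memory word at a fixed address */

/* Immediate value: the constant is part of the instruction. */
long immediate(void) { return 7; }

/* Direct register reference: copy one value into another. */
long reg_copy(long rax) { long rdi = rax; return rdi; }

/* Indirect through a register: store through a pointer (*rdi = rsi). */
void store_indirect(long *rdi, long rsi) { *rdi = rsi; }

/* Direct memory reference: load the contents stored at a fixed address. */
long load_direct(void) { return cell; }
```

The important contrast is between the first and last cases: with the $ prefix the number itself is the value, while without it the number is an address whose contents are fetched.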
Task 1: Your first X86-64 Assembly Program
Log in to data.cs.purdue.edu and create a directory project-3-src where you will put all your code. Type:
cd
mkdir -p cs250/project-3-src
cd cs250/project-3-src
You will do all your work on data.cs.purdue.edu, in the directory ~/cs250/project-3-src.
Type the following program sqr.s that squares a number read from the terminal, then prints the result. Important note: it is easy to miss that your "main" function is declared with .globl in x86, as opposed to .global in ARM.

# Define global variable a in data section
.data
.comm a,8                       # long a;
.text
format1:
        .string "a="
format2:
        .string "%ld"
format3:
        .string "a^2 is %ld\n"
.globl main
main:                           # main()
                                # {
        pushq %rbp              # Save frame pointer
        movq %rsp, %rbp

        movq $format1, %rdi     # printf("a=");
        movq $0, %rax
        call printf             # similar to `bl printf` in ARM

        movq $format2, %rdi     # scanf("%ld",&a);
        movq $a, %rsi
        movq $0, %rax
        call scanf

        movq $format3, %rdi     # printf("a^2 is %ld\n",a*a);
        movq $a, %rsi
        movq (%rsi),%rsi
        imulq %rsi,%rsi
        movq $0, %rax
        call printf

        leave                   # pops the frame pointer
        ret                     # }
See what the program does line by line. You will find it very similar to ARM, such as how the argument registers are used (i.e. %rdi instead of r0, %rsi instead of r1). Library functions such as scanf still need a format string and an address in which to store the value they read.
Then assemble and run it using the following commands.
gcc -static -o sqr sqr.s

./sqr
The -static flag tells gcc to link the program statically. On current Linux systems this also produces an executable that is not position independent (position-independent executables, or PIE, are the default), so the .text and .data sections get fixed load addresses. Position independence is a security feature: Linux loads such programs at a random address every time you run them, which makes them more difficult to attack. Building with -static turns this off for the assembly programs you write, so that they can use absolute addresses such as $format1 and $a, which makes assembly programming easier.
Question 1. Write the code above into the file sqr.s, compile it and run it.
Question 2. Explain what the following instructions do:
pushq %rbp # Save frame pointer
movq %rsp, %rbp
......
leave
ret
Question 3. Write a program avg.s in assembly language that reads n numbers, then computes the sum and the average. The numbers are long integers of 8 bytes and the average is truncated. Assume that the first number you read in is the value of n, i.e. n more inputs will follow.
./s
n? 5
3
6
8
1
5
SUM=23 AVG=4
Hint: Use the instruction "idivq" to compute the average. It might be helpful to look into commonly used X86-64 opcodes, as they may offer useful functionality that ARM does not.
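To check what "truncated" means for the expected output, the computation can be written in C first. The sketch below uses a hypothetical helper name, truncated_average. C's signed division truncates toward zero, which is the same rounding idivq applies to its quotient. Keep in mind when translating that idivq divides the 128-bit value held in %rdx:%rax and leaves the quotient in %rax and the remainder in %rdx, so %rdx has to be set up (for example by sign-extending %rax with cqto) before the division.

```c
#include <assert.h>

/* Truncated average of n values whose total is sum.
   Signed division in C rounds toward zero, like idivq. */
long truncated_average(long sum, long n) {
    assert(n != 0);   /* dividing by zero is an error here and with idivq */
    return sum / n;
}
```

With the sample input above, truncated_average(23, 5) gives 4.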
Task 2: Using Arrays in X86-64
The following C function computes the maximum of an array:
// Finds the max value in an array
long maxarray(long n, long *a) {
  long i=0;
  long max = a[0];
  while (i<n) {
    if (max < a[i]) {
      max = a[i];
    }
    i++;
  }
  return max;
}
The equivalent code in X86-64 assembly language is given here:
# maxarray.s
                                # // Finds the max value in an array
.text
.globl maxarray                 # long maxarray(long n, long *a)
maxarray:                       # //n=%rdi a=%rsi
                                # //i=%rdx max=%rax
        pushq %rbp              # Save frame pointer
        movq %rsp, %rbp
                                # {
        movq $0,%rdx            # i=0 ;
        movq (%rsi),%rax        # max = a[0]

while:  cmpq %rdx,%rdi          # while (i<n) { // (n-i>0)
        jle afterw

        movq %rdx,%rcx          # long *tmp = &a[i];
        imulq $8,%rcx           # //*(long*)(8*i+(char*)a)
        addq %rsi,%rcx

        cmpq (%rcx),%rax        # if (max < *tmp) { // (max-*tmp<0)
        jge afterif
        movq (%rcx),%rax        # max = *tmp
                                # }
afterif: addq $1,%rdx           # i++ ;
        jmp while               # }
afterw: leave
        ret                     # }
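The comment `*(long*)(8*i+(char*)a)` in the listing describes plain byte arithmetic on the array's base address. A minimal C sketch of that computation follows; the helper name element_address is hypothetical, and the constant 8 assumes sizeof(long) is 8, as it is on x86-64 Linux.

```c
/* Computes the address of a[i] by hand: start from the base address
   in bytes and add 8 bytes for every element skipped. */
long *element_address(long *a, long i) {
    char *base = (char *)a;          /* view the array as raw bytes */
    return (long *)(base + 8 * i);   /* assumes sizeof(long) == 8 */
}
```

For any valid index, element_address(a, i) is the same pointer as &a[i], so dereferencing it reads a[i].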
The C code maxarray.c that calls this function is the following:
// maxarray.c:
#include <stdio.h>
long a[] = {4, 6, 3, 7, 9 };

extern long maxarray(long, long*);
int main() {
printf("maxarray(5,a)=%ld\n", maxarray(5,a));
}
To compile these programs, type:
gcc -static -o maxarray maxarray.c maxarray.s
Question 4. Type the programs maxarray.s and maxarray.c, then test them. Also answer, what do the following instructions from the above code snippet do? Explain.
movq %rdx,%rcx
imulq $8,%rcx
addq %rsi,%rcx
Task 3: Implementing Bubble Sort in X86-64
Question 5. Implement a function bubblesort(long ascending, long n, long * a) in X86-64 in a file bubble.s that will sort an array of integers using bubblesort.
Here is the code in C that implements bubblesort. You have to implement it in X86-64 assembly language in the file bubble.s
void bubblesort(long ascending, long n, long * a) {
  for (int i = 0; i < n - 1; i++) {
    for (int j = 0; j < n - i - 1; j++) {
      long swap = 0;
      if (ascending) {
        if (a[j+1] < a[j]) {
          swap = 1;
        }
      } else {
        if (a[j+1] > a[j]) {
          swap = 1;
        }
      }
      if (swap) {
        long temp = a[j];
        a[j] = a[j+1];
        a[j+1] = temp;
      }
    }
  }
}
Then call it from the file bubble.c
// bubble.c:
#include <stdio.h>
long a[] = {6, 7, 2, 3, 1, 9, 4, 5, 0, -9, 8};
long n = (sizeof(a)/sizeof(long));
extern void bubblesort(long ascending, long n, long * a);
void printArray(long n, long * a) {
for (int i = 0; i < n; i++) {
printf("%ld ", a[i]);
}
printf("\n");
}
int main(int argc, char ** argv)
{
printf("Before Ascending:\n");printArray(n,a);
bubblesort(1, n, a); // notice how we do not return anything here...
printf("After Ascending:\n");printArray(n,a);
printf("Before Descending:\n");printArray(n,a);
bubblesort(0, n, a); // notice how we do not return anything here...
printf("After Descending:\n");printArray(n,a);
}
To compile the program, type:
gcc -static -o bubble bubble.c bubble.s
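Reading the printed arrays by eye is error-prone, so while testing it can help to check the result in code. The helper below is a hypothetical addition, not part of the required files: is_sorted is an assumed name, and it could be called from main in bubble.c after each call to bubblesort.

```c
/* Returns 1 if a[0..n-1] is in the requested order, 0 otherwise.
   ascending != 0 checks non-decreasing order, 0 checks non-increasing. */
int is_sorted(long ascending, long n, const long *a) {
    for (long i = 0; i + 1 < n; i++) {
        if (ascending ? (a[i] > a[i + 1]) : (a[i] < a[i + 1])) {
            return 0;   /* found a neighboring pair in the wrong order */
        }
    }
    return 1;
}
```

For example, is_sorted(1, n, a) should return 1 after bubblesort(1, n, a), and is_sorted(0, n, a) should return 1 after bubblesort(0, n, a).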
Question 6. Complete the following Makefile that will make all the executables in this lab.
You may have used Makefiles in past lab courses without knowing it! Create a file named "Makefile" in your lab directory and put the following contents in it. When you type "make", the make program looks for a Makefile and builds the first target it finds, here "goal". This is easier than compiling each individual file when working on large projects, and make will not rebuild a target whose source files have not been edited since the last build (you will learn more about this in CS252). In the listing, <TAB> stands for a single tab character, which make requires at the start of each command line.

# TODO: Modify the below to compile bubble as well
goal: sqr maxarray
sqr: sqr.s
<TAB>gcc -static -o sqr sqr.s
maxarray: maxarray.s maxarray.c
<TAB>gcc -static -o maxarray maxarray.c maxarray.s
clean:
<TAB>rm -f sqr maxarray
To use the Makefile, type:
make clean
make
Turnin
Follow these instructions to turnin project-3:
Make sure that your programs build by typing "make", and that they build and run on data.cs.purdue.edu.
If you have not created a project-3-src directory, create it. Type:
cd
cd cs250
mkdir project-3-src
Copy your files into project-3-src/ and cd to the parent directory of project-3-src. Then type:
turnin -c cs250 -p project-3 project-3-src
Then, you may type "turnin -c cs250 -p project-3 -v" to make sure you have submitted the correct files - remember the -v flag, or it will ask you if you wish to resubmit your lab again.
You will show your programs to the lab instructor and TAs during lab time next week.
Rubric:
Q1. __/10 points
Q2. __/10 points
Q3. __/30 points
Q4. __/10 points

Q5. __/30 points
Q6. __/10 points
Project 3 Grading Form
Question                                                                          Max    Current
Question 1. Write the code above into the file sqr.s, compile it and run it.      10
Question 2. Explain what the following instructions do:                           10
    pushq %rbp # Save frame pointer
    movq %rsp, %rbp
    ......
    leave
    ret
Question 3. Write a program avg.s                                                 30
Question 4. Type the programs maxarray.s and maxarray.c                           10
Question 5. Implement a function bubblesort(long ascending, long n, long * a)     30
Question 6. Complete the following Makefile                                       10
Total                                                                             100
 
