pcasm book notes

1. ASCII vs. Unicode

ASCII uses one byte to represent a character (0x41 for 'A').

Unicode, in the 16-bit encoding the book describes, uses two bytes (a word) to represent a character (0x0041 for 'A').

ASCII saves space, but a single byte can represent at most 256 characters (standard ASCII defines only 128), whereas Unicode can represent the characters of all languages.
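To make the size difference concrete, here is a minimal C sketch. It assumes the two-byte encoding described above, with one 16-bit unit per character (true for 'A', but not for every Unicode character). The helper name widen_ascii is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy an ASCII string into 16-bit units.
   Each one-byte ASCII code 0xNN becomes the two-byte value 0x00NN.
   Returns the number of characters written, excluding the terminator. */
size_t widen_ascii(const char *src, uint16_t *dst, size_t dst_len)
{
    size_t i = 0;

    assert(dst_len > 0);  /* need room for at least the terminator */
    while (src[i] != '\0' && i + 1 < dst_len) {
        dst[i] = (uint16_t)(unsigned char)src[i];
        i++;
    }
    dst[i] = 0;           /* 16-bit terminator */
    return i;
}
```

Widening "AB" this way yields the units 0x0041, 0x0042 and a terminator, so the text takes two bytes per character instead of one.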


2. Registers in 80386

80386: This CPU greatly enhanced the 80286. First, it extends many of
the registers to hold 32-bits (EAX, EBX, ECX, EDX, ESI, EDI, EBP,
ESP, EIP) and adds two new 16-bit registers FS and GS. It also adds
a new 32-bit protected mode. In this mode, it can access up to 4
gigabytes. Programs are again divided into segments, but now each
segment can also be up to 4 gigabytes in size!

AX is the lower 16 bits of EAX, AL is the lower 8 bits of AX, and AH is the upper 8 bits of AX (kept for compatibility with the 8086). The same naming applies to EBX, ECX and EDX.
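The relationship between EAX, AX, AL and AH can be modeled with masks and shifts. This is only a sketch in C, not how the CPU is implemented; the function names are invented for illustration. The write_ax function relies on the fact that a 16-bit write leaves the upper half of the 32-bit register unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Sub-register views of a 32-bit register value (illustrative names). */
uint16_t low16(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }       /* AX */
uint8_t  low8(uint32_t eax)  { return (uint8_t)(eax & 0xFFu); }          /* AL */
uint8_t  high8(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }   /* AH */

/* Writing AX replaces only the low 16 bits; the upper 16 bits are kept. */
uint32_t write_ax(uint32_t eax, uint16_t ax)
{
    return (eax & 0xFFFF0000u) | ax;
}
```

For example, with EAX = 0x12345678, assert(low16(0x12345678u) == 0x5678u) holds, AL is 0x78 and AH is 0x56.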

EBP: Extended Base Pointer

ESP: Extended Stack Pointer

EIP:  Extended Instruction Pointer

EFLAGS: Information about results of last instruction

Four 16-bit segment registers, same as on the 8086:

CS: Code Segment

DS: Data Segment

SS: Stack Segment

ES: Extra Segment

Two more 16-bit extra segment registers: FS and GS (used like ES)


3. Directives

equ directive: symbol equ value

%define directive: %define SIZE 100  -> the same as #define SIZE 100 in C

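Data directives:
L1   db   0                       ; byte labeled L1, initialized to 0
L2   dw   1000                    ; word labeled L2, initialized to 1000
L7   resb 1                       ; 1 uninitialized byte
L10  db   'w', 'o', 'r', 'd', 0   ; the C string "word"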

mov [L6], 1 ; store a 1 at L6
This statement produces an "operation size not specified" error. Why?
Because the assembler does not know whether to store the 1 as a byte, word
or double word. To fix this, add a size specifier:
mov dword [L6], 1 ; store a 1 at L6
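Why the size matters can be seen in a small C analogue: the same value 1 touches a different number of bytes depending on the width of the store. This is a sketch with invented function names; the buffer stands in for the memory at L6.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Analogue of: mov byte [L6], 1   (writes exactly one byte) */
void store_byte(unsigned char *mem)
{
    uint8_t v = 1;
    memcpy(mem, &v, sizeof v);
}

/* Analogue of: mov dword [L6], 1  (writes four bytes) */
void store_dword(unsigned char *mem)
{
    uint32_t v = 1;
    memcpy(mem, &v, sizeof v);
}

/* Count how many of the first n bytes no longer hold the fill value 0xFF. */
int bytes_changed(const unsigned char *mem, int n)
{
    int i, changed = 0;

    assert(mem != NULL);
    for (i = 0; i < n; i++)
        if (mem[i] != 0xFF)
            changed++;
    return changed;
}
```

Starting from four bytes of 0xFF, store_byte changes one byte while store_dword changes all four, which is exactly the ambiguity the assembler refuses to guess at.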

4. Programming

; initialized data is put in the .data segment
segment .data

; uninitialized data is put in the .bss segment
segment .bss

; code is put in the .text segment
segment .text

The global directive tells the assembler to make the asm_main
label global. Unlike in C, labels have internal scope by default. This means
that only code in the same module can use the label. The global directive
gives the specified label (or labels) external scope.

5. MOVZX

MOVZX (move with zero extension). This instruction has two operands.
The destination (first operand) must be a 16 or 32 bit register. The source
(second operand) may be an 8 or 16 bit register or a byte or word of memory.
The other restriction is that the destination must be larger than the source.
(Most instructions require the source and destination to be the same size.)
Here is an example:

movzx ebx, ax ; zero-extends ax into ebx
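In C terms, MOVZX behaves like an unsigned widening conversion. The sketch below uses invented function names and contrasts it with a plain 16-bit move, which would leave the upper half of the destination untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Analogue of: movzx ebx, ax
   The old contents of EBX do not matter: the low 16 bits come from
   the source and the upper 16 bits become zero. */
uint32_t movzx_32_16(uint32_t old_ebx, uint16_t ax)
{
    (void)old_ebx;         /* overwritten entirely */
    return (uint32_t)ax;   /* unsigned widening zero-fills the upper bits */
}

/* Analogue of: mov bx, ax
   Only the low 16 bits of EBX change. */
uint32_t mov_16(uint32_t old_ebx, uint16_t ax)
{
    return (old_ebx & 0xFFFF0000u) | ax;
}
```

With EBX = 0xDEADBEEF and AX = 0x8001, the first function gives 0x00008001 and the second gives 0xDEAD8001, so assert(movzx_32_16(0xDEADBEEFu, 0x8001u) == 0x8001u) holds.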

6. Variable storage

global: These variables are defined outside of any function and are stored
at fixed memory locations (in the data or bss segments) and exist
from the beginning of the program until the end. By default, they can
be accessed from any function in the program; however, if they are
declared as static, only the functions in the same module can access
them (i.e. in assembly terms, the label is internal, not external).
static: These are local variables of a function that are declared static.
(Unfortunately, C uses the keyword static for two different purposes!)
These variables are also stored at fixed memory locations (in data or
bss), but can only be directly accessed in the functions they are defined
in.
automatic: This is the default type for a C variable defined inside a function.
These variables are allocated on the stack when the function they
are defined in is invoked and are deallocated when the function returns.
Thus, they do not have fixed memory locations.
register: This keyword asks the compiler to use a register for the data in
this variable. This is just a request. The compiler does not have to
honor it. If the address of the variable is used anywhere in the program
it will not be honored (since registers do not have addresses). Also,
only simple integral types can be register values. Structured types
can not be; they would not fit in a register! C compilers will often
automatically make normal automatic variables into register variables
without any hint from the programmer.
volatile: This keyword tells the compiler that the value of the variable may
change at any moment. This means that the compiler can not make any
assumptions about when the variable is modified. Often a compiler
might store the value of a variable in a register temporarily and use
the register in place of the variable in a section of code. It can not
do these types of optimizations with volatile variables. A common
example of a volatile variable would be one that could be altered by two
threads of a multi-threaded program. Consider the following code:
x = 10;
y = 20;
z = x;
If x could be altered by another thread, it is possible that the other
thread changes x between lines 1 and 3 so that z would not be 10.
However, if x was not declared volatile, the compiler might assume
that x is unchanged and set z to 10.
Another use of volatile is to keep the compiler from using a register
for a variable. Note that in standard C, volatile alone does not
synchronize threads; atomic operations or locks are still needed for that.
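The storage types above can be put side by side in one small C file. This is only a sketch; all variable and function names are made up for illustration.

```c
#include <assert.h>

int global_total = 0;            /* global: external scope, fixed address */
static int module_calls = 0;     /* static at file level: internal to this module */
volatile int hardware_flag = 0;  /* volatile: accesses are not optimized away */

/* Returns 1 on the first call, 2 on the second, and so on. */
int next_id(void)
{
    static int id = 0;       /* static local: fixed address, keeps its value */
    int step = 1;            /* automatic: on the stack for this call only */
    register int result;     /* register: a request the compiler may ignore */

    id += step;
    result = id;
    module_calls++;
    global_total += result;
    return result;
}

/* Accessor for the module-internal counter. */
int calls_so_far(void)
{
    return module_calls;
}

/* Every call performs a real read of hardware_flag. */
int poll_flag(void)
{
    return hardware_flag;
}
```

Calling next_id() three times returns 1, 2 and 3 because id lives at a fixed location, whereas step is created afresh on every call.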

7. Array

segment .data
; define array of 10 double words initialized to 1,2,..,10
a1   dd   1, 2, 3, 4, 5, 6, 7, 8, 9, 10
; define array of 10 words initialized to 0
a2   dw  0, 0, 0, 0, 0, 0, 0, 0, 0, 0
; same as before using TIMES
a3   times 10 dw 0
; define array of bytes with 200 0’s and then 100 1’s
a4   times 200 db 0
        times 100 db 1

segment .bss
; define an array of 10 uninitialized double words
a5  resd 10
; define an array of 100 uninitialized words
a6  resw 100
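For comparison, roughly equivalent C definitions are sketched below. The fixed-width types assume dd is 4 bytes, dw is 2 bytes and db is 1 byte; init_a4 is an invented helper, since C has no direct counterpart to two consecutive TIMES lines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

uint32_t a1[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};  /* a1 dd 1, 2, ..., 10 */
uint16_t a3[10] = {0};                              /* a3 times 10 dw 0 */
uint8_t  a4[300];                                   /* filled by init_a4 */
uint32_t a5[10];                                    /* a5 resd 10 */
uint16_t a6[100];                                   /* a6 resw 100 */

/* Fill a4 with 200 zeros followed by 100 ones. */
void init_a4(void)
{
    memset(a4, 0, 200);
    memset(a4 + 200, 1, 100);
}
```

One difference worth noting: C guarantees that a5 and a6 start out as zero, while resd and resw only reserve space.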

8. Use lea to calculate addresses of local variables

Finding the address of a label defined in the data or bss segments is
simple. Basically, the linker does this. However, calculating the address
of a local variable (or parameter) on the stack is not as straightforward.
However, this is a very common need when calling subroutines. Consider
the case of passing the address of a variable (let’s call it x) to a function
(let’s call it foo). If x is located at EBP − 8 on the stack, one cannot just
use:
mov    eax, ebp - 8
Why? The value that MOV stores into EAX must be computed by the assembler
(that is, it must in the end be a constant). However, there is an
instruction that does the desired calculation. It is called LEA (for Load
Effective Address). The following would calculate the address of x and store
it into EAX:
lea    eax, [ebp - 8]
Now EAX holds the address of x and could be pushed on the stack when
calling function foo. Do not be confused, it looks like this instruction is
reading the data at [EBP−8]; however, this is not true. The LEA instruction
never reads memory! It only computes the address that would be read
by another instruction and stores this address in its first register operand.
Since it does not actually read any memory, no memory size designation
(e.g. dword) is needed or allowed.
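The C side of this situation is simply taking the address of a local variable. In the sketch below, foo and x follow the names used above; the EBP - 8 offset is the example's assumption, and a real compiler may place x elsewhere. The lea_model function is an invented illustration of the point that LEA is pure arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* foo receives the address of a variable and writes through it. */
void foo(int *p)
{
    *p = 42;
}

int caller(void)
{
    int x = 0;   /* automatic variable in the caller's stack frame */

    foo(&x);     /* &x is what "lea eax, [ebp - 8]" plus "push eax" would supply */
    return x;
}

/* LEA only does arithmetic on register values; it never accesses memory. */
uint32_t lea_model(uint32_t ebp, int32_t displacement)
{
    return ebp + (uint32_t)displacement;
}
```

Here caller() returns 42 because foo wrote through the address, and lea_model(0x1000, -8) gives 0x0FF8 without dereferencing anything.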

9. rep movsb byte ptr es:[edi], byte ptr ds:[esi]

The combination of a LODSx and a STOSx instruction is very common. In fact, this combination can be performed
by a single MOVSx string instruction. For example, a LODSD followed by a STOSD in a copy loop could be replaced
with a single MOVSD instruction with the same effect. The only difference
would be that the EAX register would not be used at all in the loop.

ESI is used for reading and
EDI for writing. It is easy to remember this if one remembers that SI stands
for Source Index and DI stands for Destination Index. Next, notice that the
register that holds the data is fixed (either AL, AX or EAX). Finally, note
that the storing instructions use ES to determine the segment to write to,
not DS.

The 80x86 family provides a special instruction prefix called REP that
can be used with the above string instructions. This prefix tells the CPU
to repeat the next string instruction a specified number of times. The ECX
register is used to count the iterations (just as for the LOOP instruction).
Using the REP prefix, the loop could be replaced with a single line:
rep movsd

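So the instruction in the heading means: copy a string one byte at a time from DS:[ESI] (data segment) to ES:[EDI] (extra segment), repeating ECX times. ESI and EDI are advanced after each byte (incremented when the direction flag is clear).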
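The behavior of rep movsb can be modeled in C as a counted byte-copy loop. This is a sketch under two assumptions: a flat address space (the DS and ES segments are ignored) and a clear direction flag, so both pointers move upward. The function name and the pointer-based parameters are invented; they stand in for ESI, EDI and ECX, which the instruction updates as it runs.

```c
#include <assert.h>
#include <stddef.h>

/* Model of "rep movsb" with the direction flag clear. */
void rep_movsb(const unsigned char **esi, unsigned char **edi, size_t *ecx)
{
    assert(esi != NULL && edi != NULL && ecx != NULL);
    while (*ecx != 0) {
        **edi = **esi;   /* movsb: copy one byte from [ESI] to [EDI] */
        (*esi)++;        /* ESI advances by 1 */
        (*edi)++;        /* EDI advances by 1 */
        (*ecx)--;        /* REP decrements ECX and stops at 0 */
    }
}
```

After the call, the counter is 0 and both pointers sit just past the copied bytes, which mirrors the register state after the real instruction.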






