Intel-Assembly-Manual-note

Just My Note For Intel Assembly Manual


The source article: The Intel Assembly Manual

In this article I will answer two questions:

  • what is real mode?
  • what is protected mode?

Real Mode

Architecture

Real mode is the oldest mode. In real mode, everything is 16 bits. Memory is addressed through a 20-bit address bus, which limits addressable memory to 1 MB. Any memory installed above this limit is unusable in real mode.

Segments

In real mode, memory is accessed through segments. Each pointer consists of a segment, a 16-bit value equal to the segment's starting address divided by 16, and a 16-bit offset, which describes how far from the start of the segment we go. Here is a simple example:

0000:0000 ; memory address -> 0
0001:0001 ; memory address -> 0x1 * 16 + 0x1 = 0x11
FFFF:000F ; memory address -> 0xFFFFF, the maximum available address; with only 20 address lines, an offset of 0010h or more wraps around to zero.
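
The segment:offset arithmetic can be checked with a minimal Python sketch. The function name `real_mode_address` and the `address_bits` parameter are illustrative assumptions; the 20-bit mask models the wrap-around on a CPU with only 20 address lines.

```python
def real_mode_address(segment: int, offset: int, address_bits: int = 20) -> int:
    """Translate a real-mode segment:offset pair to a physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    linear = segment * 16 + offset
    # Keep only the bits that fit on the address bus (assumed 20 lines).
    return linear & ((1 << address_bits) - 1)


print(hex(real_mode_address(0x0000, 0x0000)))  # 0x0
print(hex(real_mode_address(0x0001, 0x0001)))  # 0x11
print(hex(real_mode_address(0xFFFF, 0x000F)))  # 0xfffff
print(hex(real_mode_address(0xFFFF, 0x0010)))  # 0x0 (wrapped around)
```

Note that many different segment:offset pairs map to the same physical address, for example 0000:0010 and 0001:0000.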

All segments have read/write/execute access from anywhere: any program can read, write, or execute code within any segment. This is a security problem, because any program can read from or write to any part of memory, including the part in which the OS resides. That is why a real mode OS is a single-tasking OS, and why you have to reboot if one application crashes.

Registers

Real mode registers are 16 bits, and they include:

  • Four general-purpose registers: AX, BX, CX, DX. The upper 8 bits of each can be accessed as AH, BH, CH, DH and the lower 8 bits as AL, BL, CL, DL.
  • A register to hold the offset of the currently executing code: IP.
  • Four registers to be used as pointers: SI, DI, BP, SP. SP points to the current top of the stack (it cannot be used as an index like the rest). Each time we push something onto the stack, SP decreases. On POP, SP increases. These registers have no 8-bit halves.
  • Four registers to contain segments: CS, which always holds the segment of the currently executing code, DS, ES and SS. SS holds the segment of the stack memory, DS holds the segment of the data, and ES is an auxiliary register.

So the code is always executing at CS:IP, and the top of the stack is pointed to by SS:SP.
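
The way SP moves on PUSH and POP can be illustrated with a toy Python model. The class `RealModeStack` and its 1 MB `bytearray` are assumptions made for this sketch only; it models 16-bit pushes with an even SP and nothing else.

```python
class RealModeStack:
    """Toy model of a 16-bit stack addressed through SS:SP."""

    def __init__(self, ss: int, sp: int):
        self.memory = bytearray(1 << 20)  # 1 MB of simulated memory
        self.ss = ss
        self.sp = sp

    def _address(self) -> int:
        # Physical address of the current top of the stack.
        return (self.ss * 16 + self.sp) & 0xFFFFF

    def push(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF  # SP decreases first
        a = self._address()
        self.memory[a:a + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self) -> int:
        a = self._address()
        value = int.from_bytes(self.memory[a:a + 2], "little")
        self.sp = (self.sp + 2) & 0xFFFF  # SP increases afterwards
        return value


stack = RealModeStack(ss=0x2000, sp=0x0100)
stack.push(0x1234)
print(hex(stack.sp))     # 0xfe
print(hex(stack.pop()))  # 0x1234
print(hex(stack.sp))     # 0x100
```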

The 386 CPU adds more registers, also accessible in real mode:

  • 32 bit extensions to the non segment registers: EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, EIP.
  • Two more auxiliary segment registers, GS and FS.
  • Control registers CR0, CR2 and CR3 (CR1 is reserved; CR4 was added on later CPUs).
  • 6 debug registers, DR0, DR1, DR2, DR3, DR6, DR7, used for hardware breakpoints.

DS is the default data segment, unless another segment is specified or BP or SP is used:

ASM

mov ax,[100] ; gets value from DS:100
mov ax,[si] ; gets value from DS:SI
mov ax,[es:si] ; from ES:SI
; When BP or SP is used, SS is the default.

ESI, EDI, EBP and ESP can be used as pointers. If their high 16 bits are not zero, so that the offset goes past the 64 KB segment limit, then an exception occurs (unless you are in Unreal mode).

For string instructions that store data (movsb, stosw, etc.), with or without a REP prefix, DI is used as the destination index and ES is its segment.

Models

Because of the segmented memory, different sets of programming models were created, which mostly resulted in incompatibilities between compilers and libraries. C pointers were described as near or far, depending on whether they included a segment or not:

  • The tiny model. Everything has to be included in a single segment (COM file). Pointers are near.
  • The small model. One segment for the code, one for the data. All pointers are near.
  • The medium model. One data segment, multiple code segments. Code pointers far, data pointers near.
  • The compact model. One code segment, multiple data segments. Code pointers near, data pointers far.
  • The large model. Multiple code and data segments, code and data pointers far. Single data structures still limited to 64KB.
  • The huge model. Multiple code and data segments, all pointers far. Unlike the large model, single data structures may exceed 64KB.

Benefits

The only benefit in real mode is that you have DOS and BIOS functions available as software interrupts. Therefore, all techniques used by DOS extenders (which allowed applications to run in protected mode) involved temporarily switching to real mode to call DOS.

Here is a quick hello world in tiny model:

ASM

org 0x100 ; code starts at offset 100h
use16               ; use 16-bit code
mov ax,0900h
mov dx,Msg
int 21h
mov ax,4c00h
int 21h
Msg db "Hello World!$"

This very simple program calls two DOS functions. The first is function 9 (selected through the AH register), which accepts in DS:DX a pointer to the string to be written to the screen (DS already holds the right segment, because this is a COM file). The string must end with a '$' character. The second is function 4Ch, which terminates the program.
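
To see why loading AX with 0900h or 4C00h selects function 9 or 4Ch, the split of AX into AH and AL can be sketched in Python. The helper name `split_ax` is an assumption made for illustration.

```python
def split_ax(ax: int) -> tuple:
    """Return (AH, AL) for a 16-bit AX value."""
    ax &= 0xFFFF
    return (ax >> 8) & 0xFF, ax & 0xFF


for value in (0x0900, 0x4C00):
    ah, al = split_ax(value)
    print(f"AX={value:04X}h -> AH={ah:02X}h AL={al:02X}h")
# AX=0900h -> AH=09h AL=00h
# AX=4C00h -> AH=4Ch AL=00h
```

For function 4Ch, AL carries the exit code of the program, so 4C00h exits with code 0.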

Here is the same application in EXE format:

ASM

FORMAT MZ                    ; DOS 16-bit EXE format
ENTRY CODE16:Main            ; Specify the entry point (i.e. the start address)
STACK STACK16:stackdata      ; Specify the initial SS:SP (stack segment and top of stack)

SEGMENT CODE16_2 USE16       ; Declare a 16-bit segment
    ShowMsg:
        mov ax,DATA16
        mov ds,ax            ; Load DS with our "default data segment"
        mov ax,0900h
        mov dx,Msg
        int 21h              ; Call a DOS function: AX = 0900h (Show Message),
                             ; DS:DX = address of a buffer, int 21h = show message
        retf                 ; FAR return; we were called from
                             ; another segment so we must pop IP and CS.

SEGMENT CODE16 USE16         ; Declare a 16-bit segment
    ORG 0                    ; Says that the offset of the first opcode
                             ; of this segment must be 0.
    Main:
        call CODE16_2:ShowMsg ; Direct far call to a procedure in another segment.
                              ; CS/IP are pushed to the stack.
        mov ax,4c00h          ; Call a DOS function: AX = 4c00h (Exit), int 21h = exit
        int 21h

SEGMENT DATA16 USE16
    Msg db "Hello World!$"

SEGMENT STACK16 USE16
        dw 1024 dup (0)       ; use 2048 bytes as stack. When the program is initialized,
    stackdata:                ; SS and SP are automatically set. The label sits at the end
                              ; of the reserved area because the stack grows downward.

How does the assembler know the actual values of the DATA16, CODE16, CODE16_2, and STACK16 segments? It doesn’t, because they depend on where the program is loaded. Instead it writes placeholder values and adds entries to the EXE file (known as “relocations”), so that the loader, once it has copied the code to memory, patches the true segment values in at the specified addresses. And because this relocation table is stored in the EXE header, which COM files do not have, COM files cannot have multiple segments even if they sum to less than 64KB in total.
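
The patching step can be sketched in Python. This is a simplified model, assuming each relocation entry is a (segment, offset) pair relative to the start of the loaded image and that the loader adds the load segment to the 16-bit word found there; the function name `apply_relocations` and the sample bytes are hypothetical.

```python
def apply_relocations(image: bytearray, relocations, load_segment: int) -> None:
    """Patch segment words in place, roughly as a DOS EXE loader would.

    relocations: iterable of (segment, offset) pairs relative to the image start.
    """
    for seg, off in relocations:
        pos = seg * 16 + off
        value = int.from_bytes(image[pos:pos + 2], "little")
        value = (value + load_segment) & 0xFFFF
        image[pos:pos + 2] = value.to_bytes(2, "little")


# Hypothetical image: "mov ax, imm16" (opcode B8h) whose operand is a segment
# number relative to the start of the program (here 0002h).
image = bytearray([0xB8, 0x02, 0x00])
apply_relocations(image, [(0, 1)], load_segment=0x1000)
print(image.hex())  # b80210, i.e. mov ax,1002h
```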

This program calls a function ShowMsg in another segment via a far call, which uses a DOS function (09h, INT 21h) to display text.

Problems

  • If multiple applications are running, one application can overwrite any other without any notification.
  • Only up to 1MB of memory is addressable, and the upper 384K are reserved for the BIOS and hardware, so only 640K are available.
  • Mixing far and near pointers between applications and libraries led to incompatibilities and, usually, crashes.
  • If something goes wrong, the PC has to be rebooted.

Segmented Protected Mode

Architecture

Protected Mode solves the Real Mode problems. In particular:

  • Memory up to 16 MB (286) and 4 GB (386) is directly accessible.
  • Memory accesses are checked, and segments can be given access protections and a protection level.
  • If something goes wrong, the problem can be isolated and the rest of the applications are not affected.

Protected Mode introduces rings, that is, levels of privilege. There are four rings (Ring 0, 1, 2, 3). Ring 0 is the most privileged and Ring 3 the least privileged. Code running in a less privileged ring cannot directly access code in a more privileged ring.

Memory

In protected mode, a segment no longer has a fixed size; it can be any size from 1 byte to 4 GB. Each segment has its own access restrictions (read, write, execute) and its own protection ring.

Registers

The same set of registers that exist in real mode are available. Also, every 32-bit general-purpose register can be used as an index; for example, mov ax,[ebx] will work.

Problems

While you can access all the memory directly, you still have to manage a lot of segments, and task switching and transitions between rings are slow.

Additional Knowledge

Global Descriptor Table

The Global Descriptor Table (GDT) is a set of entries that describes all segments for the CPU. Each entry is 8 bytes long and has the following format:

Bits     Meaning
0-15     Limit low 16 bits
16-31    Base low 16 bits
32-39    Base middle 8 bits
40       Ac
41       RW
42       DC
43       Ex
44       S
45-46    Priv
47       Pr
48-51    Limit upper 4 bits
52-53    Reserved (0)
54       Sz
55       Gr
56-63    Base upper 8 bits

  • The base is a 32-bit value that indicates the address in memory at which this segment starts.

  • The limit is a 20-bit value indicating the length of the segment, in units that depend on the Gr bit. If the Gr bit is 0, the limit is counted in bytes. If the Gr bit is 1, the limit is counted in 4 KB pages, so the actual limit is the limit value * 4096.

  • The Ex flag is 1, to indicate a code segment, or 0, to indicate a data segment.

  • The DC flag has different meaning, depending on the Ex flag:

    • For code segment (Ex = 1), if DC is 0 then the segment is non conforming. A non conforming segment can only be called from a segment with the same privilege level. If DC is 1 then the segment is conforming and can also be called from segments with lower privilege. For example, a ring 2 conforming segment can be called from a ring 3 segment; the called code keeps running at the caller's privilege level.
    • For data segment (Ex = 0), if DC is 0 then the data segment expands up, else it expands down. For an expand-down segment, the valid offsets are those above the limit (up to the maximum offset) instead of those from 0 up to the limit. This flag was created so a stack segment could be easily expanded, but it is not used today.
  • The RW flag has different meaning, depending on the Ex flag:

    • For code segment (Ex = 1), if 0, then the segment is not readable. If 1, then the code segment is readable.

    • For data segment (Ex = 0), if 0, segment is read only, else read-write.

      Note that a code segment is not writable. However, because segment base addresses can overlap, you can create a writable data segment with the same base address and limit of a code segment.

  • The Priv field indicates the privilege level (ring) of the segment (00 to 11).

  • The Ac bit indicates access. The CPU sets this bit when the segment is accessed, so the OS gets an idea of how frequently the segment is accessed and can decide whether to swap it out to disk or not.

  • The S bit must be 1 for code and data segments, and 0 for system segments (see below).

  • The Pr bit is set to 1 to indicate that the segment is present in memory. If the OS swaps this segment out to disk, it sets Pr to 0. Any attempt to access the removed segment causes an exception. The OS catches this exception and reloads the segment into memory, setting Pr to 1 again.

  • The Sz bit can have two values:

    • 0, in which case the default for opcodes is 16-bit. The segment can still execute 32-bit commands (386+) by putting the 0x66 (operand size) or 0x67 (address size) prefix before them.
    • 1 (386+), in which case the default for opcodes is 32-bit. The segment can still execute 16-bit commands by putting the 0x66 or 0x67 prefix before them.
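
The bit layout above can be made concrete with a small decoder. This is a minimal sketch in Python, not part of the source article: the function name `decode_descriptor` and the dictionary keys are assumptions, and the sample bytes describe a commonly used flat 4 GB ring 0 code segment.

```python
def decode_descriptor(raw: bytes) -> dict:
    """Decode one 8-byte code/data descriptor into named fields."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes long")
    d = int.from_bytes(raw, "little")  # bit 0 is the lowest bit of byte 0
    limit = (d & 0xFFFF) | (((d >> 48) & 0xF) << 16)
    base = ((d >> 16) & 0xFFFFFF) | (((d >> 56) & 0xFF) << 24)
    fields = {
        "base": base,
        "limit": limit,
        "Ac": (d >> 40) & 1,
        "RW": (d >> 41) & 1,
        "DC": (d >> 42) & 1,
        "Ex": (d >> 43) & 1,
        "S": (d >> 44) & 1,
        "Priv": (d >> 45) & 0b11,
        "Pr": (d >> 47) & 1,
        "Sz": (d >> 54) & 1,
        "Gr": (d >> 55) & 1,
    }
    # The limit is the last valid unit, so the size is one unit larger.
    units = limit + 1
    fields["size_bytes"] = units * 4096 if fields["Gr"] else units
    return fields


flat_code = bytes.fromhex("ffff0000009acf00")
info = decode_descriptor(flat_code)
print(info["base"], hex(info["limit"]), info["Ex"], info["Priv"], info["size_bytes"])
# 0 0xfffff 1 0 4294967296
```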

In real mode, the segment registers (CS, DS, ES, SS, FS, GS) specify a real mode segment. You can put any value in them, no matter where it points, and you can read, write and execute from that segment. In protected mode, these registers are loaded with selectors. The selectors are indices into the GDT and have the following format:

Bits     Meaning
0-1      RPL. Requested privilege level, must be numerically equal to or lower than the segment's privilege level.
2        0 to take the entry from the GDT, 1 from the LDT (see below)
3-15     0-based index into the table.
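
Splitting a selector into its parts is a matter of a few shifts and masks. The sketch below assumes the layout of a 2-bit RPL in bits 0-1, the table bit in bit 2, and the index in bits 3-15; the function name `decode_selector` is hypothetical.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit protected-mode selector into its three fields."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return {
        "rpl": selector & 0b11,                       # bits 0-1
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2
        "index": selector >> 3,                       # bits 3-15
    }


print(decode_selector(0x08))  # {'rpl': 0, 'table': 'GDT', 'index': 1}
print(decode_selector(0x1B))  # {'rpl': 3, 'table': 'GDT', 'index': 3}
```

Because each table entry is 8 bytes, clearing the low three bits of a selector gives the byte offset of the entry within its table.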

In protected mode, you can’t just load arbitrary values into the segment registers like in real mode. You must load valid selectors or you will get an exception. The one special case is the first entry in the GDT, which is always set to 0. The CPU does not read information from entry 0, and thus it is considered a “dummy” entry. This allows the programmer to put the value 0 into a segment register (DS, ES, FS, GS) without causing an exception.

The GDT is loaded into the CPU by executing the LGDT instruction, which points to a 6-byte array:

  • Bytes 0-1 contain the size of the GDT in bytes, minus 1. The maximum is 64KB => 8192 entries.
  • Bytes 2-5 contain the address in memory of the first entry of the GDT (the physical address when paging is disabled).
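
The 6-byte array can be built as follows. This is a sketch under the assumption that bytes 0-1 hold the table size in bytes minus 1 (a 16-bit field, hence at most 8192 eight-byte entries); the function name `make_gdtr` and the sample address are made up for illustration.

```python
import struct


def make_gdtr(base: int, entry_count: int) -> bytes:
    """Build the 6-byte operand for LGDT: 16-bit limit, then 32-bit base."""
    if not 1 <= entry_count <= 8192:
        raise ValueError("a GDT holds between 1 and 8192 entries")
    limit = entry_count * 8 - 1  # last valid byte offset inside the table
    return struct.pack("<HI", limit, base)


gdtr = make_gdtr(base=0x00001000, entry_count=3)
print(len(gdtr), gdtr.hex())  # 6 170000100000
```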

Interrupts

Each entry of the interrupt table is now 8 bytes long, with the following structure:

ASM

struc IDT_STR ofs0_15,sel,zero,flags,ofs16_31
{
 .ofs0_15 dw ofs0_15
 .sel dw sel
 .zero db zero
 .flags db flags            ; bit 7: P, bits 5-6: DPL, bit 4: 0, bits 0-3: gate type
 .ofs16_31 dw ofs16_31
}

Each interrupt also has a protection level. The LIDT instruction has the same functionality as in real mode, pointing to a 6-byte array (containing the size and the physical location of the first entry).
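
An 8-byte interrupt table entry can be packed with a few lines of Python. This sketch assumes the flags byte holds the present bit in bit 7, the DPL in bits 5-6 and the gate type in bits 0-3 (1110 for a 32-bit interrupt gate); the function name `make_idt_gate` and the handler address are hypothetical.

```python
import struct


def make_idt_gate(offset: int, selector: int, gate_type: int = 0xE,
                  dpl: int = 0, present: bool = True) -> bytes:
    """Pack one 8-byte IDT entry: offset low, selector, zero, flags, offset high."""
    flags = (int(present) << 7) | ((dpl & 0b11) << 5) | (gate_type & 0xF)
    return struct.pack("<HHBBH",
                       offset & 0xFFFF,          # offset bits 0-15
                       selector,                 # code segment selector
                       0,                        # always zero
                       flags,
                       (offset >> 16) & 0xFFFF)  # offset bits 16-31


gate = make_idt_gate(offset=0x00123456, selector=0x08)
print(gate.hex())  # 56340800008e1200
```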

After the LIDT command is executed, real mode interrupts no longer work, so a real mode debugger is useless.

Local Descriptor Table

The Local Descriptor Table (LDT) is a way for each application, in multitasking scenarios, to have a private set of segments, loaded with the LLDT assembly instruction. Bit 2 of the selector specifies whether the segment is loaded from the GDT or from the LDT.

System Segments in the GDT

When the S bit in a GDT entry is 0, the entry describes a system-related segment. There are four kinds of system entries:

  • Task Segments
  • Call Gates
  • Interrupt Gates
  • Trap Gates (same as interrupt gates, except that interrupts stay enabled while a trap handler runs)

Bits 40-43 in a GDT entry have the following meaning:

  • 0000 - Reserved
  • 0001 - Available 16-bit TSS
  • 0010 - Local Descriptor Table (LDT)
  • 0011 - Busy 16-bit TSS
  • 0100 - 16-bit Call Gate
  • 0101 - Task Gate
  • 0110 - 16-bit Interrupt Gate
  • 0111 - 16-bit Trap Gate
  • 1000 - Reserved
  • 1001 - Available 32-bit TSS
  • 1010 - Reserved
  • 1011 - Busy 32-bit TSS
  • 1100 - 32-bit Call Gate
  • 1101 - Reserved
  • 1110 - 32-bit Interrupt Gate
  • 1111 - 32-bit Trap Gate
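
The type list can be turned into a small lookup. The sketch below assumes the descriptor is given as a 64-bit integer with the S bit at bit 44 and the type in bits 40-43; the names `SYSTEM_TYPES` and `describe_entry` are illustrative.

```python
SYSTEM_TYPES = {
    0b0001: "Available 16-bit TSS",
    0b0010: "Local Descriptor Table (LDT)",
    0b0011: "Busy 16-bit TSS",
    0b0100: "16-bit Call Gate",
    0b0101: "Task Gate",
    0b0110: "16-bit Interrupt Gate",
    0b0111: "16-bit Trap Gate",
    0b1001: "Available 32-bit TSS",
    0b1011: "Busy 32-bit TSS",
    0b1100: "32-bit Call Gate",
    0b1110: "32-bit Interrupt Gate",
    0b1111: "32-bit Trap Gate",
}


def describe_entry(descriptor: int) -> str:
    """Name the kind of entry described by a 64-bit descriptor value."""
    if (descriptor >> 44) & 1:  # S = 1: ordinary code or data segment
        return "code or data segment"
    return SYSTEM_TYPES.get((descriptor >> 40) & 0xF, "Reserved")


print(describe_entry(0x89 << 40))  # Available 32-bit TSS
print(describe_entry(0x8E << 40))  # 32-bit Interrupt Gate
print(describe_entry(0x9A << 40))  # code or data segment
```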

Call Gates

Call gates are a mechanism to switch from low-privilege code to higher-privilege code, used for user-level code to call system-level code. You specify a 1100 type entry in the GDT with the following format:

C++

struct CALLGATE
{
    unsigned short offs0_15;
    unsigned short selector;
    unsigned char argnum:5;   // number of arguments to copy to the stack from the current stack
    unsigned char r:3;        // Reserved
    unsigned char type:5;     // 01100 (S bit = 0, type = 1100)
    unsigned char dpl:2;      // DPL of this gate
    unsigned char P:1;        // Present bit
    unsigned short offs16_31;
};

Using CALL FAR with the selector of this call gate (the offset is ignored) will switch to the gate and execute the higher-privilege code. The CPU pushes the caller's SS and ESP onto the new stack; if argnum specifies parameters to be copied, it copies them from the old stack next, and finally it pushes CS and EIP. Using RETF will return from the gate call.
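
As a cross-check of the structure, a 32-bit call gate entry can be packed byte by byte in Python. This is a sketch with assumed names (`make_call_gate` and its parameters); the target offset and selector are made-up values.

```python
import struct


def make_call_gate(offset: int, selector: int, argnum: int = 0,
                   dpl: int = 3, present: bool = True) -> bytes:
    """Pack an 8-byte 32-bit call gate (S = 0, type = 1100)."""
    if not 0 <= argnum <= 31:
        raise ValueError("argnum is a 5-bit field")
    access = (int(present) << 7) | ((dpl & 0b11) << 5) | 0b01100
    return struct.pack("<HHBBH",
                       offset & 0xFFFF,          # offs0_15
                       selector,                 # target code segment selector
                       argnum,                   # low 5 bits; upper 3 reserved
                       access,                   # P, DPL, S = 0, type = 1100
                       (offset >> 16) & 0xFFFF)  # offs16_31


gate = make_call_gate(offset=0x00401000, selector=0x08, argnum=2, dpl=3)
print(gate.hex())  # 0010080002ec4000
```

A DPL of 3 is used as the default here so that ring 3 code is allowed to call through the gate.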

Call gates are slow mechanisms to transit between rings in the CPU.

TSS Descriptors, Task Gates and Hardware Multitasking

Having the ability to hold Task Segments and Local Descriptor Tables in the GDT, CPUs provide hardware support for task switching. The Task State Segment (TSS) is where the CPU saves information about a task (the current registers). Executing a far JMP or CALL (offsets are ignored, as with call gates) with a selector pointing to a TSS descriptor in the GDT will “switch” to that task, restoring its saved registers. The TSS descriptor specifies the base address and limit of the TSS from which the new CPU state is loaded.

The CPU has a register named the Task Register (TR), which tells which TSS will receive the old CPU state. When TR is loaded with an LTR instruction, the CPU looks at the GDT entry specified by the LTR operand and loads the visible part of TR with the selector, and the hidden part with the base and limit from the GDT entry. When the CPU state is saved, the hidden part of TR is used.

In addition to the far CALL and JMP, a context switch can be triggered by using a Task Gate Descriptor. Unlike TSS descriptors, task-gate descriptors can be in the GDT, LDT or IDT (so you can force a task switch when an interrupt occurs).
