Ch02 Memory Addressing
This chapter explains how 80 x 86 microprocessors address memory chips and how Linux uses the available addressing circuits.
2.1 Memory Addresses
Three kinds of addresses:
* Logical address: Included in machine language instructions to specify the address of an operand or of an instruction.
* Linear address (also known as virtual address): A single 32-bit unsigned integer that can address up to 4 GB, that is, up to 4,294,967,296 memory cells. Linear addresses are usually represented in hexadecimal notation; their values range from 0x00000000 to 0xffffffff.
* Physical address: Used to address memory cells in memory chips. Physical addresses correspond to the electrical signals sent along the address pins of the microprocessor to the memory bus. They are represented as 32-bit or 36-bit unsigned integers.
The Memory Management Unit (MMU) transforms a logical address into a linear address by means of a hardware circuit called a segmentation unit; subsequently, a second hardware circuit called a paging unit transforms the linear address into a physical address.
Logical address --Segmentation Unit--> Linear address --Paging Unit--> Physical address
In multiprocessor systems, a hardware circuit called a memory arbiter serializes concurrent accesses to the RAM chips. The dual Pentium, for instance, maintains a two-port arbiter at each chip entrance and requires that the two CPUs exchange synchronization messages before attempting to use the common bus. From the programming point of view, the arbiter is hidden because it is managed by hardware circuits.
2.2 Segmentation in Hardware
Starting with the 80286 model, Intel microprocessors perform address translation in two different ways, called real mode and protected mode.
Real mode exists mostly to maintain processor compatibility with older models and to allow the operating system to bootstrap.
2.2.1 Segment Selectors and Segmentation Registers
A logical address consists of two parts: a segment identifier and an offset that specifies the relative address within the segment. The segment identifier is a 16-bit field called the Segment Selector.
The Segment Selector fields are:
- Bits 15-3: index
- Bit 2: TI (Table Indicator)
- Bits 1-0: RPL (Requestor Privilege Level)
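The bit layout above can be unpacked with shifts and masks. The following is a minimal sketch, not kernel code: the struct, the function names, and the sample value 0x002b are made up for illustration, and it assumes the usual 80 x 86 convention that TI = 0 selects the GDT and TI = 1 selects the LDT.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative container for the three Segment Selector fields. */
struct selector_fields {
    uint16_t index; /* bits 15-3: entry number in the descriptor table */
    uint8_t  ti;    /* bit 2: Table Indicator (assumed 0 = GDT, 1 = LDT) */
    uint8_t  rpl;   /* bits 1-0: Requestor Privilege Level */
};

/* Split a 16-bit Segment Selector into its fields. */
static struct selector_fields decode_selector(uint16_t selector)
{
    struct selector_fields f;
    f.index = (uint16_t)(selector >> 3);
    f.ti    = (uint8_t)((selector >> 2) & 0x1);
    f.rpl   = (uint8_t)(selector & 0x3);
    return f;
}

/* Usage: 0x002b is binary 0000000000101 0 11, an arbitrary sample value. */
void selector_example(void)
{
    struct selector_fields f = decode_selector(0x002b);
    assert(f.index == 5); /* 0x2b >> 3 */
    assert(f.ti == 0);    /* descriptor lives in the GDT */
    assert(f.rpl == 3);   /* lowest privilege */
}
```

Because the index occupies the upper 13 bits, clearing the low three bits of a selector yields the byte offset of the descriptor inside its table.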
To make it easy to retrieve segment selectors quickly, the processor provides segmentation registers whose only purpose is to hold Segment Selectors; these registers are called cs, ss, ds, es, fs, and gs.
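- cs: code segment register, which points to a segment containing program instructions.
- ss: stack segment register, which points to a segment containing the current program stack.
- ds: data segment register, which points to a segment containing global and static data.
The remaining three registers (es, fs, and gs) are general purpose and may refer to arbitrary data segments.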
The cs register has another important function: it includes a 2-bit field that specifies the Current Privilege Level (CPL) of the CPU. The value 0 denotes the highest privilege level, while the value 3 denotes the lowest one. Linux uses only levels 0 and 3, which are respectively called Kernel Mode and User Mode.
2.2.2 Segment Descriptors
Each segment is represented by an 8-byte Segment Descriptor that describes the segment characteristics. Segment Descriptors are stored either in the Global Descriptor Table (GDT) or in the Local Descriptor Table (LDT).
The address and size of the GDT in main memory are contained in the gdtr control register, while the address and size of the currently used LDT are contained in the ldtr control register.
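There are several types of segments, and thus several types of Segment Descriptors. The types widely used in Linux are:
- Code Segment Descriptor
- Data Segment Descriptor
- Task State Segment Descriptor (TSSD)
- Local Descriptor Table Descriptor (LDTD)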
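These notes do not spell out the fields inside the 8 bytes, so the sketch below relies on the standard 80 x 86 descriptor layout as an assumption: Base in bits 16-39 and 56-63, Limit in bits 0-15 and 48-51, the S flag in bit 44, DPL in bits 45-46, and the granularity flag G in bit 55. The function names and the sample descriptor value (a flat 4 GB, privilege level 0 code segment) are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit Base, scattered over bits 16-39 and 56-63 of the descriptor. */
static uint32_t desc_base(uint64_t d)
{
    return (uint32_t)(((d >> 16) & 0xffffff) | (((d >> 56) & 0xff) << 24));
}

/* 20-bit Limit, scattered over bits 0-15 and 48-51. */
static uint32_t desc_limit(uint64_t d)
{
    return (uint32_t)((d & 0xffff) | (((d >> 48) & 0xf) << 16));
}

/* G flag (bit 55): when set, the Limit counts 4096-byte units. */
static int desc_granularity(uint64_t d) { return (int)((d >> 55) & 0x1); }

/* Descriptor Privilege Level (bits 45-46). */
static unsigned desc_dpl(uint64_t d) { return (unsigned)((d >> 45) & 0x3); }

/* S flag (bit 44): cleared for system segments such as a TSS or an LDT. */
static int desc_is_system(uint64_t d) { return ((d >> 44) & 0x1) == 0; }

/* Segment size in bytes, taking the G flag into account. */
static uint64_t desc_size_bytes(uint64_t d)
{
    uint64_t units = (uint64_t)desc_limit(d) + 1;
    return desc_granularity(d) ? units * 4096 : units;
}

/* Usage with an illustrative flat code segment descriptor. */
void descriptor_example(void)
{
    uint64_t d = 0x00cf9a000000ffffULL;
    assert(desc_base(d) == 0);
    assert(desc_limit(d) == 0xfffff);
    assert(desc_granularity(d) == 1);
    assert(desc_dpl(d) == 0);
    assert(!desc_is_system(d));
    assert(desc_size_bytes(d) == 0x100000000ULL); /* 4 GB */
}
```

The split Base and Limit fields are a leftover of the 80286 descriptor format, which is why the decoding needs two shifts per field.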
2.2.3 Fast Access to Segment Descriptors
We recall that logical addresses consist of a 16-bit Segment Selector and a 32-bit Offset, and that segmentation registers store only the Segment Selector.
To speed up the translation of logical addresses into linear addresses, the 80 x 86 processor provides an additional nonprogrammable register (one that cannot be set by a programmer) for each of the six programmable segmentation registers.
Each nonprogrammable register contains the 8-byte Segment Descriptor (described in the previous section) specified by the Segment Selector contained in the corresponding segmentation register.
Every time a Segment Selector is loaded in a segmentation register, the corresponding Segment Descriptor is loaded from memory into the matching nonprogrammable CPU register.
From then on, translations of logical addresses referring to that segment can be performed without accessing the GDT or LDT stored in main memory; the processor refers directly to the CPU register containing the Segment Descriptor.
For instance, if the GDT is at 0x00020000 (the value stored in the gdtr register) and the index specified by the Segment Selector is 2, the address of the corresponding Segment Descriptor is 0x00020000 + (2 x 8), or 0x00020010.
The first entry of the GDT is always set to 0. This ensures that logical addresses with a null Segment Selector will be considered invalid, thus causing a processor exception. The maximum number of Segment Descriptors that can be stored in the GDT is 8,191 (i.e., 2^13-1).
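The arithmetic in the two paragraphs above can be checked with a few lines of C. This is a sketch with made-up helper names; it assumes a 32-bit table base as in the 0x00020000 example and treats a selector as null when its index and TI fields are both 0, regardless of RPL.

```c
#include <assert.h>
#include <stdint.h>

#define DESCRIPTOR_SIZE 8u    /* bytes per Segment Descriptor */
#define GDT_SLOTS       8192u /* a 13-bit index addresses 2^13 slots */

/* Address of the descriptor named by a selector: base + index * 8. */
static uint32_t descriptor_address(uint32_t table_base, uint16_t selector)
{
    uint32_t index = (uint32_t)(selector >> 3);
    return table_base + index * DESCRIPTOR_SIZE;
}

/* A null selector points at GDT slot 0, which never holds a valid segment. */
static int is_null_selector(uint16_t selector)
{
    return (selector & 0xfffc) == 0; /* index == 0 and TI == 0 */
}

/* Usage: reproduces the worked example from the text. */
void gdt_example(void)
{
    uint16_t selector = 2 << 3; /* index 2, TI 0, RPL 0 */
    assert(descriptor_address(0x00020000, selector) == 0x00020010);
    assert(is_null_selector(0x0000));
    assert(!is_null_selector(selector));
    assert(GDT_SLOTS - 1 == 8191); /* slot 0 is reserved */
}
```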
2.2.4 The Segmentation Unit
To translate a logical address into the corresponding linear address, the segmentation unit performs the following operations:
- Examines the TI field of the Segment Selector to determine which Descriptor Table stores the Segment Descriptor.
- Computes the address of the Segment Descriptor from the index field of the Segment Selector. The index field is multiplied by 8 (the size of a Segment Descriptor), and the result is added to the content of the gdtr or ldtr register.
- Adds the offset of the logical address to the Base field of the Segment Descriptor, thus obtaining the linear address.
Notice that, thanks to the nonprogrammable registers associated with the segmentation registers, the first two operations need to be performed only when a segmentation register has been changed.
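The division of labor between loading a segmentation register and translating an address can be modeled in software. The sketch below is a toy model, not how hardware or Linux implements it: the struct and function names are invented, descriptor tables are plain arrays of 64-bit values, the Base field is assumed to sit in bits 16-39 and 56-63 of the descriptor, and limit and privilege checks are left out. A counter records how often a descriptor table is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One segmentation register: visible selector plus hidden descriptor copy. */
struct seg_register {
    uint16_t selector;   /* programmable part */
    uint64_t descriptor; /* nonprogrammable part, filled on load */
};

/* Toy CPU state: the tables that gdtr and ldtr would point to. */
struct cpu_model {
    const uint64_t *gdt;
    const uint64_t *ldt;
    unsigned table_reads; /* number of descriptor fetches from memory */
};

/* Base field of a descriptor (assumed layout: bits 16-39 and 56-63). */
static uint32_t descriptor_base(uint64_t d)
{
    return (uint32_t)(((d >> 16) & 0xffffff) | (((d >> 56) & 0xff) << 24));
}

/* First two operations: done only when a segmentation register changes. */
static void load_segment(struct cpu_model *cpu, struct seg_register *reg,
                         uint16_t selector)
{
    const uint64_t *table = (selector & 0x4) ? cpu->ldt : cpu->gdt; /* TI */
    assert(table != NULL);
    reg->selector = selector;
    reg->descriptor = table[selector >> 3]; /* same as base + index * 8 bytes */
    cpu->table_reads++;
}

/* Third operation: done on every access, using only the cached copy. */
static uint32_t to_linear(const struct seg_register *reg, uint32_t offset)
{
    return descriptor_base(reg->descriptor) + offset;
}

/* Usage: two translations after one load cost a single table read. */
void segmentation_example(void)
{
    uint64_t gdt[3] = { 0, 0, (uint64_t)0x10 << 32 }; /* slot 2: Base 0x00100000 */
    struct cpu_model cpu = { gdt, NULL, 0 };
    struct seg_register ds;

    load_segment(&cpu, &ds, 2 << 3);
    assert(to_linear(&ds, 0x1234) == 0x00101234u);
    assert(to_linear(&ds, 0x0008) == 0x00100008u);
    assert(cpu.table_reads == 1);
}
```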
2.3 Segmentation in Linux
Segmentation can assign a different linear address space to each process, while paging can map the same linear address space into different physical address spaces.
All Linux processes running in User Mode use the same pair of segments to address instructions and data. These segments are called user code segment and user data segment.
Similarly, all Linux processes running in Kernel Mode use the same pair of segments to address instructions and data: they are called kernel code segment and kernel data segment.
The corresponding Segment Selectors are defined by the macros __USER_CS, __USER_DS, __KERNEL_CS, and __KERNEL_DS, respectively. To address the kernel code segment, for instance, the kernel just loads the value yielded by the __KERNEL_CS macro into the cs segmentation register.
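To see what values such macros could yield, the sketch below composes selectors from an index, a TI bit, and an RPL. The GDT slot numbers (12 to 15) are an assumption taken from 32-bit Linux 2.6 and differ in other kernel versions and architectures; the EX_ names and make_selector() are hypothetical stand-ins, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed GDT slots (32-bit Linux 2.6); illustrative names. */
enum {
    EX_KERNEL_CS_SLOT = 12,
    EX_KERNEL_DS_SLOT = 13,
    EX_USER_CS_SLOT   = 14,
    EX_USER_DS_SLOT   = 15
};

/* Compose a Segment Selector: index in bits 15-3, TI in bit 2, RPL in bits 1-0. */
static uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti < 2 && rpl < 4);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Kernel segments use RPL 0, user segments RPL 3; all four live in the GDT. */
static uint16_t ex_kernel_cs(void) { return make_selector(EX_KERNEL_CS_SLOT, 0, 0); }
static uint16_t ex_kernel_ds(void) { return make_selector(EX_KERNEL_DS_SLOT, 0, 0); }
static uint16_t ex_user_cs(void)   { return make_selector(EX_USER_CS_SLOT, 0, 3); }
static uint16_t ex_user_ds(void)   { return make_selector(EX_USER_DS_SLOT, 0, 3); }

/* Usage: the resulting selector values under the assumed slot numbers. */
void linux_selectors_example(void)
{
    assert(ex_kernel_cs() == 0x60); /* 12 * 8 */
    assert(ex_kernel_ds() == 0x68); /* 13 * 8 */
    assert(ex_user_cs()   == 0x73); /* 14 * 8 + 3 */
    assert(ex_user_ds()   == 0x7b); /* 15 * 8 + 3 */
}
```

The low two bits of the value loaded into cs match the privilege level of the mode: 0 for Kernel Mode and 3 for User Mode.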