Memory Management

Page Frame Management

Page Descriptors

The kernel must keep track of the status of each page frame.

/*
 * Each physical page in the system has a struct page associated with
 * it to keep track of whatever it is we are using the page for at the
 * moment. Note that we have no way to track which tasks are using
 * a page, though if it is a pagecache page, rmap structures can tell us
 * who is mapping it.
 */

 

All page descriptors are stored in the mem_map array.

 

Several fields of the page descriptor have different meanings, depending on whether the page frame is free and, if not, which kernel component is using it.

 

Flags describing the status of a page frame
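
The status bits are tested and updated through per-flag macros. Below is a minimal sketch of their typical use (PageUptodate(), PageDirty() and SetPageDirty() are real 2.6-era macros; the surrounding helper function is illustrative):

#include <linux/mm.h>
#include <linux/page-flags.h>

/* Illustrative helper: typical use of the page-flag macros. */
static void mark_dirty_if_uptodate(struct page *page)
{
        if (PageUptodate(page) && !PageDirty(page))
                SetPageDirty(page);     /* atomically sets PG_dirty */
}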

 

Extended Paging:

Extended paging allows page frames to be 4 MB instead of 4 KB in size; it is mainly used to translate large contiguous ranges of linear addresses into the corresponding physical ones.

 

 

NUMA

Non-Uniform Memory Access

The Linux kernel supports NUMA by minimizing the number of accesses to costly (remote) nodes and by carefully choosing where the data structures most often accessed by each CPU are stored.

 

Node descriptor  ----  pg_data_t

The physical memory inside each node can be split into several zones.
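
A minimal sketch of walking the node descriptors (for_each_online_node() and NODE_DATA() are the standard kernel accessors; the field printed is illustrative):

#include <linux/kernel.h>
#include <linux/mmzone.h>
#include <linux/nodemask.h>

static void dump_node_sizes(void)
{
        int nid;

        /* Visit the pg_data_t descriptor of every online NUMA node. */
        for_each_online_node(nid) {
                pg_data_t *pgdat = NODE_DATA(nid);

                printk(KERN_INFO "node %d: %lu page frames\n",
                       nid, pgdat->node_present_pages);
        }
}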

 

Memory Zones

Hardware constraints may limit the way page frames can be used.

1.       DMA processors for old ISA buses can address only the first 16 MB of RAM.

2.       In modern 32-bit computers with large amounts of RAM, the CPU cannot directly access all physical memory because the linear address space is too small.

 

To cope with these two limitations, Linux 2.6 partitions the physical memory of every node into three zones:

1.       ZONE_DMA (page frames below 16 MB)

2.       ZONE_NORMAL (16 MB – 896 MB)

3.       ZONE_HIGHMEM (above 896 MB; always empty on 64-bit architectures)

Type --- zone
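
A zone modifier in the allocation request tells the allocator which zones may be used. A minimal sketch (GFP_DMA and GFP_HIGHMEM are the real zone-modifier flags; the function is illustrative):

#include <linux/gfp.h>
#include <linux/mm.h>

static void zone_examples(void)
{
        /* From ZONE_DMA: usable by an old ISA DMA device. */
        struct page *dma_page  = alloc_pages(GFP_KERNEL | GFP_DMA, 0);

        /* From ZONE_NORMAL (falling back to ZONE_DMA if needed). */
        struct page *norm_page = alloc_pages(GFP_KERNEL, 0);

        /* May come from ZONE_HIGHMEM, i.e. above 896 MB. */
        struct page *high_page = alloc_pages(GFP_HIGHMEM, 0);

        if (dma_page)  __free_pages(dma_page, 0);
        if (norm_page) __free_pages(norm_page, 0);
        if (high_page) __free_pages(high_page, 0);
}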

 

The Pool of Reserved Page Frames

Some kernel control paths must never block while requesting memory. This happens, for instance, when handling an interrupt or when executing code inside a critical section.

Atomic memory allocation requests

The kernel reserves a pool of page frames for atomic memory allocation requests, to be drawn on only in low-on-memory conditions.
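
Such requests carry the GFP_ATOMIC flag, which never sleeps and simply fails if no free page frame is available; a minimal sketch of an allocation that is safe in interrupt context:

#include <linux/gfp.h>

/* Sketch: atomic allocation, e.g. inside an interrupt handler. */
static void *grab_buffer_atomic(void)
{
        /* GFP_ATOMIC never blocks; it may dip into the reserved
         * pool and returns 0 (NULL) on failure. */
        return (void *)__get_free_pages(GFP_ATOMIC, 0);
}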

 

The Zoned Page Frame Allocator

 

Kernel Mappings of High-Memory Page Frames

E.g., __get_free_pages(GFP_HIGHMEM, 0)

This function returns the linear address of the assigned page frame. However, page frames in high memory are not mapped into the kernel's linear address space, so for such a frame the function could only return NULL. Worse, the page frame could then never be released, because the kernel would have lost track of it.

 

On 32-bit platforms, Linux adopts the following approach:

1.       High-memory page frames are allocated only through alloc_pages(), which returns the page descriptor pointer rather than a linear address.

2.       The last 128 MB of the kernel linear address space is dedicated to mapping high-memory page frames. The mappings are temporary, and the linear addresses are recycled.

 

Three mechanisms to map high-memory page frames:

1.       Permanent kernel mapping

2.       Temporary kernel mapping

3.       Noncontiguous memory allocation

 

Permanent Kernel Mapping

To establish long-lasting mappings of high-memory page frames into the kernel address space, the kernel uses a dedicated page table called pkmap_page_table.

 

To keep track of the association between high-memory page frames and linear addresses, the kernel makes use of a hash table called page_address_htable, whose entries are of type page_address_map.

 

/*
 * Describes one page->virtual association
 */
struct page_address_map {
        struct page *page;
        void *virtual;
        struct list_head list;
};

 

The kmap() function establishes a permanent kernel mapping. It may block while waiting for a free entry in pkmap_page_table, so it cannot be invoked from interrupt handlers.
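
A minimal sketch of the permanent-mapping pattern in process context (the function is illustrative):

#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/string.h>

/* Sketch: allocate a high-memory page frame, map it, use it, unmap it. */
static void touch_high_page(void)
{
        struct page *page = alloc_pages(GFP_HIGHMEM, 0);
        void *vaddr;

        if (!page)
                return;

        vaddr = kmap(page);             /* may sleep: process context only */
        memset(vaddr, 0, PAGE_SIZE);    /* the frame is now addressable */
        kunmap(page);                   /* release the pkmap entry */

        __free_pages(page, 0);
}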

 

Temporary Kernel Mapping

Requesting a temporary kernel mapping never blocks the current process, so it can be invoked inside an interrupt handler or a deferrable function.

 

enum km_type {
KMAP_D(0)       KM_BOUNCE_READ,
KMAP_D(1)       KM_SKB_SUNRPC_DATA,
KMAP_D(2)       KM_SKB_DATA_SOFTIRQ,
KMAP_D(3)       KM_USER0,
KMAP_D(4)       KM_USER1,
KMAP_D(5)       KM_BIO_SRC_IRQ,
KMAP_D(6)       KM_BIO_DST_IRQ,
KMAP_D(7)       KM_PTE0,
KMAP_D(8)       KM_PTE1,
KMAP_D(9)       KM_IRQ0,
KMAP_D(10)      KM_IRQ1,
KMAP_D(11)      KM_SOFTIRQ0,
KMAP_D(12)      KM_SOFTIRQ1,
KMAP_D(13)      KM_SYNC_ICACHE,
KMAP_D(14)      KM_SYNC_DCACHE,
/* UML specific, for copy_*_user - used in do_op_one_page */
KMAP_D(15)      KM_UML_USERCOPY,
KMAP_D(16)      KM_IRQ_PTE,
KMAP_D(17)      KM_NMI,
KMAP_D(18)      KM_NMI_PTE,
KMAP_D(19)      KM_KDB,
/*
 * Remember to update debug_kmap_atomic() when adding new kmap types!
 */
KMAP_D(20)      KM_TYPE_NR
};

 

Each fix-mapped linear address is identified by an index into enum fixed_addresses:

enum fixed_addresses {
#define FIX_N_COLOURS 8
        FIX_CMAP_BEGIN,
#ifdef CONFIG_MIPS_MT_SMTC
        FIX_CMAP_END = FIX_CMAP_BEGIN + (FIX_N_COLOURS * NR_CPUS * 2),
#else
        FIX_CMAP_END = FIX_CMAP_BEGIN + (FIX_N_COLOURS * 2),
#endif
#ifdef CONFIG_HIGHMEM
        /* reserved pte's for temporary kernel mappings */
        FIX_KMAP_BEGIN = FIX_CMAP_END + 1,
        FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1,
#endif
        __end_of_fixed_addresses
};

 

kmap_atomic()

kunmap_atomic()
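
With the 2.6-era two-argument API, each caller picks one of the km_type slots above; a minimal usage sketch (the helper function is illustrative):

#include <linux/highmem.h>
#include <linux/string.h>

/* Sketch: temporary mapping, usable in interrupt context (2.6-era API). */
static void copy_from_high_page(struct page *page, void *dst, size_t len)
{
        void *vaddr = kmap_atomic(page, KM_USER0);  /* disables preemption */

        memcpy(dst, vaddr, len);
        kunmap_atomic(vaddr, KM_USER0);             /* must pair with kmap_atomic() */
}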

 

The Buddy System Algorithm

The buddy system is the technique Linux adopts to cope with external fragmentation.

 

Related Topics

http://en.wikipedia.org/wiki/Fragmentation_(computer)

 

Fragmentation:

→ External fragmentation

→ Internal fragmentation

→ Data fragmentation

 

Fragmentation can be accepted in return for an increase in speed or simplicity.

 

 

External fragmentation does the most damage to the system.

 

Two ways to avoid external fragmentation:

1.       Use the paging circuitry to map groups of noncontiguous page frames into intervals of contiguous linear addresses. This requires frequent page table modifications, resulting in higher average memory access times.

2.       Develop a suitable technique to keep track of the existing blocks of free contiguous page frames, avoiding as much as possible the need to split up a large free block. Linux 2.6 adopts this second approach, whose key arithmetic is sketched below.
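
A free block of order k has a unique buddy whose page-frame index differs only in bit k, so locating it is a single XOR. A minimal sketch of the arithmetic (function names are illustrative; the kernel performs the same computation in mm/page_alloc.c):

/* Sketch: buddy arithmetic for a block of 2^order page frames;
 * page_idx is the index of the block's first page frame. */
static unsigned long buddy_index(unsigned long page_idx, unsigned int order)
{
        return page_idx ^ (1UL << order);       /* flip bit 'order' */
}

/* Index of the merged (order + 1) block containing both buddies. */
static unsigned long combined_index(unsigned long page_idx, unsigned int order)
{
        return page_idx & ~(1UL << order);
}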

 


Memory Area Management

The Slab Allocator

The slab allocator groups objects into caches. Each cache is a “store” of objects of the same type.

 

 

Cache Descriptor

struct kmem_cache {
        /* cache flags */
        /* slab size */
        /* object size */
        /* number of objects per slab */
        struct kmem_list3 *nodelists[MAX_NUMNODES];
};

 

/*
 * The slab lists for all objects.
 */
struct kmem_list3 {
        struct list_head slabs_partial; /* partial list first, better asm code */
        struct list_head slabs_full;
        struct list_head slabs_free;
        unsigned long free_objects;
        spinlock_t list_lock;
};

Slab Descriptor

 

 

kmem_cache_create() — create a new cache

kmem_cache_destroy() — destroy a cache
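
A minimal usage sketch (the five-argument kmem_cache_create() signature matches later 2.6 kernels; earlier ones also took a destructor argument. struct my_obj and the function names are hypothetical):

#include <linux/errno.h>
#include <linux/slab.h>

struct my_obj {                 /* hypothetical object type */
        int id;
        char name[32];
};

static struct kmem_cache *my_cachep;

static int my_cache_init(void)
{
        my_cachep = kmem_cache_create("my_obj_cache", sizeof(struct my_obj),
                                      0, SLAB_HWCACHE_ALIGN, NULL);
        return my_cachep ? 0 : -ENOMEM;
}

static void my_cache_use(void)
{
        struct my_obj *obj = kmem_cache_alloc(my_cachep, GFP_KERNEL);

        if (obj)
                kmem_cache_free(my_cachep, obj);
}

static void my_cache_exit(void)
{
        kmem_cache_destroy(my_cachep);
}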

 

/proc/slabinfo reports run-time statistics for every active cache.

 

Interfacing the slab allocator with the zoned page frame allocator

kmem_getpages()

kmem_freepages()

Both functions rely on the zoned page frame allocator to obtain or release a group of contiguous page frames.

 

Allocating a Slab to a Cache

A new slab is assigned to a cache only when both of the following hold:

→ A request has been issued to allocate a new object.

→ The cache does not include any free object.

 

Releasing a Slab from a Cache

→ There are too many free objects in the slab cache.

→ A periodically invoked timer determines that there are fully unused slabs that can be released.

 

Object Descriptor

/*
 * kmem_bufctl_t:
 *
 * Bufctl's are used for linking objs within a slab
 * linked offsets.
 *
 * This implementation relies on "struct page" for locating the cache &
 * slab an object belongs to.
 * This allows the bufctl structure to be small (one int), but limits
 * the number of objects a slab (not a cache) can contain when off-slab
 * bufctls are used. The limit is the size of the largest general cache
 * that does not use off-slab slabs.
 * For 32bit archs with 4 kB pages, is this 56.
 * This is not serious, as it is only for large objects, when it is unwise
 * to have too many per slab.
 * Note: This limit can be raised by introducing a general cache whose size
 * is less than 512 (PAGE_SIZE<<3), but greater than 256.
 */

typedef unsigned int kmem_bufctl_t;
#define BUFCTL_END      (((kmem_bufctl_t)(~0U))-0)
#define BUFCTL_FREE     (((kmem_bufctl_t)(~0U))-1)
#define BUFCTL_ACTIVE   (((kmem_bufctl_t)(~0U))-2)
#define SLAB_LIMIT      (((kmem_bufctl_t)(~0U))-3)

 

Slab descriptors and object descriptors can be stored in two ways:

External: outside the slab, in a general cache.

Internal: inside the slab itself.

 

 

 

Slab Coloring

Goal: achieve the best possible performance of the microprocessor's hardware cache.

Policy: assign arbitrary numbers called colors to the slabs of a cache, thereby shifting the offset of the first object in each slab so that objects of different slabs fall on different cache lines (a sketch of the arithmetic follows).
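
A minimal sketch of the offset arithmetic, assuming the color unit is the L1 cache line size (names are illustrative):

/* Sketch: offset of the first object in a slab with the given color.
 * colour_off is typically the L1 cache line size (e.g. 64 bytes). */
static unsigned int slab_first_object_offset(unsigned int colour,
                                             unsigned int colour_off)
{
        return colour * colour_off;     /* colour 3 with 64-byte lines -> 192 */
}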

 

 

Memory Pools

Difference between memory pools and reserved page frames

Reserved page frames: used only to satisfy atomic memory allocation requests.

Memory pools: a reserve of dynamic memory that can be used only by a specific kernel component in low-on-memory emergencies.

 

Type: mempool_t
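
A minimal sketch built on the slab-backed helpers (mempool_create(), mempool_alloc_slab and mempool_free_slab are the real 2.6 API; my_cachep refers to the hypothetical cache created in the earlier sketch):

#include <linux/errno.h>
#include <linux/mempool.h>
#include <linux/slab.h>

static mempool_t *my_pool;

static int my_pool_init(struct kmem_cache *cachep)
{
        /* Keep at least 16 preallocated objects for emergencies. */
        my_pool = mempool_create(16, mempool_alloc_slab,
                                 mempool_free_slab, cachep);
        return my_pool ? 0 : -ENOMEM;
}

static void my_pool_use(void)
{
        void *obj = mempool_alloc(my_pool, GFP_KERNEL);

        if (obj)
                mempool_free(obj, my_pool);
}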


Questions

1.       Which policies does Linux adopt to avoid external and internal fragmentation?

External fragmentation: the buddy system algorithm.

Internal fragmentation: the slab allocator scheme.

 

 

 

Reposted from: https://my.oschina.net/u/158589/blog/59338
