Linux内存管理之slab分配器分析(一)

最新推荐文章于 2021-05-06 19:37:13 发布

尚先生的博客

最新推荐文章于 2021-05-06 19:37:13 发布

阅读量328

点赞数 1

分类专栏： Linux内存管理

原文链接：http://blog.chinaunix.net/uid-13746440-id-4759681.html

版权

Linux内存管理专栏收录该内容

49 篇文章 14 订阅

订阅专栏

Linux采用了slab来管理小块内存的分配与释放，Slab的提出基于以下两个考虑：

1. 内核函数经常倾向于反复请求相同的数据类型

2. 不同的结构使用不同的分配方法可以提高效率

3. 伙伴系统的频繁申请/释放影响效率，将释放内存放入缓冲区，直至超过阀值再归还给伙伴系统

4. 可以缓解内存碎片的产生，不过产生了内存浪费

Slab分为专用高速缓存（用于存放专用数据结构，skb/mm等）和普通高速缓存（普通的kmalloc之类）。所有高

速缓存区通过链表组织在一起，首节点是cache_chain。普通高速缓存分为(2^0)、(2^1)、(2^2)......，区域的个数

以及大小与系统内存配置以及PAGE_SIZE/L1_CACHE_BYTES/KMALLOC_MAX_SIZE等相关，具体在

linux/kmalloc_sizes.h中定义，每个大小对应两个高速缓存，一个是DMA高速缓存，一个是常规高速缓存。他们

存放在struct cache_sizes malloc_sizes[]中。

Slab把每一个请求的内存称之为对象，并将对象存放在slab中，具体的slab按照空、未满和全满放在不同的链表

中，全部连接至高速缓存中。如下

Slab分配器的相关数据结构：

static DEFINE_MUTEX(cache_chain_mutex);    // 保护相关的链表操作
static struct list_head cache_chain;       // 串联各kemem_cache

/*
 * struct array_cache
 *
 * Purpose:
 * - LIFO ordering, to hand out cache-warm objects from _alloc
 * - reduce the number of linked list operations
 * - reduce spinlock operations
 *
 * The limit is stored in the per-cpu structure to reduce the data cache
 * footprint.
 *
 */
struct array_cache {
    unsigned int avail;        // 当前空闲对象的位置，取值时objp = ac->entry[--ac->avail]
    unsigned int limit;        // 空闲对象上限
    unsigned int batchcount;   // 每次需要申请对象数量或释放的对象数量，和kmem_cache相同
    unsigned int touched;      // 如果从该组中分配了对象，就设置为1
    spinlock_t lock;           
    void *entry[];    /*       // 指向可用的slab对象指针数组
             * Must have this definition in here for the proper
             * alignment of array_cache. Also simplifies accessing
             * the entries.
             */
};

/*
 * The slab lists for all objects.
 */
struct kmem_list3 {
    struct list_head slabs_partial;      /* 部分已分配的链表    partial list first, better asm code */
    struct list_head slabs_full;         /× 已经全部分配的链表 ×/
    struct list_head slabs_free;         /× 尚未分配的链表 ×/
    unsigned long free_objects;          /× partial和free上可用的对象个数 ×/
    unsigned int free_limit;             /× 所有slab上允许未使用的对象最大数目 ×/
    unsigned int colour_next;            /* 下一个slab的颜色 Per-node cache coloring */
    spinlock_t list_lock;
    struct array_cache *shared;          /* 节点内共享 shared per node */
    struct array_cache **alien;          /* 在其他节点上 on other nodes */
    unsigned long next_reap;             /* 尝试两次缓存回收必须经过的时间间隔 updated without locking */
    int free_touched;                    /* 缓存是否活动的，0回收，1申请对象 updated without locking */
};

/*
 * struct slab
 *
 * Manages the objs in a slab. Placed either at the beginning of mem allocated
 * for a slab, or allocated from an general cache.
 * Slabs are chained into three list: fully used, partial, fully free slabs.
 */
struct slab {
    struct list_head list;            /× Slab所在的链表 ×/
    unsigned long colouroff;          /× Slab中第一个对象的偏移 ×/
    void *s_mem;                      /* 第一个对象的地址 including colour offset */
    unsigned int inuse;               /* 被使用的对象的数目 num of objs active in slab */
    kmem_bufctl_t free;               /× 下一个空闲对象的下标 ×/
    unsigned short nodeid;            /× NUMA ID，节点标识号 ×/
};

/*
 * struct kmem_cache
 *
 * manages a cache.
 */

struct kmem_cache {
/* 1) per-cpu data, touched during every alloc/free */
    struct array_cache *array[NR_CPUS];            /× Per_cpu结构，用于节点的申请与释放 ×/
/* 2) Cache tunables. Protected by cache_chain_mutex */
    unsigned int batchcount;                       /× Array中没有空闲对象时，每次为数组分配的对象数目 ×/
    unsigned int limit;                            /× Array中允许的最大对象数目，超出后会将batchcount个对象返回给slab ×/
    unsigned int shared;                           /× 是否共享CPU高速缓存 ×/

    unsigned int buffer_size;                      /* Slab中对象的大小，对象长度+填充字节 */
    u32 reciprocal_buffer_size;                    /× slab中对象大小的倒数，加快计算 ×/
/* 3) touched by every alloc & free from the backend */

    unsigned int flags;                            /* 高速缓存永久化的标志，当前只有CFLGS_OFF_SLAB，用于标识slab管理数据是在slab内还是外 */
    unsigned int num;                              /* 封装在一个单独slab中的对象的数目 # of objs per slab */

/* 4) cache_grow/shrink */
    /* order of pgs per slab (2^n) */
    unsigned int gfporder;                         /× 每个slab包含的页框数，取2为底的对数 ×/

    /* force GFP flags, e.g. GFP_DMA */
    gfp_t gfpflags;                                /* 分配页框时传递给伙伴系统的标志 */

    size_t colour;                                 /* 缓存的着色块个数 cache colouring range */
    unsigned int colour_off;                       /* Slab基本的对齐偏移，基本偏移量×着色块个数得到的绝对偏移量 colour offset */
    struct kmem_cache *slabp_cache;                /* Slab描述符在外部时使用，其指向的高速缓存用于存储描述符；在内部时为NULL */
    unsigned int slab_size;                        /× Slab管理区大小，包含slab对象和kmem_buffctl_t数组 ×/
    unsigned int dflags;                           /* 描述高速缓存的动态属性，目前没有使用 dynamic flags */

    /* constructor func */
    void (*ctor)(void *obj);                       /× 创造高速缓存时的构造函数指针 ×/

/* 5) cache creation/removal */
    const char *name;                              /× Cache的名字 ×/
    struct list_head next;                         /× 链接高速缓存的双向指针至cache_chain上 ×/

/* 6) statistics */
#ifdef CONFIG_DEBUG_SLAB
    unsigned long num_active;
    unsigned long num_allocations;
    unsigned long high_mark;
    unsigned long grown;
    unsigned long reaped;
    unsigned long errors;
    unsigned long max_freeable;
    unsigned long node_allocs;
    unsigned long node_frees;
    unsigned long node_overflow;
    atomic_t allochit;                                /× Cache的命中次数，在分配中更新 ×/
    atomic_t allocmiss;                               /× Cache的未命中次数，在分配中更新 ×/
    atomic_t freehit;
    atomic_t freemiss;

    /*
     * If debugging is enabled, then the allocator can add additional
     * fields and/or padding to every object. buffer_size contains the total
     * object size including these internal fields, the following two
     * variables contain the offset to the user object and its size.
     */
    int obj_offset;                                    /× 对象间的偏移 ×/
    int obj_size;                                      /× 对象本身的大小 ×/
#endif /* CONFIG_DEBUG_SLAB */

    /*
     * We put nodelists[] at the end of kmem_cache, because we want to size
     * this array to nr_node_ids slots instead of MAX_NUMNODES
     * (see kmem_cache_init())
     * We still use [MAX_NUMNODES] and not [1] or [0] because cache_cache
     * is statically defined, so we reserve the max number of nodes.
     */
    struct kmem_list3 *nodelists[MAX_NUMNODES];       /× 存放所有NUMA节点对应的相关数据，每个节点拥有自己的数据 ×/
    /*
     * Do not add fields after nodelists[]
     */
};

借鉴网友的图片，来加深下理解：

图基本说明了各结构之间的关联关系，具体申请和释放的过程，后面还会继续分析，结合图更容易理解。

另，下面一段摘抄自http://blog.csdn.net/hs794502825/article/details/7981524

每个slab首部都有一个小区域是不用的，称为“着色区”。着色区的大小使Slab中的每个对象的起始地址都按高速缓存中的缓存行（cache line）大小进行对齐（80386的一级高速缓存行大小为16字节，Pentium为32字节）。因为Slab是由1个页面或多个页面（最多为32）组成，因此每个Slab都是从一个页面边界开始的，它自然按高速缓存的缓冲行对齐。但是，Slab中的对象大小不确定，设置着色区的目的就是将Slab中第一个对象的起始地址往后推到与缓冲行对齐的位置。因为一个缓冲区中有多个Slab，因此，应该把每个缓冲区中的各个Slab着色区的大小尽量安排成不同的大小，这样可以使得在不同的Slab中，处于同一相对位置的对象，让它们在高速缓存中的起始地址相互错开，这样就可以改善高速缓存的存取效率。

每个Slab上最后一个对象以后也有个小小的废料区是不用的，这是对着色区大小的补偿，其大小取决于着色区的大小，以及Slab与其每个对象的相对大小。但该区域与着色区的总和对于同一种对象的各个Slab是个常数。

每个对象的大小基本上是所需数据结构的大小。只有当数据结构的大小不与高速缓存中的缓冲行对齐时，才增加若干字节使其对齐。所以，一个Slab上的所有对象的起始地址都必然是按高速缓存中的缓冲行对齐的。

与传统的内存管理模式相比， slab 缓存分配器提供了很多优点。首先，内核通常依赖于对小对象的分配，它们会在系统生命周期内进行无数次分配。slab 缓存分配器通过对类似大小的对象进行缓存而提供这种功能，从而避免了常见的碎片问题。slab 分配器还支持通用对象的初始化，从而避免了为同一目而对一个对象重复进行初始化。最后，slab 分配器还可以支持硬件缓存对齐和着色，这允许不同缓存中的对象占用相同的缓存行，从而提高缓存的利用率并获得更好的性能。

尚先生的博客

关注

1
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
Linux内存管理之slab分配器分析(一)

Linux采用了slab来管理小块内存的分配与释放，Slab的提出基于以下两个考虑：1.内核函数经常倾向于反复请求相同的数据类型2.不同的结构使用不同的分配方法可以提高效率3. 伙伴系统的频繁申请/释放影响效率，将释放内存放入缓冲区，直至超过阀值再归还给伙伴系统4. 可以缓解内存碎片的产生，不过产生了内存浪费Slab分为专用高速缓存（用于存放专用数据结构，skb/mm等）和普通高速缓存（普通的kmalloc之类）。所有高速缓存区通过链表组织在一起，首节点是cache_cha...
复制链接

扫一扫