Slab source analysis: the setup_cpu_cache() function

FreeeLinux

Category: Linux Kernel Analysis. Tags: cache

Article link: https://blog.csdn.net/FreeeLinux/article/details/54571251

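I previously walked through slab initialization and kmem_cache_create(), but left setup_cpu_cache() unexplained. This article analyzes it.

Note: in this article a "cache" means a kmem_cache structure, and the "slab three lists" (list3) means kmem_list3.

setup_cpu_cache() is tightly coupled to the initialization state of the slab allocator. During initialization the slab allocator passes through the following states: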

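g_cpucache_up state	Meaning
NONE	Neither the array_cache (AC) cache nor the list3 cache exists yet; static substitutes are still in use
PARTIAL_AC	The cache for struct arraycache_init (the per-CPU local cache) has been constructed
PARTIAL_L3	The cache for struct kmem_list3 (the three lists) has been constructed
FULL	All general caches have been constructed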

First, a word about struct arraycache_init, which I have not covered before.

/*
 * bootstrap: The caches do not work without cpuarrays anymore, but the
 * cpuarrays are allocated from the generic caches...
 */
#define BOOT_CPUCACHE_ENTRIES   1
struct arraycache_init {
    struct array_cache cache;
    void *entries[BOOT_CPUCACHE_ENTRIES];
};

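That is all there is to it. struct array_cache ends in a flexible array member, which is only a placeholder: a bare array_cache contains no entries storage at all. A usable local cache needs both the header and the entry slots, so arraycache_init wraps an array_cache together with an entries[] array. The object that the local-cache cache really has to hand out is therefore an arraycache_init. Early in initialization these structures are statically allocated, with BOOT_CPUCACHE_ENTRIES slots each.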
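
To make the sizing point concrete, here is a minimal userspace sketch. The struct is a simplified stand-in for the kernel's array_cache, and ac_memsize() and ac_alloc() are hypothetical helpers written for this illustration, not kernel functions. It only shows that the header alone reserves no slots, so the allocation has to add them explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct array_cache (assumption). */
struct array_cache {
    unsigned int avail;
    unsigned int limit;
    unsigned int batchcount;
    unsigned int touched;
    void *entry[];          /* flexible array member: reserves no slots */
};

/* Bytes needed for one header plus `entries` pointer slots. */
static size_t ac_memsize(size_t entries)
{
    return sizeof(struct array_cache) + entries * sizeof(void *);
}

/* Hypothetical helper: allocate a local cache with room for `entries` objects. */
static struct array_cache *ac_alloc(unsigned int entries, unsigned int batchcount)
{
    struct array_cache *ac;

    assert(entries > 0);    /* a local cache without slots is useless */
    ac = malloc(ac_memsize(entries));
    if (!ac)
        return NULL;
    ac->avail = 0;
    ac->limit = entries;
    ac->batchcount = batchcount;
    ac->touched = 0;
    return ac;
}
```

The static arraycache_init wrapper plays the same role at boot time, when no allocator is available yet: it supplies the header and BOOT_CPUCACHE_ENTRIES slots in one statically allocated object.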

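The main steps of initialization, all performed inside kmem_cache_init(), are:

(1) The kmem_cache instance cache_cache (statically allocated) is set up, together with the slab allocator for kmem_cache objects; it is organized by initkmem_list3[0], and its array cache is initarray_cache.
(2) The kmem_cache instance that manages arraycache_init is set up, together with the slab allocator for arraycache_init objects; it is organized by initkmem_list3[1], and its array cache is initarray_generic.
(3) The kmem_cache instance that manages kmem_list3 is set up. The slab allocator for kmem_list3 objects does not exist yet, but the first request for sizeof(struct kmem_list3) bytes will build it; it is organized by initkmem_list3[2], and its array cache is obtained through kmalloc.
(4) A kmem_cache instance is built for each remaining element of malloc_sizes. For each one a kmem_list3 is allocated to organize its slab lists, and an arraycache_init is allocated to serve per-CPU allocations from that kmem_cache.
(5) The arraycache_init instances under cache_cache and malloc_sizes[INDEX_AC].cs_cachep are replaced.
(6) The kmem_list3 instances under cache_cache, malloc_sizes[INDEX_AC].cs_cachep and malloc_sizes[INDEX_L3].cs_cachep are replaced.
(7) g_cpucache_up = FULL;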

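Question:

Why are two static arraycache_init instances, initarray_cache and initarray_generic, needed? Aren't they statically initialized to the same contents?

Because initarray_cache is the local cache for the cache_cache cache, while initarray_generic is the local cache for the arraycache_init cache. Although their static initializers are identical, each is eventually replaced by memory obtained from kmalloc and then serves as the local cache of a different cache, so the two cannot be shared.

Next, let's step through the kmem_cache_create() calls made by kmem_cache_init(), since setup_cpu_cache() is called from kmem_cache_create().

First, two definitions: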

#define INDEX_AC index_of(sizeof(struct arraycache_init))
#define INDEX_L3 index_of(sizeof(struct kmem_list3))

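INDEX_AC and INDEX_L3 are the indices of the size classes that fit struct arraycache_init and struct kmem_list3 respectively. They are not the sizes themselves; they are used to look up the matching entries in the malloc_sizes[] table.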
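
The lookup can be pictured with the following sketch. The size classes are hypothetical (the real list depends on the kernel configuration), and index_of_sketch() is a runtime illustration written for this article; the kernel's index_of() is resolved at compile time.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical size classes; ULONG_MAX terminates the table, as in malloc_sizes[]. */
static const unsigned long size_classes[] = {
    32, 64, 128, 256, 512, 1024, ULONG_MAX
};

/* Return the index of the smallest class that can hold `size` bytes, or -1. */
static int index_of_sketch(unsigned long size)
{
    int i;

    for (i = 0; size_classes[i] != ULONG_MAX; i++)
        if (size <= size_classes[i])
            return i;
    return -1;  /* larger than every class in this table */
}
```

Two structures of different sizes can map to the same index. That is exactly the INDEX_AC == INDEX_L3 case that setup_cpu_cache() checks for: one general cache then serves both arraycache_init and kmem_list3.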

First call: construct the cache for arraycache_init.

sizes[INDEX_AC].cs_cachep =    kmem_cache_create(names[INDEX_AC].name,
                    sizes[INDEX_AC].cs_size,
                    ARCH_KMALLOC_MINALIGN,
                    ARCH_KMALLOC_FLAGS|SLAB_PANIC,   // #define ARCH_KMALLOC_FLAGS SLAB_HWCACHE_ALIGN, i.e. request hardware cache line alignment
                    NULL, NULL);

At its tail, kmem_cache_create() calls setup_cpu_cache(), which takes this branch:

    // Reaching this point means we are still in the initialization phase.
    // g_cpucache_up records how far initialization has progressed: PARTIAL_AC means the cache
    // holding struct arraycache_init has been created, and PARTIAL_L3 means the cache holding
    // struct kmem_list3 has been created. Note the order in which these two caches are created.
    // During initialization only the boot CPU's local cache and the slab three lists are configured.
    // If g_cpucache_up is NONE, the cache of size sizeof(struct arraycache_init) does not exist yet;
    // this branch runs while that very cache is being created.
    // The general cache that would hold struct arraycache_init is not usable yet, so the statically
    // allocated global initarray_generic has to serve as the local cache.
    if (g_cpucache_up == NONE) {
        /*
         * Note: the first kmem_cache_create must create the cache
         * that's used by kmalloc(24), otherwise the creation of
         * further caches will BUG().
         */
        cachep->array[smp_processor_id()] = &initarray_generic.cache; // the arraycache_init cache does not exist yet, so use the static one

        /*
         * If the cache that's used by kmalloc(sizeof(kmem_list3)) is
         * the first cache, then we need to set up all its list3s,
         * otherwise the creation of further caches will BUG().
         */
         // The cache holding struct kmem_list3 is created after the cache holding struct arraycache_init,
         // so it cannot exist yet either, and the global initkmem_list3 must be used as well.

         // #define SIZE_AC 1: this associates the arraycache_init cache with initkmem_list3[1] for the first time;
         // the lists are populated on the next call
        set_up_list3s(cachep, SIZE_AC);

        // At this point the cache holding struct arraycache_init has been set up.
        // If struct kmem_list3 and struct arraycache_init fall in the same size class, no separate cache
        // is needed, so g_cpucache_up advances one step further
        if (INDEX_AC == INDEX_L3)
            g_cpucache_up = PARTIAL_L3;  // advance the bootstrap state
        else
            g_cpucache_up = PARTIAL_AC;
    }

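The first call to kmem_cache_create() populates initkmem_list3[0]: those lists now hold the slabs of kmem_cache objects.

This is also the first time setup_cpu_cache() runs. It assigns initkmem_list3[1] to the kmem_cache that matches arraycache_init. The slab allocator (three lists) for arraycache_init is not built yet, so the first request for sizeof(struct arraycache_init) bytes will hang the arraycache_init slabs on the lists of initkmem_list3[1].

Second call: construct the cache for kmem_list3 (the three lists).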


    if (INDEX_AC != INDEX_L3) {
    // If struct kmem_list3 and struct arraycache_init map to different kmalloc size indices, i.e. fall in different size classes,
    // create a separate cache for struct kmem_list3; otherwise the two share one cache
        sizes[INDEX_L3].cs_cachep =
            kmem_cache_create(names[INDEX_L3].name,
                sizes[INDEX_L3].cs_size,
                ARCH_KMALLOC_MINALIGN,
                ARCH_KMALLOC_FLAGS|SLAB_PANIC,
                NULL, NULL);
    }

setup_cpu_cache() takes this branch:

    else {
        // g_cpucache_up is at least PARTIAL_AC here: the general cache holding struct arraycache_init is up, so it can be allocated with kmalloc.
        cachep->array[smp_processor_id()] =
            kmalloc(sizeof(struct arraycache_init), GFP_KERNEL);

        // the cache holding struct kmem_list3 is not ready yet, so the global static three lists are still needed
        if (g_cpucache_up == PARTIAL_AC) {
            set_up_list3s(cachep, SIZE_L3);
            g_cpucache_up = PARTIAL_L3;
        }
    }

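The second call to kmem_cache_create() populates initkmem_list3[1], whose lists now hold the slabs of arraycache_init objects.

To spell this out: by the second call the kmem_cache for arraycache_init is initialized, but its slab allocator (three lists) is still empty. setup_cpu_cache() now requests sizeof(struct arraycache_init) bytes through kmalloc, and that request fills the arraycache_init allocator in the same way the kmem_cache allocator was filled earlier. The main difference is that kmem_cache_create() ends by calling setup_cpu_cache(), which updates g_cpucache_up to mark the stage of initialization.

At this point kmem_cache_init() executes one statement: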

slab_early_init = 0;

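By now we have accomplished the following:

The kmem_cache instance cache_cache is set up, together with the slab allocator for kmem_cache objects; it is organized by initkmem_list3[0], and its array cache is initarray_cache.
The kmem_cache instance that manages arraycache_init is set up, together with the slab allocator for arraycache_init objects; it is organized by initkmem_list3[1], and its array cache is initarray_generic.
The kmem_cache instance that manages kmem_list3 is set up. The slab allocator for kmem_list3 objects does not exist yet, but the first request for sizeof(struct kmem_list3) bytes will build it; it is organized by initkmem_list3[2], and its array cache is obtained through kmalloc. At this point all three lists, including those from the first two steps, are still the static kmem_list3 instances, but that is already enough to create caches of other sizes.

Third stage: create the caches for the other sizes
kmem_cache_init() now builds kmem_cache instances for the remaining sizes in malloc_sizes. This is the third kind of setup_cpu_cache() call: because the kmem_caches for arraycache_init and kmem_list3 are already constructed, everything is obtained through kmalloc, and the static initarray_cache, initarray_generic and initkmem_list3 data are no longer used.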

    // sizes starts at malloc_sizes[0]; the first cs_size is probably 32
    while (sizes->cs_size != ULONG_MAX) {  // create the general cache for each kmalloc size class; ULONG_MAX marks the end of the table
        /*
         * For performance, all the general caches are L1 aligned.
         * This should be particularly beneficial on SMP boxes, as it
         * eliminates "false sharing".
         * Note for systems short on memory removing the alignment will
         * allow tighter packing of the smaller caches.
         */
        if (!sizes->cs_cachep) {   
            sizes->cs_cachep = kmem_cache_create(names->name,
                    sizes->cs_size,
                    ARCH_KMALLOC_MINALIGN,
                    ARCH_KMALLOC_FLAGS|SLAB_PANIC,
                    NULL, NULL);
        }
#ifdef CONFIG_ZONE_DMA   // with DMA zones configured, each size class gets two kmem_caches: one DMA, one normal
        sizes->cs_dmacachep = kmem_cache_create(
                    names->name_dma,
                    sizes->cs_size,
                    ARCH_KMALLOC_MINALIGN,
                    ARCH_KMALLOC_FLAGS|SLAB_CACHE_DMA|
                        SLAB_PANIC,
                    NULL, NULL);
#endif
        sizes++;   // sizes and names walk their arrays in step, creating the general caches from smallest to largest until the ULONG_MAX sentinel
        names++;
    }

The setup_cpu_cache() call made here runs as follows:

    if (g_cpucache_up == NONE) {
        ... /* the state is not NONE, so control goes to the else branch */
    } else {
        // g_cpucache_up is at least PARTIAL_AC here: the general cache holding struct arraycache_init is up, so it can be allocated with kmalloc.
        cachep->array[smp_processor_id()] =
            kmalloc(sizeof(struct arraycache_init), GFP_KERNEL);

        // the cache holding struct kmem_list3 is not ready yet, so the global static three lists are still needed
        if (g_cpucache_up == PARTIAL_AC) {
            set_up_list3s(cachep, SIZE_L3);
            g_cpucache_up = PARTIAL_L3;
        } else {
            // getting here means the caches holding struct kmem_list3 and struct arraycache_init both exist, so no static globals are needed
            int node;
            for_each_online_node(node) {
                // allocate the struct kmem_list3 object with kmalloc
                cachep->nodelists[node] =
                    kmalloc_node(sizeof(struct kmem_list3),
                        GFP_KERNEL, node);
                BUG_ON(!cachep->nodelists[node]);
                // initialize the slab three lists
                kmem_list3_init(cachep->nodelists[node]);
            }
        }
    }

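Note that the function nests an if-else inside the else. When the other general caches are created, it allocates the arraycache_init directly with kmalloc, and because g_cpucache_up is already PARTIAL_L3 (the second step created the cache for the three lists), it also kmallocs and initializes the three lists of the cache being created.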
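
The branch structure discussed so far can be condensed into a small userspace model. This is only a sketch of the control flow: the enum source values, struct setup_result and setup_cpu_cache_model() are names invented for this illustration, and same_index stands in for the INDEX_AC == INDEX_L3 test.

```c
#include <assert.h>
#include <stddef.h>

enum cpucache_state { NONE, PARTIAL_AC, PARTIAL_L3, FULL };

/* Where a new cache's per-CPU array and its list3 come from (model only). */
enum source { SRC_STATIC, SRC_KMALLOC, SRC_ENABLE_CPUCACHE };

struct setup_result {
    enum source array_src;
    enum source list3_src;
};

/* Mirrors the if/else nesting of setup_cpu_cache() and advances *state
 * the way the kernel advances g_cpucache_up. */
static struct setup_result setup_cpu_cache_model(enum cpucache_state *state,
                                                 int same_index)
{
    struct setup_result r;

    assert(state != NULL);
    if (*state == FULL) {
        /* bootstrap is over: enable_cpucache() sizes everything */
        r.array_src = r.list3_src = SRC_ENABLE_CPUCACHE;
        return r;
    }
    if (*state == NONE) {
        r.array_src = SRC_STATIC;       /* initarray_generic */
        r.list3_src = SRC_STATIC;       /* initkmem_list3[SIZE_AC] */
        *state = same_index ? PARTIAL_L3 : PARTIAL_AC;
    } else {
        r.array_src = SRC_KMALLOC;      /* arraycache_init cache is usable */
        if (*state == PARTIAL_AC) {
            r.list3_src = SRC_STATIC;   /* initkmem_list3[SIZE_L3] */
            *state = PARTIAL_L3;
        } else {
            r.list3_src = SRC_KMALLOC;  /* kmalloc_node + kmem_list3_init */
        }
    }
    return r;
}
```

Calling the model three times from NONE with same_index == 0 reproduces the three stages above: static/static, then kmalloc/static, then kmalloc/kmalloc for every later cache.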

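So when kmem_cache_init() later replaces all the static objects, the three lists of these ordinary general caches need no replacement. Only the three lists used in the first two steps have to be replaced.

The final replacement in kmem_cache_init() is shown below. Memory is obtained through kmalloc, and initarray_cache, initarray_generic and the initkmem_list3[] entries are all eventually replaced: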

/* 4) Replace the bootstrap head arrays */
    {
        struct array_cache *ptr;

        // allocate an arraycache_init to replace the static initarray_cache
        ptr = kmalloc(sizeof(struct arraycache_init), GFP_KERNEL);  // GFP_KERNEL: the allocation may sleep

        // disable interrupts
        local_irq_disable();
        BUG_ON(cpu_cache_get(&cache_cache) != &initarray_cache.cache);
        memcpy(ptr, cpu_cache_get(&cache_cache),
               sizeof(struct arraycache_init));  // copy cache_cache's array_cache for this CPU into ptr
        /*
         * Do not assume that spinlocks can be initialized via memcpy:
         */
        spin_lock_init(&ptr->lock);

        cache_cache.array[smp_processor_id()] = ptr;  // point this CPU's slot at the kmalloc'ed copy
        local_irq_enable();

        ptr = kmalloc(sizeof(struct arraycache_init), GFP_KERNEL);

        local_irq_disable();
        BUG_ON(cpu_cache_get(malloc_sizes[INDEX_AC].cs_cachep)
               != &initarray_generic.cache);
        memcpy(ptr, cpu_cache_get(malloc_sizes[INDEX_AC].cs_cachep),
               sizeof(struct arraycache_init));
        /*
         * Do not assume that spinlocks can be initialized via memcpy:
         */
        spin_lock_init(&ptr->lock);

        malloc_sizes[INDEX_AC].cs_cachep->array[smp_processor_id()] =
            ptr;
        local_irq_enable();
    }
    /* 5) Replace the bootstrap kmem_list3's */
    {
        int nid;

        /* Replace the static kmem_list3 structures for the boot cpu */
        init_list(&cache_cache, &initkmem_list3[CACHE_CACHE], node);

        for_each_online_node(nid) {
            init_list(malloc_sizes[INDEX_AC].cs_cachep,
                  &initkmem_list3[SIZE_AC + nid], nid);

            if (INDEX_AC != INDEX_L3) {
                init_list(malloc_sizes[INDEX_L3].cs_cachep,
                      &initkmem_list3[SIZE_L3 + nid], nid);
            }
        }
    }

    /* 6) resize the head arrays to their final sizes */
    {
        struct kmem_cache *cachep;
        mutex_lock(&cache_chain_mutex);
        list_for_each_entry(cachep, &cache_chain, next)
            if (enable_cpucache(cachep))
                BUG();
        mutex_unlock(&cache_chain_mutex);
    }

    /* Annotate slab for lockdep -- annotate the malloc caches */
    init_lock_keys();


    /* Done! */
    g_cpucache_up = FULL;

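That is how kmem_cache_init(), kmem_cache_create() and setup_cpu_cache() fit together. The g_cpucache_up enumeration tells setup_cpu_cache() which stage initialization has reached, and each stage sets up the caches differently.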
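
The "replace the bootstrap head arrays" step boils down to copy, then repoint. The sketch below models it in userspace for a single CPU. The types are simplified stand-ins, boot_ac, cpu_slot and replace_boot_array() are hypothetical names, and the interrupt disabling and spinlock re-initialization that the kernel performs are deliberately left out.

```c
#include <stdlib.h>
#include <string.h>

#define BOOT_CPUCACHE_ENTRIES 1

/* Simplified header: the flexible array member is omitted in this model. */
struct array_cache_hdr {
    unsigned int avail, limit, batchcount, touched;
};

struct arraycache_init {
    struct array_cache_hdr cache;
    void *entries[BOOT_CPUCACHE_ENTRIES];
};

/* Static bootstrap instance, analogous to initarray_cache. */
static struct arraycache_init boot_ac = {
    { 0, BOOT_CPUCACHE_ENTRIES, 1, 0 }, { NULL }
};

/* The slot through which a cache finds its local cache (one CPU in this model). */
static struct arraycache_init *cpu_slot = &boot_ac;

/* Copy the bootstrap contents to the heap and repoint the slot. Returns 0 on success. */
static int replace_boot_array(void)
{
    struct arraycache_init *ptr = malloc(sizeof(*ptr));

    if (!ptr)
        return -1;
    memcpy(ptr, cpu_slot, sizeof(*ptr));  /* keep any objects already cached */
    cpu_slot = ptr;                       /* the static instance is now unused */
    return 0;
}
```

Copying before repointing matters because the bootstrap array may already hold cached objects; dropping them would leak those objects from the slab's point of view.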

The complete setup_cpu_cache() function:

static int __init_refok setup_cpu_cache(struct kmem_cache *cachep)
{
    // Initialization is already complete: enable the local cache directly
    if (g_cpucache_up == FULL)
        return enable_cpucache(cachep);

    // Reaching this point means we are still in the initialization phase.
    // g_cpucache_up records how far initialization has progressed: PARTIAL_AC means the cache
    // holding struct arraycache_init has been created, and PARTIAL_L3 means the cache holding
    // struct kmem_list3 has been created. Note the order in which these two caches are created.
    // During initialization only the boot CPU's local cache and the slab three lists are configured.
    // If g_cpucache_up is NONE, the cache of size sizeof(struct arraycache_init) does not exist yet;
    // this branch runs while that very cache is being created.
    // The general cache that would hold struct arraycache_init is not usable yet, so the statically
    // allocated global initarray_generic has to serve as the local cache.
    if (g_cpucache_up == NONE) {
        /*
         * Note: the first kmem_cache_create must create the cache
         * that's used by kmalloc(24), otherwise the creation of
         * further caches will BUG().
         */
        cachep->array[smp_processor_id()] = &initarray_generic.cache; // the arraycache_init cache does not exist yet, so use the static one

        /*
         * If the cache that's used by kmalloc(sizeof(kmem_list3)) is
         * the first cache, then we need to set up all its list3s,
         * otherwise the creation of further caches will BUG().
         */
         // The cache holding struct kmem_list3 is created after the cache holding struct arraycache_init,
         // so it cannot exist yet either, and the global initkmem_list3 must be used as well.

         // #define SIZE_AC 1: this associates the arraycache_init cache with initkmem_list3[1] for the first time;
         // the lists are populated on the next call
        set_up_list3s(cachep, SIZE_AC);

        // At this point the cache holding struct arraycache_init has been set up.
        // If struct kmem_list3 and struct arraycache_init fall in the same size class, no separate cache
        // is needed, so g_cpucache_up advances one step further
        if (INDEX_AC == INDEX_L3)
            g_cpucache_up = PARTIAL_L3;  // advance the bootstrap state
        else
            g_cpucache_up = PARTIAL_AC;
    } else {
        // g_cpucache_up is at least PARTIAL_AC here: the general cache holding struct arraycache_init is up, so it can be allocated with kmalloc.
        cachep->array[smp_processor_id()] =
            kmalloc(sizeof(struct arraycache_init), GFP_KERNEL);

        // the cache holding struct kmem_list3 is not ready yet, so the global static three lists are still needed
        if (g_cpucache_up == PARTIAL_AC) {
            set_up_list3s(cachep, SIZE_L3);
            g_cpucache_up = PARTIAL_L3;
        } else {
            // getting here means the caches holding struct kmem_list3 and struct arraycache_init both exist, so no static globals are needed
            int node;
            for_each_online_node(node) {
                // allocate the struct kmem_list3 object with kmalloc
                cachep->nodelists[node] =
                    kmalloc_node(sizeof(struct kmem_list3),
                        GFP_KERNEL, node);
                BUG_ON(!cachep->nodelists[node]);
                // initialize the slab three lists
                kmem_list3_init(cachep->nodelists[node]);
            }
        }
    }
    // FIXME: compute the next reap time
    cachep->nodelists[numa_node_id()]->next_reap =
            jiffies + REAPTIMEOUT_LIST3 +
            ((unsigned long)cachep) % REAPTIMEOUT_LIST3;

    // initialize the fields of the array cache (ac)
    cpu_cache_get(cachep)->avail = 0;
    cpu_cache_get(cachep)->limit = BOOT_CPUCACHE_ENTRIES;
    cpu_cache_get(cachep)->batchcount = 1;
    cpu_cache_get(cachep)->touched = 0;
    cachep->batchcount = 1;
    cachep->limit = BOOT_CPUCACHE_ENTRIES;
    return 0;
}

Note that once g_cpucache_up == FULL, setup_cpu_cache() calls enable_cpucache().
kmem_cache_init() also calls that function, but directly, as follows:

    /* 6) resize the head arrays to their final sizes */
    {
        struct kmem_cache *cachep;
        mutex_lock(&cache_chain_mutex);
        list_for_each_entry(cachep, &cache_chain, next)
            if (enable_cpucache(cachep))
                BUG();
        mutex_unlock(&cache_chain_mutex);
    }
    ...

    /* Done! */
    g_cpucache_up = FULL;
    ...
}

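g_cpucache_up is set to FULL only after that loop finishes, and once it is set kmem_cache_init() makes no further calls to kmem_cache_create(), so this loop is the only place where kmem_cache_init() explicitly invokes enable_cpucache(). The loop uses list_for_each_entry(cachep, &cache_chain, next), so it walks the cache_chain list. What is it doing? It visits every cache and initializes that cache's local cache, shared local cache and three lists. For example, it uses the size of the cached objects to choose the local cache's limit. Until now every limit defaulted to 1; now the real value can be computed, which fixes the size of the local cache array, and a new array of that size is kmalloc'ed to replace the old one. The shared local cache is set up as well, and once it exists some fields of the three lists have to be updated. Special cases, such as a CPU coming back online and needing a fresh local cache to replace its old one, are also handled by the functions below.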

/* Called with cache_chain_mutex held always */
static int enable_cpucache(struct kmem_cache *cachep)
{
    int err;
    int limit, shared;

    /*
     * The head array serves three purposes:
     * - create a LIFO ordering, i.e. return objects that are cache-warm
     * - reduce the number of spinlock operations.
     * - reduce the number of linked list operations on the slab and
     *   bufctl chains: array operations are cheaper.
     * The numbers are guessed, we should auto-tune as described by
     * Bonwick.
     */
     // compute the upper limit on the number of objects in the local cache from the size of the objects this cache hands out
    if (cachep->buffer_size > 131072)
        limit = 1;
    else if (cachep->buffer_size > PAGE_SIZE)
        limit = 8;
    else if (cachep->buffer_size > 1024)
        limit = 24;
    else if (cachep->buffer_size > 256)
        limit = 54;
    else
        limit = 120;

    /*
     * CPU bound tasks (e.g. network routing) can exhibit cpu bound
     * allocation behaviour: Most allocs on one cpu, most free operations
     * on another cpu. For these cases, an efficient object passing between
     * cpus is necessary. This is provided by a shared array. The array
     * replaces Bonwick's magazine layer.
     * On uniprocessor, it's functionally equivalent (but less efficient)
     * to a larger limit. Thus disabled by default.
     */
    // In other words: objects are mostly allocated on one CPU and mostly freed on another; on a uniprocessor the shared array stays disabled
    shared = 0;
    // on multiprocessor systems, set the sizing factor of the shared local cache (its capacity is shared * batchcount, see alloc_kmemlist())
    if (cachep->buffer_size <= PAGE_SIZE && num_possible_cpus() > 1)
        shared = 8;   // set to 8

#if DEBUG
    /*
     * With debugging enabled, large batchcount lead to excessively long
     * periods with disabled local interrupts. Limit the batchcount
     */
    if (limit > 32)
        limit = 32;
#endif
    // configure the local cache
    err = do_tune_cpucache(cachep, limit, (limit + 1) / 2, shared);
    if (err)
        printk(KERN_ERR "enable_cpucache failed for %s, error %d.\n",
               cachep->name, -err);
    return err;
}
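
The sizing rules in enable_cpucache() can be exercised on their own. The sketch below copies the thresholds from the function above into a userspace helper; struct tuning and tune_for() are names made up for this illustration, a 4 KiB page size is assumed, and the DEBUG clamp to 32 is left out.

```c
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096UL   /* assumption: 4 KiB pages */

struct tuning {
    int limit;       /* max objects in the per-CPU local cache */
    int batchcount;  /* objects moved per refill or flush */
    int shared;      /* sizing factor of the shared local cache, 0 = disabled */
};

/* Hypothetical helper mirroring the thresholds of enable_cpucache(). */
static struct tuning tune_for(size_t buffer_size, int possible_cpus)
{
    struct tuning t;

    if (buffer_size > 131072)
        t.limit = 1;
    else if (buffer_size > PAGE_SIZE_SKETCH)
        t.limit = 8;
    else if (buffer_size > 1024)
        t.limit = 24;
    else if (buffer_size > 256)
        t.limit = 54;
    else
        t.limit = 120;

    /* shared arrays only for small objects on multiprocessor systems */
    t.shared = (buffer_size <= PAGE_SIZE_SKETCH && possible_cpus > 1) ? 8 : 0;
    t.batchcount = (t.limit + 1) / 2;   /* as passed to do_tune_cpucache() */
    return t;
}
```

For a 32-byte cache on a 4-CPU machine this gives limit 120, batchcount 60 and shared 8; for an 8192-byte cache it gives limit 8, batchcount 4 and no shared array.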

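enable_cpucache() first uses the size of the objects held by the cache to compute the upper limit on objects in the local cache, and decides on the shared local cache in the same way. It then configures the local cache, the shared local cache and the three lists. First, a data structure: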

struct ccupdate_struct {
    struct kmem_cache *cachep;
    struct array_cache *new[NR_CPUS];
};

Now the actual function:

/* Always called with the cache_chain_mutex held */
// configure the local cache, the shared local cache and the three lists
static int do_tune_cpucache(struct kmem_cache *cachep, int limit,
                int batchcount, int shared)
{
    struct ccupdate_struct *new;
    int i;

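    // allocate a zeroed ccupdate_struct; by the time we get here all general caches exist, so kmalloc can be used
    new = kzalloc(sizeof(*new), GFP_KERNEL);
    if (!new)
        return -ENOMEM;

    // allocate a new array_cache object for every online CPU
    for_each_online_cpu(i) {
        new->new[i] = alloc_arraycache(cpu_to_node(i), limit,
                        batchcount);
        if (!new->new[i]) {  // on failure
            for (i--; i >= 0; i--)  // roll back the allocations made so far
                kfree(new->new[i]);
            kfree(new);
            return -ENOMEM;
        }
    }
    new->cachep = cachep;   // record the cache being updated so that do_ccupdate_local(), run via on_each_cpu(), can replace its local caches

    // Replace the old array_cache objects with the new ones. On systems with CPU hotplug, an offline CPU
    // may not have released its local cache and would still be using the old one; see __kmem_cache_destroy().
    // Reconfiguring local caches when a CPU comes up does not help in that case.
    // Consider this scenario: there are CPU A and CPU B. CPU B goes down and then Cache X is destroyed.
    // Because CPU B is down, CPU B's local cache in Cache X is not released. Some time later CPU B comes
    // back up and the local caches of every cache on cache_chain are updated, but the Cache X object has
    // already been returned to cache_cache, so its CPU B local cache is not updated. Later still, the system
    // creates a new cache and the Cache X object is handed out again, with CPU B still pointing at the stale
    // local cache, which therefore has to be replaced.
    on_each_cpu(do_ccupdate_local, (void *)new, 1, 1);  // run do_ccupdate_local on every CPU to swap in the new arrays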

    check_irq_on();
    cachep->batchcount = batchcount;
    cachep->limit = limit;
    cachep->shared = shared;

    for_each_online_cpu(i) {
        struct array_cache *ccold = new->new[i];
        if (!ccold)
            continue;
        spin_lock_irq(&cachep->nodelists[cpu_to_node(i)]->list_lock);
        // release the objects held in the old local cache
        free_block(cachep, ccold->entry, ccold->avail, cpu_to_node(i));
        spin_unlock_irq(&cachep->nodelists[cpu_to_node(i)]->list_lock);
        // free the old array_cache itself
        kfree(ccold);
    }
    kfree(new);
    // initialize the shared local cache and the three lists
    return alloc_kmemlist(cachep);
}
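
The allocation loop at the top of do_tune_cpucache() follows an all-or-nothing pattern: if any per-CPU allocation fails, everything allocated so far is released. The userspace sketch below isolates that pattern. alloc_hook(), fail_at and alloc_calls form a hypothetical fault-injection hook added only so the failure path can be exercised; nothing like it exists in the kernel code shown here.

```c
#include <stdlib.h>

/* Hypothetical fault injection: the call with this index fails (-1 = never). */
static int fail_at = -1;
static int alloc_calls;

static void *alloc_hook(size_t size)
{
    if (alloc_calls++ == fail_at)
        return NULL;
    return malloc(size);
}

/* Fill slots[0..n-1] or leave all of them NULL. Returns 0 on success, -1 on failure. */
static int alloc_all(void **slots, int n, size_t size)
{
    int i;

    for (i = 0; i < n; i++) {
        slots[i] = alloc_hook(size);
        if (!slots[i]) {
            /* roll back: free everything allocated before the failure */
            for (i--; i >= 0; i--) {
                free(slots[i]);
                slots[i] = NULL;
            }
            return -1;
        }
    }
    return 0;
}
```

The rollback loop starts at i - 1 because slot i is the one that just failed and holds nothing to free, which is the same reason the kernel loop begins with i--.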

do_ccupdate_local() is as follows:

// update the array_cache object of the CPU this runs on
static void do_ccupdate_local(void *info)
{
    struct ccupdate_struct *new = info;  // implicit conversion from void *, fine in C (C++ would require a cast)
    struct array_cache *old;

    check_irq_off();
    // fetch the old local cache
    old = cpu_cache_get(new->cachep);

    // point at the new array_cache object; new holds the local caches allocated earlier
    new->cachep->array[smp_processor_id()] = new->new[smp_processor_id()];
    // hand the old array_cache object back through the same slot
    new->new[smp_processor_id()] = old;
}
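
The swap is easy to miss: after on_each_cpu() returns, the new[] array no longer holds the new arrays but the old ones, which is why do_tune_cpucache() names the entries ccold when it frees them. A minimal userspace model of that exchange follows. The types are stand-ins, NCPUS is an assumed CPU count, the field is called new_ac rather than new, and the CPU number is passed explicitly instead of coming from smp_processor_id().

```c
#include <stddef.h>

#define NCPUS 2   /* assumption: two CPUs in this model */

struct array_cache_model { int id; };

struct cache_model {
    struct array_cache_model *array[NCPUS];
};

struct ccupdate_model {
    struct cache_model *cachep;
    struct array_cache_model *new_ac[NCPUS];
};

/* What each CPU does: install its new array and leave the old one in the update struct. */
static void ccupdate_local_model(struct ccupdate_model *u, int cpu)
{
    struct array_cache_model *old = u->cachep->array[cpu];

    u->cachep->array[cpu] = u->new_ac[cpu];
    u->new_ac[cpu] = old;
}
```

Running the helper once per CPU leaves the cache pointing at the new arrays and the update structure holding every old array, ready to be drained and freed by the caller.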

alloc_kmemlist() is as follows:

/*
 * This initializes kmem_list3 or resizes various caches for all nodes.
 */
 // initialize the shared local cache and the three lists; initialization does not allocate any slabs for the three lists
static int alloc_kmemlist(struct kmem_cache *cachep)
{
    int node;
    struct kmem_list3 *l3;
    struct array_cache *new_shared;
    struct array_cache **new_alien = NULL;

    for_each_online_node(node) {
        // NUMA-related
        if (use_alien_caches) {
            new_alien = alloc_alien_cache(node, cachep->limit);
            if (!new_alien)
                goto fail;
        }

        new_shared = NULL;
        if (cachep->shared) {
            // shared arrays are enabled, so allocate the shared local cache
            new_shared = alloc_arraycache(node,
                cachep->shared*cachep->batchcount,
                    0xbaadf00d);   // a very large batchcount (3131961357), presumably a placeholder value
            if (!new_shared) {
                free_alien_cache(new_alien);
                goto fail;
            }
        }

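        // fetch the existing three lists
        l3 = cachep->nodelists[node];
        if (l3) {  // the node already has a kmem_list3: release its old resources first
            struct array_cache *shared = l3->shared;

            spin_lock_irq(&l3->list_lock);

            if (shared)  // return the objects held in the old shared local cache
                free_block(cachep, shared->entry,
                        shared->avail, node);

            // point at the new shared local cache
            l3->shared = new_shared;
            if (!l3->alien) {
                l3->alien = new_alien;
                new_alien = NULL;
            }
            // compute the upper limit on free objects for this node
            l3->free_limit = (1 + nr_cpus_node(node)) *
                    cachep->batchcount + cachep->num;
            spin_unlock_irq(&l3->list_lock);
            // free the old shared array, and the new alien cache if it was not installed
            kfree(shared);
            free_alien_cache(new_alien);
            continue;
        }
        // no existing three lists for this node: allocate a new kmem_list3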
        l3 = kmalloc_node(sizeof(struct kmem_list3), GFP_KERNEL, node);
        if (!l3) {
            free_alien_cache(new_alien);
            kfree(new_shared);
            goto fail;
        }

        // initialize the three lists
        kmem_list3_init(l3);
        l3->next_reap = jiffies + REAPTIMEOUT_LIST3 +
                ((unsigned long)cachep) % REAPTIMEOUT_LIST3;
        l3->shared = new_shared;
        l3->alien = new_alien;
        l3->free_limit = (1 + nr_cpus_node(node)) *
                    cachep->batchcount + cachep->num;
        cachep->nodelists[node] = l3;
    }
    return 0;

fail:
    if (!cachep->next.next) {
        /* Cache is not active yet. Roll back what we did */
        node--;
        while (node >= 0) {
            if (cachep->nodelists[node]) {
                l3 = cachep->nodelists[node];

                kfree(l3->shared);
                free_alien_cache(l3->alien);
                kfree(l3);
                cachep->nodelists[node] = NULL;
            }
            node--;
        }
    }
    return -ENOMEM;
}
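
As a worked example of the free_limit formula used twice in alloc_kmemlist(), the sketch below evaluates it in userspace. free_limit_sketch() is a hypothetical helper, and its parameters stand for nr_cpus_node(node), cachep->batchcount and cachep->num (objects per slab).

```c
/* free_limit = (1 + CPUs on the node) * batchcount + objects per slab */
static unsigned long free_limit_sketch(unsigned int cpus_on_node,
                                       unsigned int batchcount,
                                       unsigned int objs_per_slab)
{
    return (1UL + cpus_on_node) * batchcount + objs_per_slab;
}
```

With 4 CPUs on the node, a batchcount of 60 and 30 objects per slab, the limit is (1 + 4) * 60 + 30 = 330 free objects for that node.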
