How malloc and free are implemented in glibc (Part 2): the implementation of malloc

RC_diamond_GH

Last modified 2022-04-11 19:48:58


First published 2022-04-02 22:54:53

Article link: https://blog.csdn.net/weixin_44215692/article/details/123930658


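I. Overview

In the glibc source there is no function defined under the name malloc; there are only functions whose names contain "malloc". __libc_malloc is the function that runs when we call malloc (the malloc symbol is an alias for it). __libc_malloc in turn calls _int_malloc, which is the real core. By reading the source of that function we can learn exactly how malloc allocates memory and in what order it processes the bins.

This part analyzes _int_malloc first and then __libc_malloc, with the goal of understanding how malloc works.

_int_malloc is roughly 400 lines long.

II. `_int_malloc`

Source: https://github.com/iromise/glibc/blob/master/malloc/malloc.c#L3147

Next, we walk through the whole of _int_malloc in segments.

Tip: copy the code shown here into VS Code and read the explanation alongside it.

0x00 Variable definitions and initial checks (3147 ~ 3185)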

static void *_int_malloc(mstate av, size_t bytes) {
    INTERNAL_SIZE_T nb;  /* normalized request size */
    unsigned int    idx; /* associated bin index */
    mbinptr         bin; /* associated bin */

    mchunkptr       victim;       /* inspected/selected chunk */
    INTERNAL_SIZE_T size;         /* its size */
    int             victim_index; /* its bin index */

    mchunkptr     remainder;      /* remainder from a split */
    unsigned long remainder_size; /* its size */

    unsigned int block; /* bit map traverser */
    unsigned int bit;   /* bit map traverser */
    unsigned int map;   /* current word of binmap */

    mchunkptr fwd; /* misc temp for linking */
    mchunkptr bck; /* misc temp for linking */

    const char *errstr = NULL;

    /*
       Convert request size to internal form by adding SIZE_SZ bytes
       overhead plus possibly more to obtain necessary alignment and/or
       to obtain a size of at least MINSIZE, the smallest allocatable
       size. Also, checked_request2size traps (returning 0) request sizes
       that are so large that they wrap around zero when padded and
       aligned.
     */

    checked_request2size(bytes, nb);

    /* There are no usable arenas.  Fall back to sysmalloc to get a chunk from
       mmap.  */
    if (__glibc_unlikely(av == NULL)) {
        void *p = sysmalloc(nb, av);
        if (p != NULL) alloc_perturb(p, bytes);
        return p;
    }

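nb is the normalized chunk size that is actually needed; checked_request2size(bytes, nb); converts the requested bytes into an nb that accounts for the size field, alignment and the minimum chunk size.
victim: eventually points to a chunk that satisfies the request. The function then returns the address of that chunk's fd field, which is where the user data begins.

The remaining statements can be skimmed through their English comments. They are not hard to understand, and not all of the variables are necessarily used on a given call.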
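To make the conversion from bytes to nb concrete, here is a minimal sketch of the arithmetic. The constants are assumptions for a typical 64-bit build (SIZE_SZ = 8, 16-byte alignment, MINSIZE = 32), and request2size_sketch is a hypothetical name; the real macro also rejects requests that would overflow, which this sketch only asserts against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 64-bit values; the real ones are derived inside malloc.c. */
#define SIZE_SZ           ((size_t) 8)  /* sizeof(INTERNAL_SIZE_T) */
#define MALLOC_ALIGN_MASK ((size_t) 15) /* MALLOC_ALIGNMENT - 1 */
#define MINSIZE           ((size_t) 32) /* smallest possible chunk */

/* Hypothetical helper mirroring the padding and rounding of request2size. */
static size_t request2size_sketch(size_t req) {
    /* The real checked_request2size fails the request instead. */
    assert(req <= SIZE_MAX - SIZE_SZ - MALLOC_ALIGN_MASK);
    size_t padded = req + SIZE_SZ + MALLOC_ALIGN_MASK;
    if (padded < MINSIZE) return MINSIZE;
    return padded & ~MALLOC_ALIGN_MASK; /* round down after adding the mask */
}
```

Under these assumptions a request of 24 bytes becomes a 32-byte chunk, while 25 bytes already needs 48.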

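0x01 Trying to get a chunk from the `fastbins` (3187 ~ 3214)

The program tries the fastbins first: when nb is within the range that fastbins serve, it attempts to take a chunk from the matching fastbin.

Corresponding source: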

    /*
       If the size qualifies as a fastbin, first check corresponding bin.
       This code is safe to execute even if av is not yet initialized, so we
       can try it without checking, which saves some time on this fast path.
     */

    if ((unsigned long) (nb) <= (unsigned long) (get_max_fast())) {
        idx             = fastbin_index(nb);
        mfastbinptr *fb = &fastbin(av, idx);
        mchunkptr    pp = *fb;
        do {
            victim = pp;
            if (victim == NULL) break;
        } while ((pp = catomic_compare_and_exchange_val_acq(fb, victim->fd,
                                                            victim)) != victim);
        if (victim != 0) {
            if (__builtin_expect(fastbin_index(chunksize(victim)) != idx, 0)) {
                errstr = "malloc(): memory corruption (fast)";
            errout:
                malloc_printerr(check_action, errstr, chunk2mem(victim), av);
                return NULL;
            }
            check_remalloced_chunk(av, victim, nb);
            void *p = chunk2mem(victim);
            alloc_perturb(p, bytes);
            return p;
        }
    }

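Note the catomic_compare_and_exchange_val_acq after while. This macro works as follows.

Take catomic_compare_and_exchange_val_acq(mem, newval, oldval) as an example.

If the value mem points to (*mem) equals oldval, *mem is set to newval. In either case the macro evaluates to the value *mem held before the operation, and the whole step is atomic:
old = *mem;
if (old == oldval) {
    *mem = newval;
}
return old;

The first time execution reaches this macro:

First look at how the variables relate before the macro runs (an arrow means "the address stored in this variable points to"):

fb points to the bin, and the bin points to a concrete chunk (call it chunk1).

victim and pp point to chunk1, and victim->fd points to chunk2.

After the macro runs, what fb points to, i.e. the bin's entry in the fastbinsY array, becomes the address of chunk2. In other words, the bin now points to chunk2 (it used to point to chunk1).

pp already held the same value as victim, and the macro "returns" the value of victim, so in the ideal case the while loop ends the first time the macro runs. The state after the macro has run:

[Figure: fastbin after the macro runs; the bin points to chunk2 and chunk1 is detached]

As you can see, chunk1 has been completely detached from the list.

(This is why fastbins are said to be LIFO lists.)

After the loop, victim can only hold one of two values:

NULL (numeric value 0)
the address of chunk1

If it is the address of chunk1, the code first checks that chunk1's size matches the bin index. Once that passes, the check_remalloced_chunk macro runs (source: https://github.com/iromise/glibc/blob/master/malloc/malloc.c#L1959)

Then the chunk2mem macro mentioned earlier runs and returns the address of chunk1's fd field.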
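The pop described above can be sketched with C11 atomics. This is an illustration only: struct fast_chunk, fastbin_head, fastbin_push and fastbin_pop are hypothetical names, and atomic_compare_exchange_weak stands in for glibc's catomic_compare_and_exchange_val_acq. Like the original loop, it does not protect against a chunk being popped and pushed back by another thread between the read of victim->fd and the exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, stripped-down chunk: only the link used by fastbins. */
struct fast_chunk {
    struct fast_chunk *fd;
};

/* One fastbin head, standing in for an entry of fastbinsY. */
typedef _Atomic(struct fast_chunk *) fastbin_head;

/* Push at the head, the way a freed chunk enters a fastbin. */
static void fastbin_push(fastbin_head *fb, struct fast_chunk *c) {
    struct fast_chunk *old = atomic_load(fb);
    do {
        c->fd = old; /* old is refreshed by a failed exchange */
    } while (!atomic_compare_exchange_weak(fb, &old, c));
}

/* Pop from the head: swing *fb from victim to victim->fd. */
static struct fast_chunk *fastbin_pop(fastbin_head *fb) {
    struct fast_chunk *victim = atomic_load(fb);
    while (victim != NULL &&
           !atomic_compare_exchange_weak(fb, &victim, victim->fd)) {
        /* A failed exchange reloaded victim with the current head; retry. */
    }
    return victim; /* NULL when the bin is empty */
}
```

Pushing two chunks and popping twice returns them in reverse order, which is the LIFO behaviour noted above.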

0x02 Trying to get a chunk from the `smallbins` (3216 ~ 3248)

If no chunk was obtained from the fastbins and the requested size falls in the smallbin range, the program tries the smallbins.

Source:

    /*
       If a small request, check regular bin.  Since these "smallbins"
       hold one size each, no searching within bins is necessary.
       (For a large request, we need to wait until unsorted chunks are
       processed to find best fit. But for small ones, fits are exact
       anyway, so we can check now, which is faster.)
     */

    if (in_smallbin_range(nb)) {
        idx = smallbin_index(nb);
        bin = bin_at(av, idx);

        if ((victim = last(bin)) != bin) {
            if (victim == 0) /* initialization check */
                malloc_consolidate(av);
            else {
                bck = victim->bk;
                if (__glibc_unlikely(bck->fd != victim)) {
                    errstr = "malloc(): smallbin double linked list corrupted";
                    goto errout;
                }
                set_inuse_bit_at_offset(victim, nb);
                bin->bk = bck;
                bck->fd = bin;

                if (av != &main_arena) set_non_main_arena(victim);
                check_malloced_chunk(av, victim, nb);
                void *p = chunk2mem(victim);
                alloc_perturb(p, bytes);
                return p;
            }
        }
    }

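The first check is effectively bin->bk != bin, which tests whether this bin is empty.

The second check (victim == 0) covers an arena that has not been initialized yet, in which case malloc_consolidate is called.

The third check is effectively victim->bk->fd != victim. If you later need to forge a chunk, you have to think about how to get past this check.

In the normal case victim is bin->bk, i.e. the chunk that entered the bin earliest (so smallbins are FIFO lists).

If victim is chunk1, then bck is chunk2. The program makes bin's bk point to chunk2, so chunk2 becomes the next chunk to be handed out, and then makes chunk2's fd point to bin, which closes the list again.

After the flag bits are updated (PREV_INUSE in the next chunk's header, and NON_MAIN_ARENA in victim when applicable), the address of victim's fd field is returned.

Note: malloc_consolidate appears here. One of the techniques in the heap exploitation tutorial how2heap is called fastbin_dup_consolidate, which may be related.

List diagrams:

Before chunk1 is removed:

[Figure: smallbin list before chunk1 is removed]

After chunk1 is removed:

[Figure: smallbin list after chunk1 is removed]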
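The removal from the bin->bk side, including the bck->fd check, can be sketched on a plain circular doubly linked list. struct node and the helper names are hypothetical, and the sketch returns NULL where glibc would report "smallbin double linked list corrupted".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: the bin header and the chunks share fd/bk links. */
struct node {
    struct node *fd;
    struct node *bk;
};

static void bin_init(struct node *bin) { bin->fd = bin->bk = bin; }

/* Insert right after the header (the bin->fd side). */
static void bin_insert_front(struct node *bin, struct node *c) {
    assert(bin->fd->bk == bin); /* the sketch expects an intact list here */
    c->bk = bin;
    c->fd = bin->fd;
    bin->fd->bk = c;
    bin->fd = c;
}

/* Take the oldest chunk (bin->bk). Returns NULL if the bin is empty or
   if the bck->fd != victim check fails. */
static struct node *bin_take_back(struct node *bin) {
    struct node *victim = bin->bk;
    if (victim == bin) return NULL;     /* empty bin */
    struct node *bck = victim->bk;
    if (bck->fd != victim) return NULL; /* corrupted list */
    bin->bk = bck;
    bck->fd = bin;
    return victim;
}
```

Chunks inserted as c1, c2, c3 come back out as c1, c2, c3, which is the FIFO order described above.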

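0x03 Preparation for getting a chunk from the `largebins` (3250 ~ 3264)

Note that the else here pairs with the earlier if (in_smallbin_range(nb)).

In other words, if the request falls in the smallbin range, this branch is skipped even when no chunk could be taken from the smallbins: idx stays a smallbin index, and the exact-bin largebin search in 0x05 will not run for this request.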

    /*
       If this is a large request, consolidate fastbins before continuing.
       While it might look excessive to kill all fastbins before
       even seeing if there is space available, this avoids
       fragmentation problems normally associated with fastbins.
       Also, in practice, programs tend to have runs of either small or
       large requests, but less often mixtures, so consolidation is not
       invoked all that often in most programs. And the programs that
       it is called frequently in otherwise tend to fragment.
     */

    else {
        idx = largebin_index(nb);
        if (have_fastchunks(av)) malloc_consolidate(av);
    }

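Note: malloc_consolidate appears here as well. It runs only when the arena currently holds fastbin chunks (have_fastchunks).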
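The branching in 0x01 through 0x03 depends only on nb. The sketch below shows which branch a normalized size enters first and how the bin index is derived. The constants are assumptions for a 64-bit build with default settings (get_max_fast() returning 128, large sizes starting at 1024), and every name ending in _sketch, as well as first_stop_for, is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 64-bit defaults; get_max_fast() can be changed through mallopt. */
#define MAX_FAST_DEFAULT ((size_t) 128)
#define MIN_LARGE_SIZE   ((size_t) 1024)

enum first_stop { TRY_FASTBIN, TRY_SMALLBIN, PREPARE_LARGEBIN };

/* Which branch of 0x01 to 0x03 a normalized size nb enters first.
   A fastbin-sized request that misses its fastbin goes on to the smallbin. */
static enum first_stop first_stop_for(size_t nb) {
    if (nb <= MAX_FAST_DEFAULT) return TRY_FASTBIN;
    if (nb < MIN_LARGE_SIZE) return TRY_SMALLBIN;
    return PREPARE_LARGEBIN;
}

/* 64-bit form of fastbin_index: 16-byte steps, smallest chunk in bin 0. */
static unsigned fastbin_index_sketch(size_t nb) {
    assert(nb >= 32 && nb <= MAX_FAST_DEFAULT);
    return (unsigned) (nb >> 4) - 2;
}

/* 64-bit form of smallbin_index: 16-byte steps. */
static unsigned smallbin_index_sketch(size_t nb) {
    assert(nb >= 32 && nb < MIN_LARGE_SIZE);
    return (unsigned) (nb >> 4);
}
```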

0x04 Traversing the `unsorted bin`

1. Start traversing the list and check the size (3266 ~ 3289)

    /*
       Process recently freed or remaindered chunks, taking one only if
       it is exact fit, or, if this a small request, the chunk is remainder from
       the most recent non-exact fit.  Place other traversed chunks in
       bins.  Note that this step is the only place in any routine where
       chunks are placed in bins.
       The outer loop here is needed because we might not realize until
       near the end of malloc that we should have consolidated, so must
       do so and retry. This happens at most once, and only when we would
       otherwise need to expand memory to service a "small" request.
     */

    for (;;) {
        int iters = 0;
        // walk from the unsorted head to end to find one chunk
        // First In First Out
        while ((victim = unsorted_chunks(av)->bk) != unsorted_chunks(av)) {
            bck = victim->bk;
            if (__builtin_expect(chunksize_nomask(victim) <= 2 * SIZE_SZ, 0) ||
                __builtin_expect(chunksize_nomask(victim) > av->system_mem, 0))
                malloc_printerr(check_action, "malloc(): memory corruption",
                                chunk2mem(victim), av);
            size = chunksize(victim);

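Note how the unsorted bin is traversed: victim is set to the chunk at unsorted_chunks(av)->bk, and the loop continues for as long as the unsorted bin is not logically empty.

From this traversal order we can tell that the unsorted bin is a FIFO list, and that the loop walks the whole unsorted bin unless it returns early or hits the iteration limit.

Still calling victim chunk1 and bck chunk2: at this point only the chunk size is checked to decide whether to report an error.

Pay attention to size here; many later decisions are based on it.

2. Trying to split a small chunk off last_remainder (3291 ~ 3322)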

            /*
               If a small request, try to use last remainder if it is the
               only chunk in unsorted bin.  This helps promote locality for
               runs of consecutive small requests. This is the only
               exception to best-fit, and applies only when there is
               no exact fit for a small chunk.
             */

            if (in_smallbin_range(nb) && bck == unsorted_chunks(av) &&
                victim == av->last_remainder &&
                (unsigned long) (size) > (unsigned long) (nb + MINSIZE)) {
                /* split and reattach remainder */
                remainder_size          = size - nb;
                remainder               = chunk_at_offset(victim, nb);
                unsorted_chunks(av)->bk = unsorted_chunks(av)->fd = remainder;
                av->last_remainder                                = remainder;
                remainder->bk = remainder->fd = unsorted_chunks(av);
                if (!in_smallbin_range(remainder_size)) {
                    remainder->fd_nextsize = NULL;
                    remainder->bk_nextsize = NULL;
                }

                set_head(victim, nb | PREV_INUSE |
                                     (av != &main_arena ? NON_MAIN_ARENA : 0));
                set_head(remainder, remainder_size | PREV_INUSE);
                set_foot(remainder, remainder_size);

                check_malloced_chunk(av, victim, nb);
                void *p = chunk2mem(victim);
                alloc_perturb(p, bytes);
                return p;
            }

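Four conditions must hold for this operation:

The normalized size nb is in the smallbin range
The unsorted bin contains exactly one chunk
That single chunk is last_remainder (a field defined in malloc_state)
last_remainder's size is large enough to provide nb bytes and still leave more than MINSIZE (the remainder must keep a valid chunk structure)

The program then splits last_remainder: it fills in the header of the new remainder chunk, rewrites the header of the part being returned, updates last_remainder in malloc_state, and so on. This shows exactly "how to construct a chunk in memory", which is a very useful reference for us.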
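The header bookkeeping of such a split can be sketched on a raw buffer. struct hdr is a hypothetical two-word header (prev_size and size only), the bin relinking and the NON_MAIN_ARENA flag are left out, and split_chunk is an illustrative name rather than a glibc function.

```c
#include <assert.h>
#include <stddef.h>

#define PREV_INUSE ((size_t) 0x1)
#define SIZE_BITS  ((size_t) 0x7) /* the three low flag bits of size */

/* Hypothetical chunk header: just the two size words. */
struct hdr {
    size_t prev_size;
    size_t size;
};

static size_t chunk_size(const struct hdr *c) { return c->size & ~SIZE_BITS; }

/* Counterpart of chunk_at_offset: treat c + off bytes as a chunk. */
static struct hdr *chunk_at(struct hdr *c, size_t off) {
    return (struct hdr *) ((char *) c + off);
}

/* Carve nb bytes off the front of a free chunk and return the remainder.
   Mirrors set_head(victim), set_head(remainder) and set_foot(remainder). */
static struct hdr *split_chunk(struct hdr *victim, size_t nb) {
    size_t size = chunk_size(victim);
    /* The remainder must stay a valid chunk (two headers = MINSIZE here). */
    assert(size >= nb + 2 * sizeof(struct hdr));
    size_t remainder_size = size - nb;
    struct hdr *remainder = chunk_at(victim, nb);
    victim->size = nb | PREV_INUSE;
    remainder->size = remainder_size | PREV_INUSE;
    /* The foot lives in the prev_size word of the chunk after remainder. */
    chunk_at(remainder, remainder_size)->prev_size = remainder_size;
    return remainder;
}
```

The remainder's PREV_INUSE bit is set because the chunk in front of it, victim, is about to be handed to the user.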

3. Removing the current chunk and returning it directly (3324 ~ 3337)

            /* remove from unsorted list */
            unsorted_chunks(av)->bk = bck;
            bck->fd                 = unsorted_chunks(av);

            /* Take now instead of binning if exact fit */

            if (size == nb) {
                set_inuse_bit_at_offset(victim, size);
                if (av != &main_arena) set_non_main_arena(victim);
                check_malloced_chunk(av, victim, nb);
                void *p = chunk2mem(victim);
                alloc_perturb(p, bytes);
                return p;
            }

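The operation here is simple and easy to follow, but it is a useful reference for, say, how a free chunk is turned into an allocated chunk.

First the flag bits in the size fields are updated, then the address of the fd field is returned directly.

4. Placing the current chunk into its bin (3339 ~ 3396)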

            /* place chunk in bin */

            if (in_smallbin_range(size)) {
                victim_index = smallbin_index(size);
                bck          = bin_at(av, victim_index);
                fwd          = bck->fd;
            } else {
                victim_index = largebin_index(size);
                bck          = bin_at(av, victim_index);
                fwd          = bck->fd;

                /* maintain large bins in sorted order */
                if (fwd != bck) {
                    /* Or with inuse bit to speed comparisons */
                    size |= PREV_INUSE;
                    /* if smaller than smallest, bypass loop below */
                    assert(chunk_main_arena(bck->bk));
                    if ((unsigned long) (size) <
                        (unsigned long) chunksize_nomask(bck->bk)) {
                        fwd = bck;
                        bck = bck->bk;

                        victim->fd_nextsize = fwd->fd;
                        victim->bk_nextsize = fwd->fd->bk_nextsize;
                        fwd->fd->bk_nextsize =
                            victim->bk_nextsize->fd_nextsize = victim;
                    } else {
                        assert(chunk_main_arena(fwd));
                        while ((unsigned long) size < chunksize_nomask(fwd)) {
                            fwd = fwd->fd_nextsize;
                            assert(chunk_main_arena(fwd));
                        }

                        if ((unsigned long) size ==
                            (unsigned long) chunksize_nomask(fwd))
                            /* Always insert in the second position.  */
                            fwd = fwd->fd;
                        else {
                            victim->fd_nextsize              = fwd;
                            victim->bk_nextsize              = fwd->bk_nextsize;
                            fwd->bk_nextsize                 = victim;
                            victim->bk_nextsize->fd_nextsize = victim;
                        }
                        bck = fwd->bk;
                    }
                } else
                    victim->fd_nextsize = victim->bk_nextsize = victim;
            }

            mark_bin(av, victim_index);
            victim->bk = bck;
            victim->fd = fwd;
            fwd->bk    = victim;
            bck->fd    = victim;

#define MAX_ITERS 10000
            if (++iters >= MAX_ITERS) break;
        }

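Because every small bin holds chunks of one fixed size, putting a chunk into a small bin is simple. The process looks like this:

[Figure: inserting victim into a small bin between the bin header and chunk1]

Here chunk1 is the chunk that bin->fd pointed to before victim was inserted.

So chunks are inserted into a small bin on the bin->fd side and taken out on the bin->bk side, which makes smallbins plain FIFO lists.

The main thing to analyze is how a chunk whose size is in the large bin range is placed into a large bin. Keep in mind that the chunks stored in one large bin do not necessarily have the same size; they only fall into the same size interval.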

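A closer look at placing a chunk into the `large bins`:

(Corresponding code: the first else block in the source shown above)

if (fwd != bck) tests whether the bin is empty: fwd == bck means it is. The empty case is explained later; the non-empty case comes first. Just before if (fwd != bck) runs, the data layout of a non-empty bin is:

[Figure: large bin layout before if (fwd != bck), non-empty bin]

The program first compares the size of bin->bk with the size of victim. From the comment we learn that bin->bk is expected to be the smallest chunk in the bin.

a. victim's size is smaller than the smallest chunk in the bin

If this if block is taken, the layout after it finishes is:

[Figure: layout right after the if block]

It eventually becomes:

[Figure: final fd-bk layout with victim linked in at the bin->bk end]

And in the nextsize view:

[Figure: nextsize list after victim is inserted]

Keep in mind that of the three chunks in the figures, victim has the smallest size.

Also, prev_biggest and prev_smallest are handled symmetrically and their links to the victim fields never conflict, so the code still works when prev_biggest and prev_smallest are the same chunk.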

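b. victim's size is not smaller than the smallest chunk in the bin

When the while loop ends, fwd points to the first chunk in the bin whose size is ≤ victim's size (call it sub_chunk1).

Note that the list is walked with fd_nextsize rather than fd. This tells us that within a large bin, sizes decrease as the list is traversed through fd_nextsize.

When victim's size equals sub_chunk1's size, it is handled like this:

[Figure: inserting victim when it has the same size as sub_chunk1]

As shown, when victim and sub_chunk1 have the same size, victim is linked into the bin right after sub_chunk1 in the fd direction, and victim's two nextsize fields are left untouched.

When victim's size differs from sub_chunk1's, the two nextsize fields are processed:

Initial state:

[Figure: nextsize links between sub_chunk1 and sub_chunk2 before the else block]

After the else block:

[Figure: nextsize links after the else block, with victim linked in]

Here the direct nextsize links between sub_chunk1 and sub_chunk2 are broken and victim is linked in between them.

A large bin contains two separate lists that are not necessarily identical: one linked through fd and bk, the other through fd_nextsize and bk_nextsize.

Finally, victim is inserted between sub_chunk1 and bk_chunk (the chunk at sub_chunk1->bk) in the fd-bk list.

c. The bin is empty

[Figure: a previously empty large bin after victim is inserted; victim's fd_nextsize and bk_nextsize point to victim itself]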

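Summary: properties of a large bin

A large bin indexes its chunks with two doubly linked lists, one through the fd and bk fields and one through the fd_nextsize and bk_nextsize fields. For convenience, the former is called the fd_bk list and the latter the nextsize list.

The fd_bk list always reaches every chunk in the bin, while the nextsize list does not necessarily link every chunk.

For a chunk chunk1 in a large bin, the sizes of chunk1->fd, chunk1->bk, chunk1->fd_nextsize and chunk1->bk_nextsize relate as follows (ignoring the wrap-around at the two ends of the lists):
$(\mathrm{chunk1\to fd}).\mathrm{size}\le\mathrm{chunk1}.\mathrm{size}\le(\mathrm{chunk1\to bk}).\mathrm{size}$
$(\mathrm{chunk1\to fd\_nextsize}).\mathrm{size}<\mathrm{chunk1}.\mathrm{size}<(\mathrm{chunk1\to bk\_nextsize}).\mathrm{size}$
The chunk that the bin's fd field points to is expected to be the largest chunk in the bin, and the chunk its bk field points to is expected to be the smallest.

When all chunks in the bin have different sizes, the fd_bk list and the nextsize list are essentially the same. When some chunks share a size, the nextsize list links fewer chunks than the fd_bk list.

Chunks of equal size are expected to be adjacent in the fd_bk list. Walking such a group through the bk field, the last chunk reached is the one that belongs to the nextsize list.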
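The three insertion cases (a, b and c) can be reproduced on a simplified node type to check the model above. This is a sketch, not the glibc code: struct lchunk, lbin_init and lbin_insert are hypothetical, the flag bits and arena assertions are dropped, and, unlike the original, the sketch clears victim's nextsize fields up front so that a chunk outside the nextsize list is recognizable by NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical large-bin node; the bin header uses only fd/bk. */
struct lchunk {
    size_t size;
    struct lchunk *fd, *bk;
    struct lchunk *fd_nextsize, *bk_nextsize;
};

static void lbin_init(struct lchunk *bin) {
    bin->size = 0;
    bin->fd = bin->bk = bin;
    bin->fd_nextsize = bin->bk_nextsize = NULL;
}

/* Insert victim, following cases a, b and c of the text. */
static void lbin_insert(struct lchunk *bin, struct lchunk *victim) {
    struct lchunk *bck = bin, *fwd = bin->fd;
    victim->fd_nextsize = victim->bk_nextsize = NULL; /* sketch only */
    if (fwd == bck) {                                 /* c. empty bin */
        victim->fd_nextsize = victim->bk_nextsize = victim;
    } else if (victim->size < bin->bk->size) {        /* a. new smallest */
        fwd = bin;
        bck = bin->bk;
        victim->fd_nextsize = bin->fd;                /* largest chunk */
        victim->bk_nextsize = bin->fd->bk_nextsize;   /* old smallest size */
        bin->fd->bk_nextsize = victim;
        victim->bk_nextsize->fd_nextsize = victim;
    } else {                                          /* b. walk by size */
        while (victim->size < fwd->size) fwd = fwd->fd_nextsize;
        if (victim->size == fwd->size) {
            fwd = fwd->fd;                            /* second position */
        } else {
            victim->fd_nextsize = fwd;
            victim->bk_nextsize = fwd->bk_nextsize;
            fwd->bk_nextsize = victim;
            victim->bk_nextsize->fd_nextsize = victim;
        }
        bck = fwd->bk;
    }
    victim->bk = bck;                                 /* common fd-bk linking */
    victim->fd = fwd;
    fwd->bk = victim;
    bck->fd = victim;
}
```

Inserting sizes 0x500, 0x480, 0x500, 0x400 and 0x4c0 in that order leaves the fd_bk list at 0x500, 0x500, 0x4c0, 0x480, 0x400, while the nextsize list contains only the first 0x500 chunk and the three smaller ones.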

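The if statement on the second-to-last line of the code in this step (++iters >= MAX_ITERS) limits the number of iterations of the while loop that traverses the unsorted bin
(note: this is not the outermost infinite for (;;) loop but the while ((victim = unsorted_chunks(av)->bk) != unsorted_chunks(av)) loop). This is where the traversal of the unsorted bin ends.

0x05 Trying to get a chunk from the `largebins` (3398 ~ 3459)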

        /*
           If a large request, scan through the chunks of current bin in
           sorted order to find smallest that fits.  Use the skip list for this.
         */

        if (!in_smallbin_range(nb)) {
            bin = bin_at(av, idx);

            /* skip scan if empty or largest chunk is too small */
            if ((victim = first(bin)) != bin &&
                (unsigned long) chunksize_nomask(victim) >=
                    (unsigned long) (nb)) {
                victim = victim->bk_nextsize;
                while (((unsigned long) (size = chunksize(victim)) <
                        (unsigned long) (nb)))
                    victim = victim->bk_nextsize;

                /* Avoid removing the first entry for a size so that the skip
                   list does not have to be rerouted.  */
                if (victim != last(bin) &&
                    chunksize_nomask(victim) == chunksize_nomask(victim->fd))
                    victim = victim->fd;

                remainder_size = size - nb;
                unlink(av, victim, bck, fwd);

                /* Exhaust */
                if (remainder_size < MINSIZE) {
                    set_inuse_bit_at_offset(victim, size);
                    if (av != &main_arena) set_non_main_arena(victim);
                }
                /* Split */
                else {
                    remainder = chunk_at_offset(victim, nb);
                    /* We cannot assume the unsorted list is empty and therefore
                       have to perform a complete insert here.  */
                    bck = unsorted_chunks(av);
                    fwd = bck->fd;
                    if (__glibc_unlikely(fwd->bk != bck)) {
                        errstr = "malloc(): corrupted unsorted chunks";
                        goto errout;
                    }
                    remainder->bk = bck;
                    remainder->fd = fwd;
                    bck->fd       = remainder;
                    fwd->bk       = remainder;
                    if (!in_smallbin_range(remainder_size)) {
                        remainder->fd_nextsize = NULL;
                        remainder->bk_nextsize = NULL;
                    }
                    set_head(victim,
                             nb | PREV_INUSE |
                                 (av != &main_arena ? NON_MAIN_ARENA : 0));
                    set_head(remainder, remainder_size | PREV_INUSE);
                    set_foot(remainder, remainder_size);
                }
                check_malloced_chunk(av, victim, nb);
                void *p = chunk2mem(victim);
                alloc_perturb(p, bytes);
                return p;
            }
        }

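First the code checks whether the normalized size nb is in the large bin range; if so, it tries to get a chunk from the large bin.
The if then checks two conditions:
- the bin is not empty
- the largest chunk in the bin has size ≥ nb
If either condition fails, the program gives up on taking a chunk from this large bin.

Next, starting from the smallest chunk on the nextsize list, the program walks the list through the bk_nextsize field until it finds the first chunk with size ≥ nb. The address of that chunk is stored in victim.

The program then checks whether victim->fd, its neighbour in the fd-bk list, has the same size. If so, victim is switched to that neighbour. (Given the large bin model described earlier, the intent is to avoid having to move the nextsize links.)

Since the victim found may be considerably larger than nb, splitting it is worth considering; that is where remainder_size = size - nb; comes from.

The program then performs an unlink. For details see the unlink entry in section IV, "Helper functions".

After unlink, victim belongs to neither the fd-bk list nor the nextsize list, so it can now be manipulated safely.

Next comes the split decision. If remainder_size is smaller than MINSIZE there is no point in splitting; otherwise the chunk is split, which involves the following:

Locating victim_remainder
Adding victim_remainder to the unsorted bin
Building the chunk header information for victim_remainder
Rewriting victim's size so that it matches nb

Finally, whether or not victim was split, the address of victim's fd field is returned: a chunk has been obtained from the large bin.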
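The search order and the split decision can be illustrated without building the full list. In this sketch the nextsize list is flattened into an array of distinct sizes in descending order (the order seen when following fd_nextsize from the first chunk), so stepping the index down corresponds to following bk_nextsize. best_fit_index and should_split are hypothetical names, and MINSIZE = 32 is an assumed 64-bit value.

```c
#include <assert.h>
#include <stddef.h>

#define MINSIZE ((size_t) 32) /* assumed 64-bit value */

/* sizes[] holds the distinct sizes of one large bin in descending order.
   Returns the index of the best fit for nb, or -1 if the bin is empty or
   its largest chunk is too small. */
static int best_fit_index(const size_t *sizes, int n, size_t nb) {
    if (n == 0 || sizes[0] < nb) return -1; /* the "skip scan" test */
    int i = n - 1;                          /* smallest size first */
    while (sizes[i] < nb) i--;              /* one bk_nextsize step */
    return i;
}

/* 1 if the chosen chunk is split, 0 if it is handed out whole. */
static int should_split(size_t size, size_t nb) {
    assert(size >= nb);
    return size - nb >= MINSIZE;
}
```

The loop cannot run past index 0 because the largest size was already checked against nb.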

0x06 Trying to get a chunk from a larger bin (3461 ~ 3559)

Reaching this point means the chunk could not be taken from the bin that matches the request, so we have to search the small bins and large bins above the current one.

        /*
           Search for a chunk by scanning bins, starting with next largest
           bin. This search is strictly by best-fit; i.e., the smallest
           (with ties going to approximately the least recently used) chunk
           that fits is selected.
           The bitmap avoids needing to check that most blocks are nonempty.
           The particular case of skipping all bins during warm-up phases
           when no chunks have been returned yet is faster than it might look.
         */

        ++idx;
        bin   = bin_at(av, idx);
        block = idx2block(idx);
        map   = av->binmap[ block ];
        bit   = idx2bit(idx);

        for (;;) {
            /* Skip rest of block if there are no more set bits in this block.
             */
            if (bit > map || bit == 0) {
                do {
                    if (++block >= BINMAPSIZE) /* out of bins */
                        goto use_top;
                } while ((map = av->binmap[ block ]) == 0);

                bin = bin_at(av, (block << BINMAPSHIFT));
                bit = 1;
            }

            /* Advance to bin with set bit. There must be one. */
            while ((bit & map) == 0) {
                bin = next_bin(bin);
                bit <<= 1;
                assert(bit != 0);
            }

            /* Inspect the bin. It is likely to be non-empty */
            victim = last(bin);

            /*  If a false alarm (empty bin), clear the bit. */
            if (victim == bin) {
                av->binmap[ block ] = map &= ~bit; /* Write through */
                bin                 = next_bin(bin);
                bit <<= 1;
            }

            else {
                size = chunksize(victim);

                /*  We know the first chunk in this bin is big enough to use. */
                assert((unsigned long) (size) >= (unsigned long) (nb));

                remainder_size = size - nb;

                /* unlink */
                unlink(av, victim, bck, fwd);

                /* Exhaust */
                if (remainder_size < MINSIZE) {
                    set_inuse_bit_at_offset(victim, size);
                    if (av != &main_arena) set_non_main_arena(victim);
                }

                /* Split */
                else {
                    remainder = chunk_at_offset(victim, nb);

                    /* We cannot assume the unsorted list is empty and therefore
                       have to perform a complete insert here.  */
                    bck = unsorted_chunks(av);
                    fwd = bck->fd;
                    if (__glibc_unlikely(fwd->bk != bck)) {
                        errstr = "malloc(): corrupted unsorted chunks 2";
                        goto errout;
                    }
                    remainder->bk = bck;
                    remainder->fd = fwd;
                    bck->fd       = remainder;
                    fwd->bk       = remainder;

                    /* advertise as last remainder */
                    if (in_smallbin_range(nb)) av->last_remainder = remainder;
                    if (!in_smallbin_range(remainder_size)) {
                        remainder->fd_nextsize = NULL;
                        remainder->bk_nextsize = NULL;
                    }
                    set_head(victim,
                             nb | PREV_INUSE |
                                 (av != &main_arena ? NON_MAIN_ARENA : 0));
                    set_head(remainder, remainder_size | PREV_INUSE);
                    set_foot(remainder, remainder_size);
                }
                check_malloced_chunk(av, victim, nb);
                void *p = chunk2mem(victim);
                alloc_perturb(p, bytes);
                return p;
            }
        }

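Note the nesting level of ++idx. idx here is the index of the bin from which the program failed to take a chunk; it may belong to the smallbins or to the largebins.

Macro definitions relevant to the statements after ++idx; (lines 1519 ~ 1524 of malloc.c):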

#define BINMAPSHIFT 5
#define BITSPERMAP (1U << BINMAPSHIFT) //BITSPERMAP = 32
#define BINMAPSIZE (NBINS / BITSPERMAP)//BINMAPSIZE = 4

#define idx2block(i) ((i) >> BINMAPSHIFT)
#define idx2bit(i) ((1U << ((i) & ((1U << BINMAPSHIFT) - 1))))

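binmap appeared in malloc_state in the previous part, but its purpose was not explained there. This code is where binmap is used, so analyzing it also explains what binmap is for.

First recall how binmap is declared in malloc_state: unsigned int binmap[ BINMAPSIZE ];

The bins array represents 1 unsorted bin, 62 smallbins and 63 largebins, at bin indices 1 through 126. The binmap array has only 4 elements and an unsigned int has 32 bits, so binmap has 128 bits in total. Each bit of binmap corresponds to one bin index: we can treat binmap as a boolean array that marks whether the corresponding bin may hold chunks. A set bit can be stale, which is why the code still handles the "false alarm" of an empty bin.

Logically, the first if asks whether map still has a set bit at or above the position of bit. If not, i.e. bit > map holds (bit == 0 covers the case where bit has been shifted out of the word), none of the bins of the current block from idx upward is marked as holding a chunk.

The program then looks for a non-zero map in the higher blocks. If there is none, it jumps to use_top and cuts the chunk from the top chunk. If there is one, it continues from the lowest-numbered bin of that block.

Then the program searches that map for the first bin that is marked as holding chunks.

In short, the last chunk of the bin that is found is unlinked. If the remainder would be at least MINSIZE, the chunk is split: the front part is returned to the user and the remainder goes into the unsorted bin. Otherwise the whole chunk is returned.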
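The bitmap lookup can be isolated into a small function. The macros are copied from the definitions quoted above (with NBINS taken as 128); mark_bin_sketch and next_marked_bin are hypothetical helpers. The sketch covers only the bit scanning: the real loop also clears stale bits and repeats when the marked bin turns out to be empty.

```c
#include <assert.h>

#define NBINS       128
#define BINMAPSHIFT 5
#define BITSPERMAP  (1U << BINMAPSHIFT)
#define BINMAPSIZE  (NBINS / BITSPERMAP)

#define idx2block(i) ((i) >> BINMAPSHIFT)
#define idx2bit(i)   (1U << ((i) & (BITSPERMAP - 1)))

/* Hypothetical stand-in for mark_bin. */
static void mark_bin_sketch(unsigned binmap[BINMAPSIZE], unsigned idx) {
    assert(idx < NBINS);
    binmap[idx2block(idx)] |= idx2bit(idx);
}

/* Index of the first marked bin at or above idx, or -1 when the scan
   runs out of bins (the goto use_top case). */
static int next_marked_bin(const unsigned binmap[BINMAPSIZE], unsigned idx) {
    assert(idx < NBINS);
    unsigned block = idx2block(idx);
    unsigned map = binmap[block];
    unsigned bit = idx2bit(idx);
    if (bit > map || bit == 0) { /* no set bit left in this block */
        do {
            if (++block >= BINMAPSIZE) return -1;
        } while ((map = binmap[block]) == 0);
        idx = block << BINMAPSHIFT;
        bit = 1;
    }
    while ((bit & map) == 0) {   /* advance to the set bit */
        bit <<= 1;
        idx++;
    }
    return (int) idx;
}
```

The second loop always terminates: bit <= map guarantees that map has a set bit at or above the position of bit.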

0x07 Trying to cut a chunk from the top chunk (3561 ~ 3614)

    use_top:
        /*
           If large enough, split off the chunk bordering the end of memory
           (held in av->top). Note that this is in accord with the best-fit
           search rule.  In effect, av->top is treated as larger (and thus
           less well fitting) than any other available chunk since it can
           be extended to be as large as necessary (up to system
           limitations).
           We require that av->top always exists (i.e., has size >=
           MINSIZE) after initialization, so if it would otherwise be
           exhausted by current request, it is replenished. (The main
           reason for ensuring it exists is that we may need MINSIZE space
           to put in fenceposts in sysmalloc.)
         */

        victim = av->top;
        size   = chunksize(victim);

        if ((unsigned long) (size) >= (unsigned long) (nb + MINSIZE)) {
            remainder_size = size - nb;
            remainder      = chunk_at_offset(victim, nb);
            av->top        = remainder;
            set_head(victim, nb | PREV_INUSE |
                                 (av != &main_arena ? NON_MAIN_ARENA : 0));
            set_head(remainder, remainder_size | PREV_INUSE);

            check_malloced_chunk(av, victim, nb);
            void *p = chunk2mem(victim);
            alloc_perturb(p, bytes);
            return p;
        }

        /* When we are using atomic ops to free fast chunks we can get
           here for all block sizes.  */
        else if (have_fastchunks(av)) {
            malloc_consolidate(av);
            /* restore original bin index */
            if (in_smallbin_range(nb))
                idx = smallbin_index(nb);
            else
                idx = largebin_index(nb);
        }

        /*
           Otherwise, relay to handle system-dependent cases
         */
        else {
            void *p = sysmalloc(nb, av);
            if (p != NULL) alloc_perturb(p, bytes);
            return p;
        }
    }
}

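This is the last resort inside the arena: cut the chunk directly from the top chunk. The procedure is similar to the earlier splitting of a large bin chunk:

Check the size
Update the address of the top chunk in malloc_state
Build the header data for the top chunk that remains

If the top chunk is too small but the arena still holds fastbin chunks, malloc_consolidate is called and the outer for (;;) loop runs once more. Finally, if none of these paths produced a chunk, sysmalloc is called to request new memory from the system.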
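The three-way decision at use_top can be written down as a small pure function. use_top_action and the enum are hypothetical names, and MINSIZE = 32 is an assumed 64-bit value; the point of the sketch is only the order in which the three cases are tested.

```c
#include <assert.h>
#include <stddef.h>

#define MINSIZE ((size_t) 32) /* assumed 64-bit value */

enum top_action { SPLIT_TOP, CONSOLIDATE_AND_RETRY, CALL_SYSMALLOC };

/* Hypothetical decision helper for the use_top label. */
static enum top_action use_top_action(size_t top_size, size_t nb,
                                      int have_fastchunks) {
    assert(nb >= MINSIZE); /* nb is already a normalized chunk size */
    /* The top chunk must keep at least MINSIZE after the split. */
    if (top_size >= nb + MINSIZE) return SPLIT_TOP;
    if (have_fastchunks) return CONSOLIDATE_AND_RETRY;
    return CALL_SYSMALLOC;
}
```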

Exercise:

Under what circumstances does the program go to the system to allocate memory for the user?

III. `__libc_malloc`

Source: https://github.com/iromise/glibc/blob/master/malloc/malloc.c#L2770

void *__libc_malloc(size_t bytes) {
    mstate ar_ptr;
    void * victim;

    void *(*hook)(size_t, const void *) = atomic_forced_read(__malloc_hook);
    if (__builtin_expect(hook != NULL, 0))
        return (*hook)(bytes, RETURN_ADDRESS(0));

    arena_get(ar_ptr, bytes);

    victim = _int_malloc(ar_ptr, bytes);
    /* Retry with another arena only if we were able to find a usable arena
       before.  */
    if (!victim && ar_ptr != NULL) {
        LIBC_PROBE(memory_malloc_retry, 1, bytes);
        ar_ptr = arena_get_retry(ar_ptr, bytes);
        victim = _int_malloc(ar_ptr, bytes);
    }

    if (ar_ptr != NULL) __libc_lock_unlock(ar_ptr->mutex);

    assert(!victim || chunk_is_mmapped(mem2chunk(victim)) ||
           ar_ptr == arena_for_chunk(mem2chunk(victim)));
    return victim;
}

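As you can see, __libc_malloc is essentially a wrapper around _int_malloc. It is a wrapper, but it also contains some important code.

For this part I recommend reading CTF-wiki:

https://ctf-wiki.org/pwn/linux/user-mode/heap/ptmalloc2/implementation/malloc/#__libc_malloc

IV. Helper functions

0x00 unlink

https://github.com/iromise/glibc/blob/master/malloc/malloc.c#L1346

This part is best read after understanding the large bin model summarized in [II -> 0x04 -> 4.], which makes it easier to follow.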

/* Take a chunk off a bin list */
#define unlink(AV, P, BK, FD)                                                  \
    {                                                                          \
        FD = P->fd;                                                            \
        BK = P->bk;                                                            \
        if (__builtin_expect(FD->bk != P || BK->fd != P, 0))                   \
            malloc_printerr(check_action, "corrupted double-linked list", P,   \
                            AV);                                               \
        else {                                                                 \
            FD->bk = BK;                                                       \
            BK->fd = FD;                                                       \
            if (!in_smallbin_range(chunksize_nomask(P)) &&                     \
                __builtin_expect(P->fd_nextsize != NULL, 0)) {                 \
                if (__builtin_expect(P->fd_nextsize->bk_nextsize != P, 0) ||   \
                    __builtin_expect(P->bk_nextsize->fd_nextsize != P, 0))     \
                    malloc_printerr(                                           \
                        check_action,                                          \
                        "corrupted double-linked list (not small)", P, AV);    \
                if (FD->fd_nextsize == NULL) {                                 \
                    if (P->fd_nextsize == P)                                   \
                        FD->fd_nextsize = FD->bk_nextsize = FD;                \
                    else {                                                     \
                        FD->fd_nextsize             = P->fd_nextsize;          \
                        FD->bk_nextsize             = P->bk_nextsize;          \
                        P->fd_nextsize->bk_nextsize = FD;                      \
                        P->bk_nextsize->fd_nextsize = FD;                      \
                    }                                                          \
                } else {                                                       \
                    P->fd_nextsize->bk_nextsize = P->bk_nextsize;              \
                    P->bk_nextsize->fd_nextsize = P->fd_nextsize;              \
                }                                                              \
            }                                                                  \
        }                                                                      \
    }

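Note the check here: it verifies that P->fd->bk is P and that P->bk->fd is P.

The unsafe unlink experiment in the heap exploitation tutorial how2heap also describes how to bypass this check in unlink.

Once this check is passed, just before the else block runs, the data structure looks like this:

[Figure: fd-bk list around P before the else block]

Before the second check, it looks like this:

[Figure: fd-bk list after FD->bk = BK and BK->fd = FD]

The second check covers:

Whether P's size is in the large bin range

It also checks whether P's fd_nextsize field is NULL. If it is, the macro is done; if not, P is unlinked a second time, this time from the nextsize list it is on.

That leads to the third check, which validates the nextsize links in the same way the earlier check validated the fd-bk links.

Then the remaining code:

if (FD->fd_nextsize == NULL) tests whether P->fd is on the nextsize list. If it is (the field is not NULL), the outer else runs and simply removes P from the nextsize list.
if (P->fd_nextsize == P) tests whether P is the only chunk on the current nextsize list. If so, P->fd becomes the only chunk on the nextsize list.
The remaining inner else makes P->fd, which was not a member of the nextsize list, take P's place on it.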
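The fd-bk half of the macro, including its first check, fits in a few lines. This sketch uses a hypothetical struct node with only fd and bk and returns -1 where glibc reports "corrupted double-linked list"; the nextsize half follows the same pattern and is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node with only the two list links. */
struct node {
    struct node *fd, *bk;
};

/* Returns 0 on success, -1 if the FD->bk / BK->fd check fails. */
static int unlink_sketch(struct node *p) {
    assert(p != NULL);
    struct node *fd = p->fd;
    struct node *bk = p->bk;
    if (fd->bk != p || bk->fd != p) return -1; /* links do not point back */
    fd->bk = bk;
    bk->fd = fd;
    return 0;
}
```

The check only compares what p's two neighbours point back to, so it passes whenever p->fd->bk and p->bk->fd both hold p's address.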