Reading the MySQL Source: Memory Management in the Underlying Database Engine

I. Memory Management

The previous article discussed the memory management mechanism of the MySQL server layer; this time we discuss memory management in the InnoDB layer, that is, how memory is applied inside the storage engine. Even from first principles you can guess what a database engine needs memory for: it sits between two pieces of hardware, memory and disk, so the questions are how to buffer between them, how to configure that buffering, and how to manage the memory behind it. On top of that come memory pools and the concrete allocation algorithms. Together, memory pools, buffering, and the concrete allocation and management form a general-purpose pattern for handling memory; in other words, almost every interaction between memory and disk boils down to these pieces.

II. Allocation and Management of Memory in the Underlying Engine

1. Memory management
In the InnoDB engine, memory is divided into the redo log buffer, the additional memory pool (innodb_additional_mem_pool_size), and the buffer pool (innodb_buffer_pool).
The buffer pool in turn mainly holds data pages, index pages, the insert buffer, the adaptive hash index, the data dictionary cache, and so on.
Memory management here naturally recalls how an OS handles memory, namely in pages (blocks), and the algorithm that manages the page list is of the LRU family (the LFU analyzed in the Redis series is another of the kind). In an LRU list the frequently used pages sit near the head and the rest toward the tail, so releasing memory simply takes pages from the tail. InnoDB adds a sentinel position in the middle of the list, the midpoint. Why have a midpoint? The reference manual explains: to keep a full table scan from abruptly churning the list with freshly read pages that are usually not hot data, which would actually slow queries down. Newly read pages are therefore inserted at the midpoint, and only after surviving there beyond a certain time are they promoted toward the head of the LRU list. In MySQL you can tune this time threshold (set global innodb_old_blocks_time) and the share of the list reserved for less-active pages (set global innodb_old_blocks_pct) to reduce the chance that hot data pages are evicted.
Programmers who have written a memory pool or cache know there are usually two lists, a used list and a free list. In MySQL the used list is further split into two sublists, the NEW list (frequently used pages) and the OLD list (pages awaiting release). At startup the whole used list is empty; pages are taken from the free list and moved onto it as needed. Along the way pages are not only newly created but also modified, producing the so-called dirty pages, which must then be flushed to disk (so a dirty page sits on both the LRU list and the Flush list at the same time).
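To make the midpoint insertion concrete, here is a minimal C++ sketch. It is an illustration only, not InnoDB's buf0lru.cc, and all names and constants are invented; real InnoDB also inserts new pages about 3/8 of the way from the tail per innodb_old_blocks_pct, while here they go straight to the tail for brevity:

// Toy midpoint LRU: new pages land in the "old" region and are promoted to
// the hot head only if touched again after old_blocks_time has elapsed,
// mimicking innodb_old_blocks_time.
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

struct Page {
  uint64_t id;
  Clock::time_point first_access;
  bool old;  // still sitting in the old sublist?
};

class MidpointLRU {
 public:
  MidpointLRU(size_t capacity, std::chrono::milliseconds old_blocks_time)
      : cap_(capacity), old_time_(old_blocks_time) {}

  void access(uint64_t id) {
    auto it = index_.find(id);
    if (it == index_.end()) {  // miss: insert on the old (tail) side
      if (lru_.size() == cap_) evict();
      lru_.push_back({id, Clock::now(), true});
      index_[id] = std::prev(lru_.end());
      return;
    }
    Page p = *it->second;
    // Promote only pages that survived old_blocks_time in the old region.
    if (p.old && Clock::now() - p.first_access >= old_time_) p.old = false;
    if (!p.old) {  // hot page: move to the head of the new sublist
      lru_.erase(it->second);
      lru_.push_front(p);
      index_[id] = lru_.begin();
    }
  }

 private:
  void evict() {  // victims always come from the old (tail) end
    index_.erase(lru_.back().id);
    lru_.pop_back();
  }

  size_t cap_;
  std::chrono::milliseconds old_time_;
  std::list<Page> lru_;  // front = hottest, back = coldest
  std::unordered_map<uint64_t, std::list<Page>::iterator> index_;
};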
When an insert arrives, the insert buffer comes into play; note that despite its name it is not a plain in-memory buffer but is made of data pages. Inserting into the clustered index is usually very fast, because it normally does not need to read other pages. Inserting into a non-clustered index is a different story: a secondary index requires an extra lookup through its own pages, which costs efficiency. So a cache of index pages is kept, and the Insert Buffer first probes that cache; on a hit the insert becomes much cheaper. Of course this cache must be refreshed continually to keep its contents valid and its hit rate up, and because of the modifications it must also be merged back to the relevant files on disk.
InnoDB has two preconditions for using the Insert Buffer: the index is a secondary (non-clustered) index, and the index is not unique. Keep both firmly in mind.
The adaptive hash index mentioned above also deserves a word. It is not itself a database index; it is built over the index pages of database indexes and, as the name suggests, uses a hash algorithm. Creating it requires triggering conditions:
lookups through the primary key or a composite index repeat a certain number of times (at least 100); the number of times a page is accessed through this pattern reaches a threshold (records in the page / 16); and it only helps equality lookups, never range queries.
In other words, in your own development and testing the odds of this index ever being built are pretty slim.

2. Memory allocation
However memory is used, in the end everything comes down to allocation and release; put most plainly, to wrappers around malloc and free, since those two functions come from the OS or the C library and are not negotiable. Worth emphasizing here: memory allocation in MySQL is thread-safe, and a buddy algorithm is employed, but the concrete creation path ultimately lands in a memory-heap mechanism.
A memory heap is organized as a linear list of blocks (the code below typedefs mem_heap_t straight from the block type). The standard block size is about 8 KB (the first block is never smaller than 64 bytes), and a single allocation that grows into the buffer pool may not exceed the page size (16 KB) minus roughly 200 bytes of header and guard overhead. More on this in the code analysis below.
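Before diving into the source, the round-up rule behind those numbers can be previewed with a tiny program. This is only a sketch mirroring the ut_calc_align()/MEM_SPACE_NEEDED macros quoted later; UNIV_MEM_ALIGNMENT = 8 and the debug guard size of 16 are assumed values:

#include <cstddef>
#include <cstdio>

static const size_t UNIV_MEM_ALIGNMENT = 8;  // assumed alignment
static const size_t MEM_NO_MANS_LAND = 16;   // debug guard bytes, 0 in release

// Equivalent of ut_calc_align(): round n up to a multiple of a power of two.
static size_t calc_align(size_t n, size_t align) {
  return (n + align - 1) & ~(align - 1);
}

// Equivalent of MEM_SPACE_NEEDED(N): payload plus two guard areas, aligned.
static size_t space_needed(size_t n) {
  return calc_align(n + 2 * MEM_NO_MANS_LAND, UNIV_MEM_ALIGNMENT);
}

int main() {
  for (size_t n : {1, 7, 8, 100, 8000}) {
    std::printf("request %zu -> reserve %zu bytes\n", n, space_needed(n));
  }
  return 0;
}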

III. Code Analysis

All this talk stays abstract, so let's go to the code. Before the source, there are no secrets:

/** A block of a memory heap consists of the info structure
followed by an area of memory */
typedef struct mem_block_info_t mem_block_t;
/** A memory heap is a nonempty linear list of memory blocks */
typedef mem_block_t mem_heap_t;


/** The info structure stored at the beginning of a heap block */
struct mem_block_info_t {
  uint64_t magic_n; /* magic number for debugging */
#ifdef UNIV_DEBUG
  char file_name[16]; /* file name where the mem heap was created */
  ulint line;         /*!< line number where the mem heap was created */
#endif                /* UNIV_DEBUG */
  UT_LIST_BASE_NODE_T(mem_block_t)
  base; /* In the first block of the list this is the
base node of the list of blocks; in subsequent
blocks this is undefined */
  UT_LIST_NODE_T(mem_block_t)
  list;             /* This contains pointers to next
  and prev in the list. The first block allocated
  to the heap is also the first block in this list,
  though it also contains the base node of the list. */
  ulint len;        /*!< physical length of this block in bytes */
  ulint total_size; /*!< physical length in bytes of all blocks
                in the heap. This is defined only in the base
                node and is set to ULINT_UNDEFINED in others. */
  ulint type;       /*!< type of heap: MEM_HEAP_DYNAMIC, or
                    MEM_HEAP_BUFFER possibly ORed with MEM_HEAP_BTR_SEARCH */
  ulint free;       /*!< offset in bytes of the first free position for
                    user data in the block */
  ulint start;      /*!< the value of the struct field 'free' at the
                    creation of the block */
  void *free_block;
  /* if the MEM_HEAP_BTR_SEARCH bit is set in type,
  and this is the heap root, this can contain an
  allocated buffer frame, which can be appended as a
  free block to the heap, if we need more space;
  otherwise, this is NULL */
  void *buf_block;
  /* if this block has been allocated from the buffer
  pool, this contains the buf_block_t handle;
  otherwise, this is NULL */
};

From the structure above you can see that all the heap memory types are typedef'd from mem_block_info_t. It carries a ulint type field, meaning there are three types: MEM_HEAP_DYNAMIC, MEM_HEAP_BUFFER, and MEM_HEAP_BTR_SEARCH (the last ORed onto MEM_HEAP_BUFFER). Each type implies a different way of obtaining memory, which deserves attention.
The main functions operating on it are as follows:

/** Types of allocation for memory heaps: DYNAMIC means allocation from the
dynamic memory pool of the C compiler, BUFFER means allocation from the
buffer pool; the latter method is used for very big heaps */

#define MEM_HEAP_DYNAMIC 0 /* the most common type */
#define MEM_HEAP_BUFFER 1
#define MEM_HEAP_BTR_SEARCH            \
  2 /* this flag can optionally be     \
    ORed to MEM_HEAP_BUFFER, in which  \
    case heap->free_block is used in   \
    some cases for memory allocations, \
    and if it's NULL, the memory       \
    allocation functions can return    \
    NULL. */

/** Different type of heaps in terms of which data structure is using them */
#define MEM_HEAP_FOR_BTR_SEARCH (MEM_HEAP_BTR_SEARCH | MEM_HEAP_BUFFER)
#define MEM_HEAP_FOR_PAGE_HASH (MEM_HEAP_DYNAMIC)
#define MEM_HEAP_FOR_RECV_SYS (MEM_HEAP_BUFFER)
#define MEM_HEAP_FOR_LOCK_HEAP (MEM_HEAP_BUFFER)

/** The following start size is used for the first block in the memory heap if
the size is not specified, i.e., 0 is given as the parameter in the call of
create. The standard size is the maximum (payload) size of the blocks used for
allocations of small buffers. */

#define MEM_BLOCK_START_SIZE 64
#define MEM_BLOCK_STANDARD_SIZE \
  (UNIV_PAGE_SIZE >= 16384 ? 8000 : MEM_MAX_ALLOC_IN_BUF)

/** If a memory heap is allowed to grow into the buffer pool, the following
is the maximum size for a single allocated buffer
(from UNIV_PAGE_SIZE we subtract MEM_BLOCK_HEADER_SIZE and 2*MEM_NO_MANS_LAND
since it's something we always need to put. Since in MEM_SPACE_NEEDED we round
n to the next multiple of UNIV_MEM_ALIGNMENT, we need to cut from the rest the
part that cannot be divided by UNIV_MEM_ALIGNMENT): */
#define MEM_MAX_ALLOC_IN_BUF                                         \
  ((UNIV_PAGE_SIZE - MEM_BLOCK_HEADER_SIZE - 2 * MEM_NO_MANS_LAND) & \
   ~(UNIV_MEM_ALIGNMENT - 1))

/* Before and after any allocated object we will put MEM_NO_MANS_LAND bytes of
some data (different before and after) which is supposed not to be modified by
anyone. This way it would be much easier to determine whether anyone was
writing on memory that is not his, especially since Valgrind can assure there
were no reads or writes to this memory. */
#ifdef UNIV_DEBUG
const int MEM_NO_MANS_LAND = 16;
#else
const int MEM_NO_MANS_LAND = 0;
#endif

/* Byte that we would put before allocated object MEM_NO_MANS_LAND times.*/
const byte MEM_NO_MANS_LAND_BEFORE_BYTE = 0xCE;
/* Byte that we would put after allocated object MEM_NO_MANS_LAND times.*/
const byte MEM_NO_MANS_LAND_AFTER_BYTE = 0xDF;

/** Space needed when allocating for a user a field of length N.
The space is allocated only in multiples of UNIV_MEM_ALIGNMENT. In debug mode
contains two areas of no mans lands before and after the buffer requested. */
#define MEM_SPACE_NEEDED(N) \
  ut_calc_align(N + 2 * MEM_NO_MANS_LAND, UNIV_MEM_ALIGNMENT)

#ifdef UNIV_DEBUG
/** Macro for memory heap creation.
@param[in]	size		Desired start block size. */
#define mem_heap_create(size) \
  mem_heap_create_func((size), __FILE__, __LINE__, MEM_HEAP_DYNAMIC)

/** Macro for memory heap creation.
@param[in]	size		Desired start block size.
@param[in]	type		Heap type */
#define mem_heap_create_typed(size, type) \
  mem_heap_create_func((size), __FILE__, __LINE__, (type))

#else /* UNIV_DEBUG */
/** Macro for memory heap creation.
@param[in]	size		Desired start block size. */
#define mem_heap_create(size) mem_heap_create_func((size), MEM_HEAP_DYNAMIC)

/** Macro for memory heap creation.
@param[in]	size		Desired start block size.
@param[in]	type		Heap type */
#define mem_heap_create_typed(size, type) mem_heap_create_func((size), (type))

#endif /* UNIV_DEBUG */

/** Creates a memory heap.
NOTE: Use the corresponding macros instead of this function.
A single user buffer of 'size' will fit in the block.
0 creates a default size block.
@param[in]	size		Desired start block size. */
#ifdef UNIV_DEBUG
/**
@param[in]	file_name	File name where created
@param[in]	line		Line where created */
#endif /* UNIV_DEBUG */
/**
@param[in]	type		Heap type
@return own: memory heap, NULL if did not succeed (only possible for
MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INLINE
mem_heap_t *mem_heap_create_func(ulint size,
#ifdef UNIV_DEBUG
                                 const char *file_name, ulint line,
#endif /* UNIV_DEBUG */
                                 ulint type);

/** Frees the space occupied by a memory heap.
NOTE: Use the corresponding macro instead of this function.
@param[in]	heap	Heap to be freed */
UNIV_INLINE
void mem_heap_free(mem_heap_t *heap);

/** Allocates and zero-fills n bytes of memory from a memory heap.
@param[in]	heap	memory heap
@param[in]	n	number of bytes; if the heap is allowed to grow into
the buffer pool, this must be <= MEM_MAX_ALLOC_IN_BUF
@return allocated, zero-filled storage */
UNIV_INLINE
void *mem_heap_zalloc(mem_heap_t *heap, ulint n);

/** Allocates n bytes of memory from a memory heap.
@param[in]	heap	memory heap
@param[in]	n	number of bytes; if the heap is allowed to grow into
the buffer pool, this must be <= MEM_MAX_ALLOC_IN_BUF
@return allocated storage, NULL if did not succeed (only possible for
MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INLINE
void *mem_heap_alloc(mem_heap_t *heap, ulint n);

/** Returns a pointer to the heap top.
@param[in]	heap		memory heap
@return pointer to the heap top */
UNIV_INLINE
byte *mem_heap_get_heap_top(mem_heap_t *heap);

/** Frees the space in a memory heap exceeding the pointer given.
The pointer must have been acquired from mem_heap_get_heap_top.
The first memory block of the heap is not freed.
@param[in]	heap		heap from which to free
@param[in]	old_top		pointer to old top of heap */
UNIV_INLINE
void mem_heap_free_heap_top(mem_heap_t *heap, byte *old_top);

/** Empties a memory heap.
The first memory block of the heap is not freed.
@param[in]	heap		heap to empty */
UNIV_INLINE
void mem_heap_empty(mem_heap_t *heap);

/** Returns a pointer to the topmost element in a memory heap.
The size of the element must be given.
@param[in]	heap	memory heap
@param[in]	n	size of the topmost element
@return pointer to the topmost element */
UNIV_INLINE
void *mem_heap_get_top(mem_heap_t *heap, ulint n);

/** Checks if a given chunk of memory is the topmost element stored in the
heap. If this is the case, then calling mem_heap_free_top() would free
that element from the heap.
@param[in]	heap	memory heap
@param[in]	buf	presumed topmost element
@param[in]	buf_sz	size of buf in bytes
@return true if topmost */
UNIV_INLINE
bool mem_heap_is_top(mem_heap_t *heap, const void *buf, ulint buf_sz)
    MY_ATTRIBUTE((warn_unused_result));

/** Allocate a new chunk of memory from a memory heap, possibly discarding the
topmost element. If the memory chunk specified with (top, top_sz) is the
topmost element, then it will be discarded, otherwise it will be left untouched
and this function will be equivalent to mem_heap_alloc().
@param[in,out]	heap	memory heap
@param[in]	top	chunk to discard if possible
@param[in]	top_sz	size of top in bytes
@param[in]	new_sz	desired size of the new chunk
@return allocated storage, NULL if did not succeed (only possible for
MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INLINE
void *mem_heap_replace(mem_heap_t *heap, const void *top, ulint top_sz,
                       ulint new_sz);

/** Allocate a new chunk of memory from a memory heap, possibly discarding the
topmost element and then copy the specified data to it. If the memory chunk
specified with (top, top_sz) is the topmost element, then it will be discarded,
otherwise it will be left untouched and this function will be equivalent to
mem_heap_dup().
@param[in,out]	heap	memory heap
@param[in]	top	chunk to discard if possible
@param[in]	top_sz	size of top in bytes
@param[in]	data	new data to duplicate
@param[in]	data_sz	size of data in bytes
@return allocated storage, NULL if did not succeed (only possible for
MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INLINE
void *mem_heap_dup_replace(mem_heap_t *heap, const void *top, ulint top_sz,
                           const void *data, ulint data_sz);

/** Allocate a new chunk of memory from a memory heap, possibly discarding the
topmost element and then copy the specified string to it. If the memory chunk
specified with (top, top_sz) is the topmost element, then it will be discarded,
otherwise it will be left untouched and this function will be equivalent to
mem_heap_strdup().
@param[in,out]	heap	memory heap
@param[in]	top	chunk to discard if possible
@param[in]	top_sz	size of top in bytes
@param[in]	str	new data to duplicate
@return allocated string, NULL if did not succeed (only possible for
MEM_HEAP_BTR_SEARCH type heaps) */
UNIV_INLINE
char *mem_heap_strdup_replace(mem_heap_t *heap, const void *top, ulint top_sz,
                              const char *str);

/** Frees the topmost element in a memory heap.
@param[in]	heap	memory heap
@param[in]	n	size of the topmost element
The size of the element must be given. */
UNIV_INLINE
void mem_heap_free_top(mem_heap_t *heap, ulint n);

/** Returns the space in bytes occupied by a memory heap. */
UNIV_INLINE
ulint mem_heap_get_size(mem_heap_t *heap); /*!< in: heap */

/** Duplicates a NUL-terminated string.
@param[in]	str	string to be copied
@return own: a copy of the string, must be deallocated with ut_free */
UNIV_INLINE
char *mem_strdup(const char *str);

/** Makes a NUL-terminated copy of a nonterminated string.
@param[in]	str	string to be copied
@param[in]	len	length of str, in bytes
@return own: a copy of the string, must be deallocated with ut_free */
UNIV_INLINE
char *mem_strdupl(const char *str, ulint len);

/** Duplicates a NUL-terminated string, allocated from a memory heap.
@param[in]	heap	memory heap where string is allocated
@param[in]	str	string to be copied
@return own: a copy of the string */
char *mem_heap_strdup(mem_heap_t *heap, const char *str);

/** Makes a NUL-terminated copy of a nonterminated string, allocated from a
memory heap.
@param[in]	heap	memory heap where string is allocated
@param[in]	str	string to be copied
@param[in]	len	length of str, in bytes
@return own: a copy of the string */
UNIV_INLINE
char *mem_heap_strdupl(mem_heap_t *heap, const char *str, ulint len);

/** Concatenate two strings and return the result, using a memory heap.
 @return own: the result */
char *mem_heap_strcat(
    mem_heap_t *heap, /*!< in: memory heap where string is allocated */
    const char *s1,   /*!< in: string 1 */
    const char *s2);  /*!< in: string 2 */

/** Duplicate a block of data, allocated from a memory heap.
 @return own: a copy of the data */
void *mem_heap_dup(
    mem_heap_t *heap, /*!< in: memory heap where copy is allocated */
    const void *data, /*!< in: data to be copied */
    ulint len);       /*!< in: length of data, in bytes */

/** A simple sprintf replacement that dynamically allocates the space for the
 formatted string from the given heap. This supports a very limited set of
 the printf syntax: types 's' and 'u' and length modifier 'l' (which is
 required for the 'u' type).
 @return heap-allocated formatted string */
char *mem_heap_printf(mem_heap_t *heap,   /*!< in: memory heap */
                      const char *format, /*!< in: format string */
                      ...) MY_ATTRIBUTE((format(printf, 2, 3)));
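
Putting these declarations together, a typical lifecycle can be sketched as follows; this is illustrative code written against the API above (error handling omitted), not a snippet from the server:

// Sketch: typical mem_heap_t lifecycle, using only the functions declared above.
void example_heap_usage() {
  /* 0 means "use the default start block size" (MEM_BLOCK_START_SIZE). */
  mem_heap_t *heap = mem_heap_create(0);

  /* Allocations are bump-pointer style: cheap, with no individual free. */
  byte *buf = static_cast<byte *>(mem_heap_alloc(heap, 100));
  char *copy = mem_heap_strdup(heap, "hello");

  /* Remember the top, allocate some scratch space ... */
  byte *top = mem_heap_get_heap_top(heap);
  void *scratch = mem_heap_alloc(heap, 256);

  /* ... then roll back to the savepoint instead of freeing piecemeal. */
  mem_heap_free_heap_top(heap, top);

  /* Everything left is released at once when the heap dies. */
  mem_heap_free(heap);
  (void)buf; (void)copy; (void)scratch;
}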

For use with the STL an allocator wrapper is also provided. All of the concrete operations live in the header mem0mem.ic, which is well worth a careful read:

#include "mem0mem.ic"

/** A C++ wrapper class to the mem_heap_t routines, so that it can be used
as an STL allocator */
template <typename T>
class mem_heap_allocator {
 public:
  typedef T value_type;
  typedef size_t size_type;
  typedef ptrdiff_t difference_type;
  typedef T *pointer;
  typedef const T *const_pointer;
  typedef T &reference;
  typedef const T &const_reference;

  mem_heap_allocator(mem_heap_t *heap) : m_heap(heap) {}

  mem_heap_allocator(const mem_heap_allocator &other) : m_heap(other.m_heap) {
    // Do nothing
  }

  template <typename U>
  mem_heap_allocator(const mem_heap_allocator<U> &other)
      : m_heap(other.m_heap) {
    // Do nothing
  }

  ~mem_heap_allocator() { m_heap = nullptr; }

  size_type max_size() const { return (ULONG_MAX / sizeof(T)); }

  /** This function returns a pointer to the first element of a newly
  allocated array large enough to contain n objects of type T; only the
  memory is allocated, and the objects are not constructed. Moreover,
  an optional pointer argument (that points to an object already
  allocated by mem_heap_allocator) can be used as a hint to the
  implementation about where the new memory should be allocated in
  order to improve locality. */
  pointer allocate(size_type n, const_pointer hint = nullptr) {
    return (reinterpret_cast<pointer>(mem_heap_alloc(m_heap, n * sizeof(T))));
  }

  void deallocate(pointer p, size_type n) {}

  pointer address(reference r) const { return (&r); }

  const_pointer address(const_reference r) const { return (&r); }

  void construct(pointer p, const_reference t) {
    new (reinterpret_cast<void *>(p)) T(t);
  }

  void destroy(pointer p) { (reinterpret_cast<T *>(p))->~T(); }

  /** Allocators are required to supply the below template class member
  which enables the possibility of obtaining a related allocator,
  parametrized in terms of a different type. For example, given an
  allocator type IntAllocator for objects of type int, a related
  allocator type for objects of type long could be obtained using
  IntAllocator::rebind<long>::other */
  template <typename U>
  struct rebind {
    typedef mem_heap_allocator<U> other;
  };

  /** Get the underlying memory heap object.
  @return the underlying memory heap object. */
  mem_heap_t *get_mem_heap() const { return (m_heap); }

 private:
  mem_heap_t *m_heap;
  template <typename U>
  friend class mem_heap_allocator;
};

template <class T>
bool operator==(const mem_heap_allocator<T> &left,
                const mem_heap_allocator<T> &right) {
  return (left.get_mem_heap() == right.get_mem_heap());
}

template <class T>
bool operator!=(const mem_heap_allocator<T> &left,
                const mem_heap_allocator<T> &right) {
  return (left.get_mem_heap() != right.get_mem_heap());
}
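
With that wrapper, any STL container can draw its storage from a mem_heap_t. A sketch of what usage might look like, assuming the InnoDB headers above are available:

#include <vector>

// Sketch: backing a std::vector with a mem_heap_t via mem_heap_allocator.
void example_stl_on_heap() {
  mem_heap_t *heap = mem_heap_create(1024);
  {
    mem_heap_allocator<int> alloc(heap);
    std::vector<int, mem_heap_allocator<int>> v(alloc);
    v.reserve(64);  // storage comes from the heap, not ::operator new
    for (int i = 0; i < 10; i++) v.push_back(i);
  }                 // deallocate() is a no-op, so nothing to free here
  mem_heap_free(heap);  // vector storage disappears with the heap
}

Note the design choice this implies: individual deallocate() calls do nothing, so such containers must never outlive the heap they were built on.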

All of this is the lowest-level memory handling machinery; it looks much the same as what you met when first learning C/C++, only safer and more robust.
Next, let's look at pages:

/** Allocates a block of memory from the heap of an index page.
 @return pointer to start of allocated buffer, or NULL if allocation fails */
byte *page_mem_alloc_heap(
    page_t *page,             /*!< in/out: index page */
    page_zip_des_t *page_zip, /*!< in/out: compressed page with enough
                             space available for inserting the record,
                             or NULL */
    ulint need,               /*!< in: total number of bytes needed */
    ulint *heap_no)           /*!< out: this contains the heap number
                              of the allocated record
                              if allocation succeeds */
{
  byte *block;
  ulint avl_space;

  ut_ad(page && heap_no);

  avl_space = page_get_max_insert_size(page, 1);

  if (avl_space >= need) {
    block = page_header_get_ptr(page, PAGE_HEAP_TOP);

    page_header_set_ptr(page, page_zip, PAGE_HEAP_TOP, block + need);
    *heap_no = page_dir_get_n_heap(page);

    page_dir_set_n_heap(page, page_zip, 1 + *heap_no);

    return (block);
  }

  return (nullptr);
}

/** Create an uncompressed B-tree or R-tree or SDI index page.
@param[in]	block		A buffer block where the page is created
@param[in]	mtr		Mini-transaction handle
@param[in]	comp		nonzero=compact page format
@param[in]	page_type	Page type
@return pointer to the page */
page_t *page_create(buf_block_t *block, mtr_t *mtr, ulint comp,
                    page_type_t page_type) {
  page_create_write_log(buf_block_get_frame(block), mtr, comp, page_type);
  return (page_create_low(block, comp, page_type));
}
/** Create a compressed B-tree index page.
@param[in,out]	block		Buffer frame where the page is created
@param[in]	index		Index of the page, or NULL when applying
                                TRUNCATE log record during recovery
@param[in]	level		The B-tree level of the page
@param[in]	max_trx_id	PAGE_MAX_TRX_ID
@param[in]	mtr		Mini-transaction handle
@param[in]	page_type	Page type to be created. Only FIL_PAGE_INDEX,
                                FIL_PAGE_RTREE, FIL_PAGE_SDI allowed
@return pointer to the page */
page_t *page_create_zip(buf_block_t *block, dict_index_t *index, ulint level,
                        trx_id_t max_trx_id, mtr_t *mtr,
                        page_type_t page_type) {
  page_t *page;
  page_zip_des_t *page_zip = buf_block_get_page_zip(block);

  ut_ad(block);
  ut_ad(page_zip);
  ut_ad(dict_table_is_comp(index->table));

#ifdef UNIV_DEBUG
  switch (page_type) {
    case FIL_PAGE_INDEX:
    case FIL_PAGE_RTREE:
    case FIL_PAGE_SDI:
      break;
    default:
      ut_ad(0);
  }
#endif /* UNIV_DEBUG */

  page = page_create_low(block, TRUE, page_type);

  mach_write_to_2(PAGE_HEADER + PAGE_LEVEL + page, level);
  mach_write_to_8(PAGE_HEADER + PAGE_MAX_TRX_ID + page, max_trx_id);

  if (!page_zip_compress(page_zip, page, index, page_zip_level, mtr)) {
    /* The compression of a newly created
    page should always succeed. */
    ut_error;
  }

  return (page);
}

Newer versions also introduce page compression:

/** Populate the sparse page directory from the dense directory.
 @return true on success, false on failure */
static MY_ATTRIBUTE((warn_unused_result)) ibool page_zip_dir_decode(
    const page_zip_des_t *page_zip, /*!< in: dense page directory on
                                   compressed page */
    page_t *page,                   /*!< in: compact page with valid header;
                                    out: trailer and sparse page directory
                                    filled in */
    rec_t **recs,                   /*!< out: dense page directory sorted by
                                    ascending address (and heap_no) */
    ulint n_dense)                  /*!< in: number of user records, and
                                    size of recs[] */
{
  ulint i;
  ulint n_recs;
  byte *slot;

  n_recs = page_get_n_recs(page);

  if (UNIV_UNLIKELY(n_recs > n_dense)) {
    page_zip_fail(
        ("page_zip_dir_decode 1: %lu > %lu\n", (ulong)n_recs, (ulong)n_dense));
    return (FALSE);
  }

  /* Traverse the list of stored records in the sorting order,
  starting from the first user record. */

  slot = page + (UNIV_PAGE_SIZE - PAGE_DIR - PAGE_DIR_SLOT_SIZE);
  UNIV_PREFETCH_RW(slot);

  /* Zero out the page trailer. */
  memset(slot + PAGE_DIR_SLOT_SIZE, 0, PAGE_DIR);

  mach_write_to_2(slot, PAGE_NEW_INFIMUM);
  slot -= PAGE_DIR_SLOT_SIZE;
  UNIV_PREFETCH_RW(slot);

  /* Initialize the sparse directory and copy the dense directory. */
  for (i = 0; i < n_recs; i++) {
    ulint offs = page_zip_dir_get(page_zip, i);

    if (offs & PAGE_ZIP_DIR_SLOT_OWNED) {
      mach_write_to_2(slot, offs & PAGE_ZIP_DIR_SLOT_MASK);
      slot -= PAGE_DIR_SLOT_SIZE;
      UNIV_PREFETCH_RW(slot);
    }

    if (UNIV_UNLIKELY((offs & PAGE_ZIP_DIR_SLOT_MASK) <
                      PAGE_ZIP_START + REC_N_NEW_EXTRA_BYTES)) {
      page_zip_fail(("page_zip_dir_decode 2: %u %u %lx\n", (unsigned)i,
                     (unsigned)n_recs, (ulong)offs));
      return (FALSE);
    }

    recs[i] = page + (offs & PAGE_ZIP_DIR_SLOT_MASK);
  }

  mach_write_to_2(slot, PAGE_NEW_SUPREMUM);
  {
    const page_dir_slot_t *last_slot =
        page_dir_get_nth_slot(page, page_dir_get_n_slots(page) - 1);

    if (UNIV_UNLIKELY(slot != last_slot)) {
      page_zip_fail(("page_zip_dir_decode 3: %p != %p\n", (const void *)slot,
                     (const void *)last_slot));
      return (FALSE);
    }
  }

  /* Copy the rest of the dense directory. */
  for (; i < n_dense; i++) {
    ulint offs = page_zip_dir_get(page_zip, i);

    if (UNIV_UNLIKELY(offs & ~PAGE_ZIP_DIR_SLOT_MASK)) {
      page_zip_fail(("page_zip_dir_decode 4: %u %u %lx\n", (unsigned)i,
                     (unsigned)n_dense, (ulong)offs));
      return (FALSE);
    }

    recs[i] = page + offs;
  }

  std::sort(recs, recs + n_dense);
  return (TRUE);
}

/** Read the index information for the compressed page.
@param[in]	buf		index information
@param[in]	end		end of buf
@param[in]	trx_id_col	NULL for non-leaf pages; for leaf pages,
                                pointer to where to store the position of the
                                trx_id column
@param[in]	is_spatial	is spatial index or not
@return own: dummy index describing the page, or NULL on error */
static dict_index_t *page_zip_fields_decode(const byte *buf, const byte *end,
                                            ulint *trx_id_col,
                                            bool is_spatial) {
  const byte *b;
  ulint n;
  ulint i;
  ulint val;
  dict_table_t *table;
  dict_index_t *index;

  /* Determine the number of fields. */
  for (b = buf, n = 0; b < end; n++) {
    if (*b++ & 0x80) {
      b++; /* skip the second byte */
    }
  }

  n--; /* n_nullable or trx_id */

  if (UNIV_UNLIKELY(n > REC_MAX_N_FIELDS)) {
    page_zip_fail(("page_zip_fields_decode: n = %lu\n", (ulong)n));
    return (nullptr);
  }

  if (UNIV_UNLIKELY(b > end)) {
    page_zip_fail(("page_zip_fields_decode: %p > %p\n", (const void *)b,
                   (const void *)end));
    return (nullptr);
  }

  table = dict_mem_table_create("ZIP_DUMMY", DICT_HDR_SPACE, n, 0, 0,
                                DICT_TF_COMPACT, 0);
  index = dict_mem_index_create("ZIP_DUMMY", "ZIP_DUMMY", DICT_HDR_SPACE, 0, n);
  index->table = table;
  index->n_uniq = n;
  /* avoid ut_ad(index->cached) in dict_index_get_n_unique_in_tree */
  index->cached = TRUE;

  /* Initialize the fields. */
  for (b = buf, i = 0; i < n; i++) {
    ulint mtype;
    ulint len;

    val = *b++;

    if (UNIV_UNLIKELY(val & 0x80)) {
      /* fixed length > 62 bytes */
      val = (val & 0x7f) << 8 | *b++;
      len = val >> 1;
      mtype = DATA_FIXBINARY;
    } else if (UNIV_UNLIKELY(val >= 126)) {
      /* variable length with max > 255 bytes */
      len = 0x7fff;
      mtype = DATA_BINARY;
    } else if (val <= 1) {
      /* variable length with max <= 255 bytes */
      len = 0;
      mtype = DATA_BINARY;
    } else {
      /* fixed length < 62 bytes */
      len = val >> 1;
      mtype = DATA_FIXBINARY;
    }

    dict_mem_table_add_col(table, nullptr, nullptr, mtype,
                           val & 1 ? DATA_NOT_NULL : 0, len, true);

    /* The is_ascending flag does not matter during decompression,
    because we do not compare for "less than" or "greater than" */
    dict_index_add_col(index, table, table->get_col(i), 0, true);
  }

  val = *b++;
  if (UNIV_UNLIKELY(val & 0x80)) {
    val = (val & 0x7f) << 8 | *b++;
  }

  /* Decode the position of the trx_id column. */
  if (trx_id_col) {
    if (!val) {
      val = ULINT_UNDEFINED;
    } else if (UNIV_UNLIKELY(val >= n)) {
      page_zip_fields_free(index);
      index = nullptr;
    } else {
      index->type = DICT_CLUSTERED;
    }

    *trx_id_col = val;
  } else {
    /* Decode the number of nullable fields. */
    if (UNIV_UNLIKELY(index->n_nullable > val)) {
      page_zip_fields_free(index);
      index = nullptr;
    } else {
      index->n_nullable = val;
    }
  }

  ut_ad(b == end);

  if (is_spatial) {
    index->type |= DICT_SPATIAL;
  }

  index->n_instant_nullable = index->n_nullable;
  index->instant_cols =
      (index->is_clustered() && index->table->has_instant_cols());

  return (index);
}
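
The compact field-descriptor encoding parsed above can be demonstrated in isolation. The following standalone sketch replays just the length-decoding branches on a few assumed sample bytes:

#include <cstdint>
#include <cstdio>

// Decode one field descriptor, following the branches in
// page_zip_fields_decode(): bit 0 = NOT NULL, the rest encodes the length.
static void decode_field(const uint8_t **b) {
  unsigned val = *(*b)++;
  if (val & 0x80) {  // two-byte form: fixed length > 62 bytes
    val = (val & 0x7f) << 8 | *(*b)++;
    std::printf("fixed-length, len=%u", val >> 1);
  } else if (val >= 126) {  // variable length, max > 255 bytes
    std::printf("long variable-length");
  } else if (val <= 1) {  // variable length, max <= 255 bytes
    std::printf("short variable-length");
  } else {  // one-byte form: fixed length < 62 bytes
    std::printf("fixed-length, len=%u", val >> 1);
  }
  std::printf(", %s\n", (val & 1) ? "NOT NULL" : "nullable");
}

int main() {
  const uint8_t buf[] = {0x08, 0x01, 0x81, 0x90};  // assumed sample bytes
  const uint8_t *b = buf;
  decode_field(&b);  // 0x08: fixed length 4, nullable
  decode_field(&b);  // 0x01: short variable-length, NOT NULL
  decode_field(&b);  // 0x81 0x90: fixed length 0x190 >> 1 = 200, nullable
  return 0;
}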

For more of this code, see the sources under /storage/innobase/page.
Now let's look at the buddy-related code:

/** Offset within buf_buddy_free_t where free or non_free stamps
are written.*/
#define BUF_BUDDY_STAMP_OFFSET FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID

/** Value that we stamp on all buffers that are currently on the zip_free
list. This value is stamped at BUF_BUDDY_STAMP_OFFSET offset */
#define BUF_BUDDY_STAMP_FREE dict_sys_t::s_log_space_first_id

/** Stamp value for non-free buffers. Will be overwritten by a non-zero
value by the consumer of the block */
#define BUF_BUDDY_STAMP_NONFREE 0XFFFFFFFFUL

/** Return type of buf_buddy_is_free() */
enum buf_buddy_state_t {
  BUF_BUDDY_STATE_FREE,          /*!< If the buddy is completely free */
  BUF_BUDDY_STATE_USED,          /*!< Buddy currently in use */
  BUF_BUDDY_STATE_PARTIALLY_USED /*!< Some sub-blocks in the buddy
                           are in use */
};

#ifdef UNIV_DEBUG_VALGRIND
/** Invalidate memory area that we won't access while page is free */
UNIV_INLINE
void buf_buddy_mem_invalid(buf_buddy_free_t *buf, /*!< in: block to check */
                           ulint i) /*!< in: index of zip_free[] */
{
  const size_t size = BUF_BUDDY_LOW << i;
  ut_ad(i <= BUF_BUDDY_SIZES);

  UNIV_MEM_ASSERT_W(buf, size);
  UNIV_MEM_INVALID(buf, size);
}
#else /* UNIV_DEBUG_VALGRIND */
#define buf_buddy_mem_invalid(buf, i) ut_ad((i) <= BUF_BUDDY_SIZES)
#endif /* UNIV_DEBUG_VALGRIND */

/** Check if a buddy is stamped free.
 @return whether the buddy is free */
UNIV_INLINE MY_ATTRIBUTE((warn_unused_result)) bool buf_buddy_stamp_is_free(
    const buf_buddy_free_t *buf) /*!< in: block to check */
{
  return (mach_read_from_4(buf->stamp.bytes + BUF_BUDDY_STAMP_OFFSET) ==
          BUF_BUDDY_STAMP_FREE);
}

/** Stamps a buddy free. */
UNIV_INLINE
void buf_buddy_stamp_free(buf_buddy_free_t *buf, /*!< in/out: block to stamp */
                          ulint i)               /*!< in: block size */
{
  ut_d(memset(&buf->stamp, static_cast<int>(i), BUF_BUDDY_LOW << i));
  buf_buddy_mem_invalid(buf, i);
  mach_write_to_4(buf->stamp.bytes + BUF_BUDDY_STAMP_OFFSET,
                  BUF_BUDDY_STAMP_FREE);
  buf->stamp.size = i;
}

/** Stamps a buddy nonfree.
 @param[in,out]	buf	block to stamp
 @param[in]	i	block size */
#define buf_buddy_stamp_nonfree(buf, i)                         \
  do {                                                          \
    buf_buddy_mem_invalid(buf, i);                              \
    memset(buf->stamp.bytes + BUF_BUDDY_STAMP_OFFSET, 0xff, 4); \
  } while (0)
#if BUF_BUDDY_STAMP_NONFREE != 0xffffffff
#error "BUF_BUDDY_STAMP_NONFREE != 0xffffffff"
#endif

/** Get the offset of the buddy of a compressed page frame.
 @return the buddy relative of page */
UNIV_INLINE
void *buf_buddy_get(byte *page, /*!< in: compressed page */
                    ulint size) /*!< in: page size in bytes */
{
  ut_ad(ut_is_2pow(size));
  ut_ad(size >= BUF_BUDDY_LOW);
  ut_ad(BUF_BUDDY_LOW <= UNIV_ZIP_SIZE_MIN);
  ut_ad(size < BUF_BUDDY_HIGH);
  ut_ad(BUF_BUDDY_HIGH == UNIV_PAGE_SIZE);
  ut_ad(!ut_align_offset(page, size));

  if (((ulint)page) & size) {
    return (page - size);
  } else {
    return (page + size);
  }
}
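
/* A tiny numeric sketch (illustration only) of the rule buf_buddy_get()
implements: since a block of size 2^k is always 2^k-aligned, the if/else
above is exactly (offset ^ size), the classic buddy-address formula. */
#include <cstdio>

int main() {
  const unsigned size = 4096; /* block size, a power of two */
  for (unsigned offs : {0u, 4096u, 8192u, 12288u}) {
    unsigned buddy = offs ^ size; /* same result as buf_buddy_get() */
    std::printf("block at %5u -> buddy at %5u\n", offs, buddy);
  }
  return 0;
}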

#ifdef UNIV_DEBUG
/** Validate a given zip_free list. */
struct CheckZipFree {
  CheckZipFree(ulint i) : m_i(i) {}

  void operator()(const buf_buddy_free_t *elem) const {
    ut_a(buf_buddy_stamp_is_free(elem));
    ut_a(elem->stamp.size <= m_i);
  }

  ulint m_i;
};

/** Validate a buddy list.
@param[in]	buf_pool	buffer pool instance
@param[in]	i		buddy size to validate */
static void buf_buddy_list_validate(const buf_pool_t *buf_pool, ulint i) {
  CheckZipFree check(i);
  ut_ad(mutex_own(&buf_pool->zip_free_mutex));
  ut_list_validate(buf_pool->zip_free[i], check);
}

/** Debug function to validate that a buffer is indeed free i.e.: in the
zip_free[].
@param[in]	buf_pool	buffer pool instance
@param[in]	buf		block to check
@param[in]	i		index of buf_pool->zip_free[]
@return true if free */
UNIV_INLINE
bool buf_buddy_check_free(buf_pool_t *buf_pool, const buf_buddy_free_t *buf,
                          ulint i) {
  const ulint size = BUF_BUDDY_LOW << i;

  ut_ad(mutex_own(&buf_pool->zip_free_mutex));
  ut_ad(!ut_align_offset(buf, size));
  ut_ad(i >= buf_buddy_get_slot(UNIV_ZIP_SIZE_MIN));

  buf_buddy_free_t *itr;

  for (itr = UT_LIST_GET_FIRST(buf_pool->zip_free[i]); itr && itr != buf;
       itr = UT_LIST_GET_NEXT(list, itr)) {
  }

  return (itr == buf);
}
#endif /* UNIV_DEBUG */

/** Checks if a buf is free i.e.: in the zip_free[].
 @retval BUF_BUDDY_STATE_FREE if fully free
 @retval BUF_BUDDY_STATE_USED if currently in use
 @retval BUF_BUDDY_STATE_PARTIALLY_USED if partially in use. */
static MY_ATTRIBUTE((warn_unused_result)) buf_buddy_state_t
    buf_buddy_is_free(buf_buddy_free_t *buf, /*!< in: block to check */
                      ulint i)               /*!< in: index of
                                             buf_pool->zip_free[] */
{
#ifdef UNIV_DEBUG
  const ulint size = BUF_BUDDY_LOW << i;
  ut_ad(!ut_align_offset(buf, size));
  ut_ad(i >= buf_buddy_get_slot(UNIV_ZIP_SIZE_MIN));
#endif /* UNIV_DEBUG */

  /* We assume that all memory from buf_buddy_alloc()
  is used for compressed page frames. */

  /* We look inside the allocated objects returned by
  buf_buddy_alloc() and assume that each block is a compressed
  page that contains one of the following in space_id.
  * BUF_BUDDY_STAMP_FREE if the block is in a zip_free list or
  * BUF_BUDDY_STAMP_NONFREE if the block has been allocated but
  not initialized yet or
  * A valid space_id of a compressed tablespace

  The call below attempts to read from free memory.  The memory
  is "owned" by the buddy allocator (and it has been allocated
  from the buffer pool), so there is nothing wrong about this. */
  if (!buf_buddy_stamp_is_free(buf)) {
    return (BUF_BUDDY_STATE_USED);
  }

  /* A block may be free but a fragment of it may still be in use.
  To guard against that we write the free block size in terms of
  zip_free index at start of stamped block. Note that we can
  safely rely on this value only if the buf is free. */
  ut_ad(buf->stamp.size <= i);
  return (buf->stamp.size == i ? BUF_BUDDY_STATE_FREE
                               : BUF_BUDDY_STATE_PARTIALLY_USED);
}

/** Add a block to the head of the appropriate buddy free list.
@param[in]	buf_pool	buffer pool instance
@param[in,out]	buf		block to be freed
@param[in]	i		index of buf_pool->zip_free[] */
UNIV_INLINE
void buf_buddy_add_to_free(buf_pool_t *buf_pool, buf_buddy_free_t *buf,
                           ulint i) {
  ut_ad(mutex_own(&buf_pool->zip_free_mutex));
  ut_ad(buf_pool->zip_free[i].start != buf);

  buf_buddy_stamp_free(buf, i);
  UT_LIST_ADD_FIRST(buf_pool->zip_free[i], buf);
  ut_d(buf_buddy_list_validate(buf_pool, i));
}

/** Remove a block from the appropriate buddy free list.
@param[in]	buf_pool	buffer pool instance
@param[in,out]	buf		block to be freed
@param[in]	i		index of buf_pool->zip_free[] */
UNIV_INLINE
void buf_buddy_remove_from_free(buf_pool_t *buf_pool, buf_buddy_free_t *buf,
                                ulint i) {
  ut_ad(mutex_own(&buf_pool->zip_free_mutex));
  ut_ad(buf_buddy_check_free(buf_pool, buf, i));

  UT_LIST_REMOVE(buf_pool->zip_free[i], buf);
  buf_buddy_stamp_nonfree(buf, i);
}

/** Try to allocate a block from buf_pool->zip_free[].
@param[in]	buf_pool	buffer pool instance
@param[in]	i		index of buf_pool->zip_free[]
@return allocated block, or NULL if buf_pool->zip_free[] was empty */
static buf_buddy_free_t *buf_buddy_alloc_zip(buf_pool_t *buf_pool, ulint i) {
  buf_buddy_free_t *buf;

  ut_a(i < BUF_BUDDY_SIZES);
  ut_a(i >= buf_buddy_get_slot(UNIV_ZIP_SIZE_MIN));

  mutex_enter(&buf_pool->zip_free_mutex);
  ut_d(buf_buddy_list_validate(buf_pool, i));

  buf = UT_LIST_GET_FIRST(buf_pool->zip_free[i]);

  if (buf_get_withdraw_depth(buf_pool)) {
    while (buf != nullptr &&
           buf_frame_will_withdrawn(buf_pool, reinterpret_cast<byte *>(buf))) {
      /* This should be withdrawn, not to be allocated */
      buf = UT_LIST_GET_NEXT(list, buf);
    }
  }

  if (buf) {
    buf_buddy_remove_from_free(buf_pool, buf, i);
    mutex_exit(&buf_pool->zip_free_mutex);

  } else if (i + 1 < BUF_BUDDY_SIZES) {
    mutex_exit(&buf_pool->zip_free_mutex);
    /* Attempt to split. */
    buf = buf_buddy_alloc_zip(buf_pool, i + 1);

    if (buf) {
      byte *allocated_block = buf->stamp.bytes;
      buf_buddy_free_t *buddy = reinterpret_cast<buf_buddy_free_t *>(
          allocated_block + (BUF_BUDDY_LOW << i));

      mutex_enter(&buf_pool->zip_free_mutex);
      ut_ad(!buf_pool_contains_zip(buf_pool, buddy));
      buf_buddy_add_to_free(buf_pool, buddy, i);
      mutex_exit(&buf_pool->zip_free_mutex);
    }
  } else {
    mutex_exit(&buf_pool->zip_free_mutex);
  }

  if (buf) {
    /* Trash the page other than the BUF_BUDDY_STAMP_NONFREE. */
    UNIV_MEM_TRASH(buf, ~i, BUF_BUDDY_STAMP_OFFSET);
    UNIV_MEM_TRASH(BUF_BUDDY_STAMP_OFFSET + 4 + buf->stamp.bytes, ~i,
                   (BUF_BUDDY_LOW << i) - (BUF_BUDDY_STAMP_OFFSET + 4));
    ut_ad(mach_read_from_4(buf->stamp.bytes + BUF_BUDDY_STAMP_OFFSET) ==
          BUF_BUDDY_STAMP_NONFREE);
  }

  return (buf);
}

/** Deallocate a buffer frame of UNIV_PAGE_SIZE.
@param[in]	buf_pool	buffer pool instance
@param[in]	buf		buffer frame to deallocate */
static void buf_buddy_block_free(buf_pool_t *buf_pool, void *buf) {
  const ulint fold = BUF_POOL_ZIP_FOLD_PTR(buf);
  buf_page_t *bpage;

  ut_ad(!mutex_own(&buf_pool->zip_mutex));
  ut_a(!ut_align_offset(buf, UNIV_PAGE_SIZE));

  mutex_enter(&buf_pool->zip_hash_mutex);

  HASH_SEARCH(hash, buf_pool->zip_hash, fold, buf_page_t *, bpage,
              ut_ad(buf_page_get_state(bpage) == BUF_BLOCK_MEMORY &&
                    bpage->in_zip_hash && !bpage->in_page_hash),
              ((buf_block_t *)bpage)->frame == buf);
  ut_a(bpage);
  ut_a(buf_page_get_state(bpage) == BUF_BLOCK_MEMORY);
  ut_ad(!bpage->in_page_hash);
  ut_ad(bpage->in_zip_hash);
  ut_d(bpage->in_zip_hash = FALSE);
  HASH_DELETE(buf_page_t, hash, buf_pool->zip_hash, fold, bpage);

  ut_ad(buf_pool->buddy_n_frames > 0);
  ut_d(buf_pool->buddy_n_frames--);

  mutex_exit(&buf_pool->zip_hash_mutex);

  ut_d(memset(buf, 0, UNIV_PAGE_SIZE));
  UNIV_MEM_INVALID(buf, UNIV_PAGE_SIZE);

  buf_LRU_block_free_non_file_page(reinterpret_cast<buf_block_t *>(bpage));
}

/** Allocate a buffer block to the buddy allocator.
@param[in]	block	buffer frame to allocate */
static void buf_buddy_block_register(buf_block_t *block) {
  buf_pool_t *buf_pool = buf_pool_from_block(block);
  const ulint fold = BUF_POOL_ZIP_FOLD(block);
  ut_ad(!mutex_own(&buf_pool->zip_mutex));
  ut_ad(buf_block_get_state(block) == BUF_BLOCK_READY_FOR_USE);

  buf_block_set_state(block, BUF_BLOCK_MEMORY);

  ut_a(block->frame);
  ut_a(!ut_align_offset(block->frame, UNIV_PAGE_SIZE));

  ut_ad(!block->page.in_page_hash);
  ut_ad(!block->page.in_zip_hash);
  ut_d(block->page.in_zip_hash = TRUE);

  mutex_enter(&buf_pool->zip_hash_mutex);
  HASH_INSERT(buf_page_t, hash, buf_pool->zip_hash, fold, &block->page);

  ut_d(buf_pool->buddy_n_frames++);
  mutex_exit(&buf_pool->zip_hash_mutex);
}

/** Allocate a block from a bigger object.
@param[in]	buf_pool	buffer pool instance
@param[in]	buf		a block that is free to use
@param[in]	i		index of buf_pool->zip_free[]
@param[in]	j		size of buf as an index of buf_pool->zip_free[]
@return allocated block */
static void *buf_buddy_alloc_from(buf_pool_t *buf_pool, void *buf, ulint i,
                                  ulint j) {
  ulint offs = BUF_BUDDY_LOW << j;
  ut_ad(mutex_own(&buf_pool->zip_free_mutex));
  ut_ad(j <= BUF_BUDDY_SIZES);
  ut_ad(i >= buf_buddy_get_slot(UNIV_ZIP_SIZE_MIN));
  ut_ad(j >= i);
  ut_ad(!ut_align_offset(buf, offs));

  /* Add the unused parts of the block to the free lists. */
  while (j > i) {
    buf_buddy_free_t *zip_buf;

    offs >>= 1;
    j--;

    zip_buf = reinterpret_cast<buf_buddy_free_t *>(
        reinterpret_cast<byte *>(buf) + offs);
    buf_buddy_add_to_free(buf_pool, zip_buf, j);
  }

  buf_buddy_stamp_nonfree(reinterpret_cast<buf_buddy_free_t *>(buf), i);
  return (buf);
}

/** Allocate a block.
@param[in,out]	buf_pool	buffer pool instance
@param[in]	i		index of buf_pool->zip_free[]
                                or BUF_BUDDY_SIZES
@return allocated block, never NULL */
void *buf_buddy_alloc_low(buf_pool_t *buf_pool, ulint i) {
  buf_block_t *block;

  ut_ad(!mutex_own(&buf_pool->zip_mutex));
  ut_ad(i >= buf_buddy_get_slot(UNIV_ZIP_SIZE_MIN));

  if (i < BUF_BUDDY_SIZES) {
    /* Try to allocate from the buddy system. */
    block = (buf_block_t *)buf_buddy_alloc_zip(buf_pool, i);

    if (block) {
      goto func_exit;
    }
  }

  /* Try allocating from the buf_pool->free list. */
  block = buf_LRU_get_free_only(buf_pool);

  if (block) {
    goto alloc_big;
  }

  /* Try replacing an uncompressed page in the buffer pool. */
  block = buf_LRU_get_free_block(buf_pool);

alloc_big:
  buf_buddy_block_register(block);

  mutex_enter(&buf_pool->zip_free_mutex);
  block = (buf_block_t *)buf_buddy_alloc_from(buf_pool, block->frame, i,
                                              BUF_BUDDY_SIZES);
  mutex_exit(&buf_pool->zip_free_mutex);

func_exit:
  buf_pool->buddy_stat[i].used.fetch_add(1);
  return (block);
}

/** Try to relocate a block. The caller must hold zip_free_mutex, and this
function will release and lock it again.
@param[in]	buf_pool	buffer pool instance
@param[in]	src		block to relocate
@param[in]	dst		free block to relocate to
@param[in]	i		index of buf_pool->zip_free[]
@param[in]	force		true if we must always relocate
@return true if relocated */
static bool buf_buddy_relocate(buf_pool_t *buf_pool, void *src, void *dst,
                               ulint i, bool force) {
  buf_page_t *bpage;
  const ulint size = BUF_BUDDY_LOW << i;
  space_id_t space;
  page_no_t offset;

  ut_ad(mutex_own(&buf_pool->zip_free_mutex));
  ut_ad(!mutex_own(&buf_pool->zip_mutex));
  ut_ad(!ut_align_offset(src, size));
  ut_ad(!ut_align_offset(dst, size));
  ut_ad(i >= buf_buddy_get_slot(UNIV_ZIP_SIZE_MIN));
  UNIV_MEM_ASSERT_W(dst, size);

  space =
      mach_read_from_4((const byte *)src + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID);
  offset = mach_read_from_4((const byte *)src + FIL_PAGE_OFFSET);

  /* Suppress Valgrind warnings about conditional jump
  on uninitialized value. */
  UNIV_MEM_VALID(&space, sizeof space);
  UNIV_MEM_VALID(&offset, sizeof offset);

  ut_ad(space != BUF_BUDDY_STAMP_FREE);

  const page_id_t page_id(space, offset);

  /* If space,offset is bogus, then we know that the
  buf_page_hash_get_low() call below will return NULL. */
  if (!force && buf_pool != buf_pool_get(page_id)) {
    return (false);
  }

  mutex_exit(&buf_pool->zip_free_mutex);

  rw_lock_t *hash_lock = buf_page_hash_lock_get(buf_pool, page_id);

  rw_lock_x_lock(hash_lock);

  /* page_hash can be changed. */
  hash_lock = buf_page_hash_lock_x_confirm(hash_lock, buf_pool, page_id);

  bpage = buf_page_hash_get_low(buf_pool, page_id);

  if (!bpage || bpage->zip.data != src) {
    /* The block has probably been freshly
    allocated by buf_LRU_get_free_block() but not
    added to buf_pool->page_hash yet.  Obviously,
    it cannot be relocated. */

    rw_lock_x_unlock(hash_lock);

    if (!force || space != 0 || offset != 0) {
      mutex_enter(&buf_pool->zip_free_mutex);
      return (false);
    }

    /* It might be just uninitialized page.
    We should search from LRU list also. */

    /* force is true only when buffer pool resizing,
    in which we hold LRU_list_mutex already, see
    buf_pool_withdraw_blocks(). */
    ut_ad(force);
    ut_ad(mutex_own(&buf_pool->LRU_list_mutex));

    bpage = UT_LIST_GET_FIRST(buf_pool->LRU);
    while (bpage != nullptr) {
      if (bpage->zip.data == src) {
        hash_lock = buf_page_hash_lock_get(buf_pool, bpage->id);
        rw_lock_x_lock(hash_lock);
        break;
      }
      bpage = UT_LIST_GET_NEXT(LRU, bpage);
    }

    if (bpage == nullptr) {
      mutex_enter(&buf_pool->zip_free_mutex);
      return (false);
    }
  }

  if (page_zip_get_size(&bpage->zip) != size) {
    /* The block is of different size.  We would
    have to relocate all blocks covered by src.
    For the sake of simplicity, give up. */
    ut_ad(page_zip_get_size(&bpage->zip) < size);

    rw_lock_x_unlock(hash_lock);

    mutex_enter(&buf_pool->zip_free_mutex);
    return (false);
  }

  /* The block must have been allocated, but it may
  contain uninitialized data. */
  UNIV_MEM_ASSERT_W(src, size);

  BPageMutex *block_mutex = buf_page_get_mutex(bpage);

  mutex_enter(block_mutex);

  mutex_enter(&buf_pool->zip_free_mutex);

  if (buf_page_can_relocate(bpage)) {
    /* Relocate the compressed page. */
    const auto usec = ut_time_monotonic_us();

    ut_a(bpage->zip.data == src);

    memcpy(dst, src, size);
    bpage->zip.data = reinterpret_cast<page_zip_t *>(dst);

    rw_lock_x_unlock(hash_lock);

    mutex_exit(block_mutex);

    buf_buddy_mem_invalid(reinterpret_cast<buf_buddy_free_t *>(src), i);

    buf_buddy_stat_t *buddy_stat = &buf_pool->buddy_stat[i];
    buddy_stat->relocated++;
    buddy_stat->relocated_usec += ut_time_monotonic_us() - usec;
    return (true);
  }

  rw_lock_x_unlock(hash_lock);

  mutex_exit(block_mutex);
  return (false);
}
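
Before leaving the buddy code, the whole mechanism can be condensed into a miniature. This is a teaching sketch of a free-list buddy allocator, not buf0buddy.cc, and all names are invented: power-of-two size classes, splitting on allocation, merging with the XOR buddy on free:

#include <cstddef>
#include <cstdio>
#include <set>
#include <vector>

class Buddy {
 public:
  // Size classes run from 2^min_shift up to one arena of 2^max_shift bytes.
  Buddy(size_t min_shift, size_t max_shift)
      : min_(min_shift), free_(max_shift - min_shift + 1) {
    free_.back().insert(0);  // the whole arena starts as one free block
  }

  // Allocate one block of size (1 << (min_ + i)); returns offset or -1.
  long alloc(size_t i) {
    size_t j = i;
    while (j < free_.size() && free_[j].empty()) j++;  // find a block big enough
    if (j == free_.size()) return -1;
    size_t offs = *free_[j].begin();
    free_[j].erase(free_[j].begin());
    while (j > i) {  // split, returning the upper halves to smaller lists
      j--;
      free_[j].insert(offs + (1u << (min_ + j)));
    }
    return static_cast<long>(offs);
  }

  // Free a block of class i at offset offs, coalescing with free buddies.
  void free_block(size_t offs, size_t i) {
    while (i + 1 < free_.size()) {
      size_t buddy = offs ^ (1u << (min_ + i));  // the XOR rule again
      auto it = free_[i].find(buddy);
      if (it == free_[i].end()) break;  // buddy busy: cannot merge further
      free_[i].erase(it);
      offs &= ~(size_t)(1u << (min_ + i));  // merged block starts at lower half
      i++;
    }
    free_[i].insert(offs);
  }

 private:
  size_t min_;
  std::vector<std::set<size_t>> free_;  // free_[i]: free 2^(min_+i) blocks
};

int main() {
  Buddy b(12, 14);      // classes: 4K, 8K, 16K; arena = 16K
  long a = b.alloc(0);  // 4K: splits the 16K arena into 8K + 4K + 4K
  long c = b.alloc(0);  // another 4K
  std::printf("a=%ld c=%ld\n", a, c);
  b.free_block(a, 0);
  b.free_block(c, 0);   // merges back up into a single 16K block
  return 0;
}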

Anyone with a fair amount of experience in memory allocation algorithms will read this easily. The buddy algorithm allocates memory in powers of two and keeps merging freed blocks to fight fragmentation; it still has drawbacks, though: one small busy block can prevent a merge from happening, and rounding sizes up wastes some memory. Finally, the HEAP engine's memory wrappers live in storage/heap; there is a lot of code in there, so only a simple sample is given here:

void hp_free(HP_SHARE *share) {
  bool not_internal_table = (share->open_list.data != nullptr);
  if (not_internal_table) /* If not internal table */
    heap_share_list = list_delete(heap_share_list, &share->open_list);
  hp_clear(share); /* Remove blocks from memory */
  if (not_internal_table) thr_lock_delete(&share->lock);
  my_free(share->name);
  my_free(share);
  return;
}
void my_free(void *ptr) {
  my_memory_header *mh;

  if (ptr == nullptr) return;

  mh = USER_TO_HEADER(ptr);
  assert(mh->m_magic == MAGIC);
  PSI_MEMORY_CALL(memory_free)(mh->m_key, mh->m_size, mh->m_owner);
  /* Catch double free */
  mh->m_magic = 0xDEAD;
  MEM_FREELIKE_BLOCK(ptr, 0);
  my_raw_free(mh);
}
static void my_raw_free(void *ptr) {
#if defined(MY_MSCRT_DEBUG)
  _free_dbg(ptr, _CLIENT_BLOCK);
#else
  free(ptr);
#endif
}
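
The USER_TO_HEADER step in my_free() reflects a classic pattern: a bookkeeping header sits immediately before the pointer handed to the caller, so the magic number can catch double frees. A simplified sketch of the idea follows (names and magic values are hypothetical; the real my_memory_header also records a PSI key and owner for Performance Schema accounting):

#include <cassert>
#include <cstdint>
#include <cstdlib>

struct mem_header {  // hypothetical, simplified header
  uint32_t magic;    // detects corruption and double free
  size_t size;       // user-visible size, kept for accounting
};

static const uint32_t MAGIC_ALIVE = 0xC0FFEE01;  // assumed sentinel
static const uint32_t MAGIC_DEAD = 0xDEAD;       // stamped on free

static void *toy_malloc(size_t n) {
  mem_header *h =
      static_cast<mem_header *>(std::malloc(sizeof(mem_header) + n));
  if (h == nullptr) return nullptr;
  h->magic = MAGIC_ALIVE;
  h->size = n;
  return h + 1;  // hand out the bytes just past the header
}

static void toy_free(void *ptr) {
  if (ptr == nullptr) return;
  mem_header *h = static_cast<mem_header *>(ptr) - 1;  // USER_TO_HEADER
  assert(h->magic == MAGIC_ALIVE);  // catches double free / foreign pointer
  h->magic = MAGIC_DEAD;
  std::free(h);
}

int main() {
  void *p = toy_malloc(64);
  toy_free(p);
  return 0;
}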

For more information, see:

https://dev.mysql.com/doc/dev/mysql-server/latest/PAGE_INNODB_REDO_LOG.html
https://dev.mysql.com/doc/dev/mysql-server/latest/ha__innodb_8cc.html#ab53b49513cbb025c61c46914aca92771
https://dev.mysql.com/doc/dev/mysql-server/latest/structexport__var__t.html#a13d7a14b10fc9aff7e99ecfcb29c1171

The MySQL documentation is actually quite clear; it is worth spending more time with.

IV. Summary

Memory management, it seems, is a hurdle no software can get around. Designing one universal memory management mechanism looks impossible for now, but that does not mean it is impossible for a specific domain or a specific software framework. So study other people's memory-handling mechanisms and methods, understand which scenarios they fit and why they were built that way, absorb the ideas useful to you, and keep applying and sharpening them in future practice. That is the whole point of reading and analyzing source code.
Keep at it, returning youth!
