MySQL Source Code Walkthrough: Memory Management with MEM_ROOT

Latest recommended article published on 2025-03-30 17:42:47

fpcc

Category: Database Development. Tags: mysql

Article link: https://blog.csdn.net/fpcc/article/details/118269774

1. Memory Management

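There is not much to say by way of introduction: like most large frameworks, MySQL rolls its own memory management. MEM_ROOT is defined in my_alloc.h (in the include directory). The simplest and most convenient approach to memory management is to allocate in bulk, free everything together, and adjust sizes dynamically. That is easy to say and hard to do. Every expert knows the principle perfectly well, yet writing memory-management code that fits most scenarios is extremely difficult; otherwise there would not be one new allocation algorithm after another.
Memory demand shifts over time, memory is a finite resource, and different workloads schedule it in unpredictable ways; all of this shapes the design of a memory manager. It must leave enough memory for the program itself while leaving a safe margin for the host environment; it must reserve memory both for the program's own use and for caches; and it must have enough memory on hand without over-allocating in one go and wasting it, while still growing safely as allocations continue.
Doing all of this well in such a cramped space is where the experts show their skill.
This article does not cover memory management in the InnoDB layer; it only analyzes the MEM_ROOT mechanism in the upper layer.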

2. How MySQL Allocates Memory

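MySQL's allocation scheme stays within the familiar pattern: request memory in bulk, then manage and release it centrally. This reduces the number of allocation calls, which improves speed, and it also limits fragmentation and lets memory be reused, which improves memory efficiency. Together the two effects improve MySQL's performance.
In the upper layer, MEM_ROOT is the data structure used for memory management. Note that it is thread-compatible but not thread-safe: separate instances can be used from different threads, but a single instance must not be used concurrently without external synchronization. That says something about how hard it is to be both thread-safe and fast.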
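To illustrate what "thread-compatible but not thread-safe" means for a caller, here is a minimal sketch. `BumpBuffer` and `LockedArena` are hypothetical names invented for this example and are not part of MySQL; the sketch only assumes that the allocator itself does no locking, so a shared instance has to be guarded by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a non-thread-safe arena: a fixed buffer with a
// bump pointer and no internal locking.
class BumpBuffer {
 public:
  void *Alloc(std::size_t length) {
    length = (length + 7) & ~static_cast<std::size_t>(7);  // Round up to 8.
    if (length > sizeof(m_buffer) - m_used) return nullptr;
    void *ret = m_buffer + m_used;
    m_used += length;
    return ret;
  }

 private:
  alignas(8) char m_buffer[256];
  std::size_t m_used = 0;
};

// Hypothetical wrapper for the rare case where one arena must be shared:
// the caller supplies the synchronization that the arena itself lacks.
class LockedArena {
 public:
  void *Alloc(std::size_t length) {
    std::lock_guard<std::mutex> guard(m_mutex);
    assert(length <= 256);  // Larger requests can never fit in the toy buffer.
    return m_arena.Alloc(length);
  }

 private:
  std::mutex m_mutex;
  BumpBuffer m_arena;
};
```

The cheaper alternative, and the one a thread-compatible type is designed for, is to give each thread its own arena so that no lock is needed at all.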
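Alongside MEM_ROOT there is also a USED_MEM struct, which is analyzed below as well.
The source comments also note that the block size grows by 50% each time a new block is allocated, which is worth keeping in mind.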
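To make the 50% growth rule concrete, the following sketch lists the planned block sizes it produces. The helper names `next_block_size` and `block_sizes` are hypothetical; only the update rule `m_block_size += m_block_size / 2` and the default starting size of 512 are taken from the MEM_ROOT source quoted in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: applies the same update rule as
// `m_block_size += m_block_size / 2` in MEM_ROOT::AllocBlock.
inline std::size_t next_block_size(std::size_t current) {
  return current + current / 2;
}

// Returns the planned sizes of the first `count` blocks, starting from
// `initial` (512 is what MEM_ROOT's default constructor passes).
inline std::vector<std::size_t> block_sizes(std::size_t initial,
                                            std::size_t count) {
  assert(initial > 0);
  std::vector<std::size_t> sizes;
  std::size_t size = initial;
  for (std::size_t i = 0; i < count; ++i) {
    sizes.push_back(size);
    size = next_block_size(size);
  }
  return sizes;
}
```

Starting from 512, the first four planned sizes are 512, 768, 1152 and 1728 bytes. Because the sizes grow geometrically, the number of blocks needed grows only logarithmically with the total amount of memory requested.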

3. Code Analysis

First, the data structure that describes a block of memory in use:

//include/my_sys.h
// Block structure for one-time allocation
struct USED_MEM {
  USED_MEM *next;    /**< Next block in use */
  unsigned int left; /**< memory left in block  */
  unsigned int size; /**< size of block */
};

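This struct is simple: a pointer to its own type plus the block size and the remaining capacity. Anyone who has written a linked list will recognize the self-referential pointer as a link to the next node, so these blocks form a singly linked list.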
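As a small illustration of how such a chain is traversed, the sketch below sums the free bytes across a list of blocks. The struct repeats the USED_MEM layout quoted above; `total_left` is a hypothetical helper written for this example and does not exist in MySQL.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the USED_MEM struct quoted in the article.
struct USED_MEM {
  USED_MEM *next;    // Next block in use
  unsigned int left; // Memory left in block
  unsigned int size; // Size of block
};

// Hypothetical helper: walks the chain via `next` and sums the free bytes.
inline std::size_t total_left(const USED_MEM *head) {
  std::size_t total = 0;
  for (const USED_MEM *block = head; block != nullptr; block = block->next) {
    assert(block->left <= block->size);  // Free space cannot exceed the block.
    total += block->left;
  }
  return total;
}
```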
Next, the definition of MEM_ROOT:

//include/my_alloc.h
struct MEM_ROOT {
 private:
  struct Block {
    Block *prev{nullptr}; /** Previous block; used for freeing. */
  };

 public:
  MEM_ROOT() : MEM_ROOT(0, 512) {}  // 0 = PSI_NOT_INSTRUMENTED.

  MEM_ROOT(PSI_memory_key key, size_t block_size)
      : m_block_size(block_size),
        m_orig_block_size(block_size),
        m_psi_key(key) {}

  // MEM_ROOT is movable but not copyable.
  MEM_ROOT(const MEM_ROOT &) = delete;
  MEM_ROOT(MEM_ROOT &&other) noexcept
      : m_current_block(other.m_current_block),
        m_current_free_start(other.m_current_free_start),
        m_current_free_end(other.m_current_free_end),
        m_block_size(other.m_block_size),
        m_orig_block_size(other.m_orig_block_size),
        m_max_capacity(other.m_max_capacity),
        m_allocated_size(other.m_allocated_size),
        m_error_for_capacity_exceeded(other.m_error_for_capacity_exceeded),
        m_error_handler(other.m_error_handler),
        m_psi_key(other.m_psi_key) {
    other.m_current_block = nullptr;
    other.m_allocated_size = 0;
    other.m_block_size = m_orig_block_size;
    other.m_current_free_start = &s_dummy_target;
    other.m_current_free_end = &s_dummy_target;
  }

  MEM_ROOT &operator=(const MEM_ROOT &) = delete;
  MEM_ROOT &operator=(MEM_ROOT &&other) noexcept {
    Clear();
    ::new (this) MEM_ROOT(std::move(other));
    return *this;
  }

  ~MEM_ROOT() { Clear(); }

  /**
   * Allocate memory. Will return nullptr if there's not enough memory,
   * or if the maximum capacity is reached.
   *
   * Note that a zero-length allocation can return _any_ pointer, including
   * nullptr or a pointer that has been given out before. The current
   * implementation takes some pains to make sure we never return nullptr
   * (although it might return a bogus pointer), since there is code that
   * assumes nullptr always means “out of memory”, but you should not rely on
   * it, as it may change in the future.
   *
   * The returned pointer will always be 8-aligned.
   */
  // Allocates memory; the result is 8-byte aligned. Returns nullptr when
  // memory runs out or the maximum capacity is reached.
  void *Alloc(size_t length) MY_ATTRIBUTE((malloc)) {
    length = ALIGN_SIZE(length);

    // Skip the straight path if simulating OOM; it should always fail.
    DBUG_EXECUTE_IF("simulate_out_of_memory", return AllocSlow(length););

    // Fast path, used in the majority of cases. It would be faster here
    // (saving one register due to CSE) to instead test
    //
    //   m_current_free_start + length <= m_current_free_end
    //
    // but it would invoke undefined behavior, and in particular be prone
    // to wraparound on 32-bit platforms.
    // Serve the request from the free space of the current block.
    if (static_cast<size_t>(m_current_free_end - m_current_free_start) >=
        length) {
      void *ret = m_current_free_start;
      m_current_free_start += length;
      return ret;
    }

    return AllocSlow(length);
  }

  /**
    Allocate “num” objects of type T, and default-construct them.
    If the constructor throws an exception, behavior is undefined.

    We don't use new[], as it can put extra data in front of the array.
   */
  // Allocates an array; new[] is avoided because it can add extra data in
  // front of the array.
  template <class T, class... Args>
  T *ArrayAlloc(size_t num, Args &&... args) {
    static_assert(alignof(T) <= 8, "MEM_ROOT only returns 8-aligned memory.");
    if (num * sizeof(T) < num) {
      // Overflow.
      return nullptr;
    }
    T *ret = static_cast<T *>(Alloc(num * sizeof(T)));
    if (ret == nullptr) {
      // Out of memory.
      return nullptr;
    }

    // Construct all elements. For primitive types like int
    // and no arguments (ie., default construction),
    // the entire loop will be optimized away.
    // Perfect forwarding passes the arguments to each element's constructor.
    for (size_t i = 0; i < num; ++i) {
      new (&ret[i]) T(std::forward<Args>(args)...);
    }

    return ret;
  }

  /**
   * Claim all the allocated memory for the current thread in the performance
   * schema. Use when transferring responsibility for a MEM_ROOT from one thread
   * to another.
   */
  // Used when a MEM_ROOT's memory is handed over from one thread to another.
  void Claim(bool claim);

  /**
   * Deallocate all the RAM used. The MEM_ROOT itself continues to be valid,
   * so you can make new calls to Alloc() afterwards.

   * @note
   *   One can call this function either with a MEM_ROOT initialized with the
   *   constructor, or with one that's memset() to all zeros.
   *   It's also safe to call this multiple times with the same mem_root.
   */
  void Clear();

  /**
   * Similar to Clear(), but anticipates that the block will be reused for
   * further allocations. This means that even though all the data is gone,
   * one memory block (typically the largest allocated) will be kept and
   * made immediately available for calls to Alloc() without having to go to the
   * OS for new memory. This can yield performance gains if you use the same
   * MEM_ROOT many times. Also, the block size is not reset.
   */
  void ClearForReuse();

  /**
    Whether the constructor has run or not.

    This exists solely to support legacy code that memset()s the MEM_ROOT to
    all zeros, which wants to distinguish between that state and a properly
    initialized MEM_ROOT. If you do not run the constructor _nor_ do memset(),
    you are invoking undefined behavior.
  */
  // Whether the constructor has been called.
  bool inited() const { return m_block_size != 0; }

  /**
   * Set maximum capacity for this MEM_ROOT. Whenever the MEM_ROOT has
   * allocated more than this (not including overhead), and the free block
   * is empty, future allocations will fail.
   *
   * @param max_capacity        Maximum capacity this mem_root can hold
   */
  void set_max_capacity(size_t max_capacity) { m_max_capacity = max_capacity; }

  /**
   * Return maximum capacity for this MEM_ROOT.
   */
  size_t get_max_capacity() const { return m_max_capacity; }

  /**
   * Enable/disable error reporting for exceeding the maximum capacity.
   * If error reporting is enabled, an error is flagged to indicate that the
   * capacity is exceeded. However, allocation will still happen for the
   * requested memory.
   *
   * @param report_error    whether the error should be reported
   */
  void set_error_for_capacity_exceeded(bool report_error) {
    m_error_for_capacity_exceeded = report_error;
  }

  /**
   * Return whether error is to be reported when
   * maximum capacity exceeds for MEM_ROOT.
   */
  bool get_error_for_capacity_exceeded() const {
    return m_error_for_capacity_exceeded;
  }

  /**
   * Set the error handler on memory allocation failure (or nullptr for none).
   * The error handler is called whenever my_malloc() failed to allocate
   * more memory from the OS (which causes my_alloc() to return nullptr).
   */
  void set_error_handler(void (*error_handler)(void)) {
    m_error_handler = error_handler;
  }

  /**
   * Amount of memory we have allocated from the operating system, not including
   * overhead.
   */
  size_t allocated_size() const { return m_allocated_size; }

  /**
   * Set the desired size of the next block to be allocated. Note that future
   * allocations
   * will grow in size over this, although a Clear() will reset the size again.
   */
  void set_block_size(size_t block_size) {
    m_block_size = m_orig_block_size = block_size;
  }

  /**
   * @name Raw interface
   * Peek(), ForceNewBlock() and RawCommit() together define an
   * alternative interface to MEM_ROOT, for special uses. The raw interface
   * gives direct access to the underlying blocks, allowing a user to bypass the
   * normal alignment requirements and to write data directly into them without
   * knowing beforehand exactly how long said data is going to be, while still
   * retaining the convenience of block management and automatic freeing. It
   * generally cannot be combined with calling Alloc() as normal; see RawCommit.
   *
   * The raw interface, unlike Alloc(), is not affected by running under
   * ASan or Valgrind.
   *
   * @{
   */

  /**
   * Get the bounds of the currently allocated memory block. Assuming no other
   * MEM_ROOT calls are made in the meantime, you can start writing into this
   * block and then call RawCommit() once you know how many bytes you actually
   * needed. (This is useful when e.g. packing rows.)
   */
  std::pair<char *, char *> Peek() const {
    return {m_current_free_start, m_current_free_end};
  }

  /**
   * Allocate a new block of at least “minimum_length” bytes; usually more.
   * This holds no matter how many bytes are free in the current block.
   * The new block will always become the current block, ie., the next call
   * to Peek() will return the newly allocated block. (This is different
   * from Alloc(), where it is possible to allocate a new block that is
   * not made into the current block.)
   *
   * @return true Allocation failed (possibly due to size restrictions).
   */
  bool ForceNewBlock(size_t minimum_length);

  /**
   * Mark the first N bytes as the current block as used.
   *
   * WARNING: If you use RawCommit() with a length that is not a multiple of 8,
   * you cannot use Alloc() afterwards! The exception is that if EnsureSpace()
   * has just returned, you've got a new block, and can use Alloc() again.
   */
  void RawCommit(size_t length) {
    assert(static_cast<size_t>(m_current_free_end - m_current_free_start) >=
           length);
    m_current_free_start += length;
  }

  /// @}

 private:
  /**
   * Something to point on that exists solely to never return nullptr
   * from Alloc(0).
   */
  static char s_dummy_target;

  /**
    Allocate a new block of the given length (plus overhead for the block
    header). If the MEM_ROOT is near capacity, it may allocate less memory
    than wanted_length, but if it cannot allocate at least minimum_length,
    will return nullptr.
  */
  std::pair<Block *, size_t> AllocBlock(size_t wanted_length,
                                        size_t minimum_length);

  /** Allocate memory that doesn't fit into the current free block. */
  void *AllocSlow(size_t length);

  /** Free all blocks in a linked list, starting at the given block. */
  static void FreeBlocks(Block *start);

  /** The current block we are giving out memory from. nullptr if none. */
  Block *m_current_block = nullptr;

  /** Start (inclusive) of the current free block. */
  char *m_current_free_start = &s_dummy_target;

  /** End (exclusive) of the current free block. */
  char *m_current_free_end = &s_dummy_target;

  /** Size of the _next_ block we intend to allocate. */
  size_t m_block_size;

  /** The original block size the user asked for on construction. */
  size_t m_orig_block_size;

  /**
    Maximum amount of memory this MEM_ROOT can hold. A value of 0
    implies there is no limit.
  */
  size_t m_max_capacity = 0;

  /**
   * Total allocated size for this MEM_ROOT. Does not include overhead
   * for block headers or malloc overhead, since especially the latter
   * is impossible to quantify portably.
   */
  size_t m_allocated_size = 0;

  /** If enabled, exceeding the capacity will lead to a my_error() call. */
  bool m_error_for_capacity_exceeded = false;

  void (*m_error_handler)(void) = nullptr;

  PSI_memory_key m_psi_key = 0;
};

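The English comments are fairly clear, so only a few are annotated here. Note that init_alloc_root is marked in the source comments as not recommended for new code. The header provides other facilities instead, such as overloaded operator new: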

static inline void init_alloc_root(PSI_memory_key key, MEM_ROOT *root,
                                   size_t block_size, size_t) {
  ::new (root) MEM_ROOT(key, block_size);
}

void free_root(MEM_ROOT *root, myf flags);

/**
 * Allocate an object of the given type. Use like this:
 *
 *   Foo *foo = new (mem_root) Foo();
 *
 * Note that unlike regular operator new, this will not throw exceptions.
 * However, it can return nullptr if the capacity of the MEM_ROOT has been
 * reached. This is allowed since it is not a replacement for global operator
 * new, and thus isn't used automatically by e.g. standard library containers.
 *
 * TODO: This syntax is confusing in that it could look like allocating
 * a MEM_ROOT using regular placement new. We should make a less ambiguous
 * syntax, e.g. new (On(mem_root)) Foo().
 */
inline void *operator new(
    size_t size, MEM_ROOT *mem_root,
    const std::nothrow_t &arg MY_ATTRIBUTE((unused)) = std::nothrow) noexcept {
  return mem_root->Alloc(size);
}

inline void *operator new[](
    size_t size, MEM_ROOT *mem_root,
    const std::nothrow_t &arg MY_ATTRIBUTE((unused)) = std::nothrow) noexcept {
  return mem_root->Alloc(size);
}

inline void operator delete(void *, MEM_ROOT *,
                            const std::nothrow_t &) noexcept {
  /* never called */
}

inline void operator delete[](void *, MEM_ROOT *,
                              const std::nothrow_t &) noexcept {
  /* never called */
}

template <class T>
inline void destroy(T *ptr) {
  if (ptr != nullptr) ptr->~T();
}

template <class T>
inline void destroy_array(T *ptr, size_t count) {
  static_assert(!std::is_pointer<T>::value,
                "You're trying to destroy an array of pointers, "
                "not an array of objects. This is probably not "
                "what you intended.");
  if (ptr != nullptr) {
    for (size_t i = 0; i < count; ++i) destroy(&ptr[i]);
  }
}

/*
 * For std::unique_ptr with objects allocated on a MEM_ROOT, you shouldn't use
 * Default_deleter; use this deleter instead.
 */
template <class T>
class Destroy_only {
 public:
  void operator()(T *ptr) const {
    destroy(ptr);
    TRASH(const_cast<std::remove_const_t<T> *>(ptr), sizeof(T));
  }
};

/** std::unique_ptr, but only destroying. */
template <class T>
using unique_ptr_destroy_only = std::unique_ptr<T, Destroy_only<T>>;

template <typename T, typename... Args>
unique_ptr_destroy_only<T> make_unique_destroy_only(MEM_ROOT *mem_root,
                                                    Args &&... args) {
  return unique_ptr_destroy_only<T>(new (mem_root)
                                        T(std::forward<Args>(args)...));
}
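The pattern above (placement-style `operator new` on an arena plus a deleter that only runs the destructor) can be tried outside MySQL with a toy arena. This is a sketch under stated assumptions: `ToyArena`, `DestroyOnly`, `ArenaPtr`, `MakeOnArena` and `Tracked` are hypothetical names that mirror, but are not, MEM_ROOT, Destroy_only, unique_ptr_destroy_only and make_unique_destroy_only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical stand-in for MEM_ROOT: a fixed buffer with a bump pointer.
class ToyArena {
 public:
  void *Alloc(std::size_t length) {
    length = (length + 7) & ~static_cast<std::size_t>(7);  // Round up to 8.
    if (length > sizeof(m_buffer) - m_used) return nullptr;
    void *ret = m_buffer + m_used;
    m_used += length;
    return ret;
  }
  std::size_t used() const { return m_used; }

 private:
  alignas(8) char m_buffer[1024];
  std::size_t m_used = 0;
};

// Placement form mirroring `new (mem_root) Foo()`. It is noexcept, so a
// nullptr result makes the new-expression skip construction.
inline void *operator new(std::size_t size, ToyArena *arena) noexcept {
  return arena->Alloc(size);
}
// Matching placement delete; only invoked if a constructor throws.
inline void operator delete(void *, ToyArena *) noexcept {}

// Deleter that runs the destructor but frees nothing: the arena owns the
// bytes and releases them all at once.
template <class T>
struct DestroyOnly {
  void operator()(T *ptr) const {
    assert(ptr != nullptr);  // unique_ptr never calls the deleter on nullptr.
    ptr->~T();
  }
};

template <class T>
using ArenaPtr = std::unique_ptr<T, DestroyOnly<T>>;

template <class T, class... Args>
ArenaPtr<T> MakeOnArena(ToyArena *arena, Args &&...args) {
  return ArenaPtr<T>(new (arena) T(std::forward<Args>(args)...));
}

// Example type that records its destruction in an external counter.
struct Tracked {
  explicit Tracked(int *counter) : m_counter(counter) {}
  ~Tracked() { ++*m_counter; }
  int *m_counter;
};
```

When an `ArenaPtr<Tracked>` goes out of scope, the destructor runs but the arena's used byte count stays the same. That is the point of a destroy-only deleter: object lifetime and memory lifetime are managed separately.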

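Internally, MEM_ROOT tracks two things: the chain of allocated blocks, linked through Block::prev, and the free range of the current block, bounded by m_current_free_start and m_current_free_end. This is the same layout as an ordinary memory pool. As the code below shows, new blocks ultimately come from a call to my_malloc. In the newer code, memory management breaks down into the following parts: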

// Allocation
std::pair<MEM_ROOT::Block *, size_t> MEM_ROOT::AllocBlock(
    size_t wanted_length, size_t minimum_length) {
  DBUG_TRACE;

  size_t length = wanted_length;
  if (m_max_capacity != 0) {
    size_t bytes_left;
    if (m_allocated_size > m_max_capacity) {
      bytes_left = 0;
    } else {
      bytes_left = m_max_capacity - m_allocated_size;
    }
    if (wanted_length > bytes_left) {
      if (m_error_for_capacity_exceeded) {
        my_error(EE_CAPACITY_EXCEEDED, MYF(0),
                 static_cast<ulonglong>(m_max_capacity));
        // NOTE: No early return; we will abort the query at the next safe
        // point. We also don't go down to minimum_length, as this will give a
        // new block on every subsequent Alloc() (of which there might be
        // many, since we don't know when the next safe point will be).
      } else if (minimum_length <= bytes_left) {
        // Make one final chunk with all that we have left.
        length = bytes_left;
      } else {
        // We don't have enough memory left to satisfy minimum_length.
        return {nullptr, 0};
      }
    }
  }

  Block *new_block = static_cast<Block *>(
      my_malloc(m_psi_key, length + ALIGN_SIZE(sizeof(Block)),
                MYF(MY_WME | ME_FATALERROR)));
  if (new_block == nullptr) {
    if (m_error_handler) (m_error_handler)();
    return {nullptr, 0};
  }

  m_allocated_size += length;

  // Make the default block size 50% larger next time.
  // This ensures O(1) total mallocs (assuming Clear() is not called).
  m_block_size += m_block_size / 2;
  return {new_block, length};
}

void *MEM_ROOT::AllocSlow(size_t length) {
  DBUG_TRACE;
  DBUG_PRINT("enter", ("root: %p", this));

  // We need to allocate a new block to satisfy this allocation;
  // otherwise, the fast path in Alloc() would not have sent us here.
  // We plan to allocate a block of <block_size> bytes; see if that
  // would be enough or not.
  if (length >= m_block_size || MEM_ROOT_SINGLE_CHUNKS) {
    // The next block we'd allocate would _not_ be big enough
    // (or we're in Valgrind/ASAN mode, and want everything in single chunks).
    // Allocate an entirely new block, not disturbing anything;
    // since the new block isn't going to be used for the next allocation
    // anyway, we can just as well keep the previous one.
    Block *new_block =
        AllocBlock(/*wanted_length=*/length, /*minimum_length=*/length).first;
    if (new_block == nullptr) return nullptr;

    if (m_current_block == nullptr) {
      // This is the only block, so it has to be the current block, too.
      // However, it will be full, so we won't be allocating from it
      // unless ClearForReuse() is called.
      new_block->prev = nullptr;
      m_current_block = new_block;
      m_current_free_end = pointer_cast<char *>(new_block) +
                           ALIGN_SIZE(sizeof(*new_block)) + length;
      m_current_free_start = m_current_free_end;
    } else {
      // Insert the new block in the second-to-last position.
      new_block->prev = m_current_block->prev;
      m_current_block->prev = new_block;
    }

    return pointer_cast<char *>(new_block) + ALIGN_SIZE(sizeof(*new_block));
  } else {
    // The normal case: Throw away the current block, allocate a new block,
    // and use that to satisfy the new allocation.
    if (ForceNewBlock(/*minimum_length=*/length)) {
      return nullptr;
    }
    char *new_mem = m_current_free_start;
    m_current_free_start += length;
    return new_mem;
  }
}

bool MEM_ROOT::ForceNewBlock(size_t minimum_length) {
  std::pair<Block *, size_t> block_and_length =
      AllocBlock(/*wanted_length=*/ALIGN_SIZE(m_block_size),
                 minimum_length);  // Will modify block_size.
  Block *new_block = block_and_length.first;
  if (new_block == nullptr) return true;

  new_block->prev = m_current_block;
  m_current_block = new_block;

  char *new_mem =
      pointer_cast<char *>(new_block) + ALIGN_SIZE(sizeof(*new_block));
  m_current_free_start = new_mem;
  m_current_free_end = new_mem + block_and_length.second;
  return false;
}

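The Alloc() shown earlier is the fast path, which hands out memory from the free range of the current block. The functions here allocate whole blocks. AllocSlow() is taken when the current free range cannot satisfy the request, and ForceNewBlock() always allocates a new block and makes it the current one, regardless of how much free space is left.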
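The interplay between the fast path and the slow path can be modeled in a few dozen lines. The sketch below is a simplified model, not the MySQL implementation: `ToyRoot` is a hypothetical class, it uses `std::malloc` instead of `my_malloc`, and it omits capacity limits, error handlers and instrumentation. Unlike MEM_ROOT, it may also return nullptr for a zero-length request on an empty root.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified model of MEM_ROOT's two allocation paths.
class ToyRoot {
 public:
  explicit ToyRoot(std::size_t block_size) : m_block_size(block_size) {
    assert(block_size > 0);
  }
  ToyRoot(const ToyRoot &) = delete;
  ToyRoot &operator=(const ToyRoot &) = delete;
  ~ToyRoot() {
    for (Block *block = m_current; block != nullptr;) {
      Block *prev = block->prev;
      std::free(block);
      block = prev;
    }
  }

  void *Alloc(std::size_t length) {
    length = (length + 7) & ~static_cast<std::size_t>(7);  // 8-byte align.
    // Fast path: the current block has room, so just bump the pointer.
    if (static_cast<std::size_t>(m_free_end - m_free_start) >= length) {
      void *ret = m_free_start;
      m_free_start += length;
      return ret;
    }
    return AllocSlow(length);
  }

  std::size_t blocks() const { return m_blocks; }

 private:
  struct Block {
    Block *prev;
  };
  // Header size rounded up to a multiple of 8 so the payload stays aligned.
  static constexpr std::size_t kHeader =
      (sizeof(Block) + 7) & ~static_cast<std::size_t>(7);

  Block *NewBlock(std::size_t payload) {
    Block *block = static_cast<Block *>(std::malloc(kHeader + payload));
    if (block != nullptr) ++m_blocks;
    return block;
  }

  void *AllocSlow(std::size_t length) {
    if (length >= m_block_size) {
      // Oversized request: give it a dedicated block and keep the current
      // block (and its free space) for later small allocations.
      Block *block = NewBlock(length);
      if (block == nullptr) return nullptr;
      char *payload = reinterpret_cast<char *>(block) + kHeader;
      if (m_current == nullptr) {
        block->prev = nullptr;
        m_current = block;
        m_free_start = m_free_end = payload + length;  // Block is full.
      } else {
        block->prev = m_current->prev;  // Insert in second-to-last position.
        m_current->prev = block;
      }
      return payload;
    }
    // Normal case: start a new current block, then plan the next one to be
    // 50% larger.
    Block *block = NewBlock(m_block_size);
    if (block == nullptr) return nullptr;
    block->prev = m_current;
    m_current = block;
    char *payload = reinterpret_cast<char *>(block) + kHeader;
    m_free_start = payload + length;
    m_free_end = payload + m_block_size;
    m_block_size += m_block_size / 2;
    return payload;
  }

  Block *m_current = nullptr;
  char *m_free_start = nullptr;
  char *m_free_end = nullptr;
  std::size_t m_block_size;
  std::size_t m_blocks = 0;
};
```

With a 64-byte starting block, two 10-byte requests land 16 bytes apart in the same block. A 1000-byte request then gets its own block without disturbing the current one, so the next small request still comes from the first block.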

// Freeing and resetting
void MEM_ROOT::Clear() {
  DBUG_TRACE;
  DBUG_PRINT("enter", ("root: %p", this));

  // Already cleared, or memset() to zero, so just ignore.
  if (m_current_block == nullptr) return;

  Block *start = m_current_block;

  m_current_block = nullptr;
  m_block_size = m_orig_block_size;
  m_current_free_start = &s_dummy_target;
  m_current_free_end = &s_dummy_target;
  m_allocated_size = 0;

  FreeBlocks(start);
}

void MEM_ROOT::ClearForReuse() {
  DBUG_TRACE;

  if (MEM_ROOT_SINGLE_CHUNKS) {
    Clear();
    return;
  }

  // Already cleared, or memset() to zero, so just ignore.
  if (m_current_block == nullptr) return;

  // Keep the last block, which is usually the biggest one.
  m_current_free_start = pointer_cast<char *>(m_current_block) +
                         ALIGN_SIZE(sizeof(*m_current_block));
  Block *start = m_current_block->prev;
  m_current_block->prev = nullptr;
  m_allocated_size = m_current_free_end - m_current_free_start;

  FreeBlocks(start);
}

void MEM_ROOT::FreeBlocks(Block *start) {
  // The MEM_ROOT might be allocated on itself, so make sure we don't
  // touch it after we've started freeing.
  for (Block *block = start; block != nullptr;) {
    Block *prev = block->prev;
    my_free(block);
    block = prev;
  }
}
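The difference between Clear() and ClearForReuse() is easiest to see on a stripped-down model. In the sketch below, `ReusableRoot` and `PushFullBlock` are hypothetical names; the model only tracks the block chain and the free byte count, and assumes, as in the code above, that ClearForReuse() keeps the current (most recent) block and frees the rest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical model that tracks only the block chain and the free bytes.
class ReusableRoot {
 public:
  ReusableRoot() = default;
  ReusableRoot(const ReusableRoot &) = delete;
  ReusableRoot &operator=(const ReusableRoot &) = delete;
  ~ReusableRoot() { Clear(); }

  // Pushes a new current block of `size` payload bytes, all marked as used.
  bool PushFullBlock(std::size_t size) {
    Block *block = static_cast<Block *>(std::malloc(sizeof(Block) + size));
    if (block == nullptr) return false;
    block->prev = m_current;
    block->size = size;
    m_current = block;
    m_free = 0;
    ++m_blocks;
    return true;
  }

  // Frees every block; safe to call repeatedly.
  void Clear() {
    Block *start = m_current;
    m_current = nullptr;
    m_free = 0;
    FreeChain(start);
  }

  // Frees all but the current block and marks that block as entirely free.
  void ClearForReuse() {
    if (m_current == nullptr) return;
    Block *rest = m_current->prev;
    m_current->prev = nullptr;
    m_free = m_current->size;
    FreeChain(rest);
  }

  std::size_t blocks() const { return m_blocks; }
  std::size_t free_bytes() const { return m_free; }

 private:
  struct Block {
    Block *prev;
    std::size_t size;
  };

  void FreeChain(Block *start) {
    for (Block *block = start; block != nullptr;) {
      Block *prev = block->prev;
      std::free(block);
      assert(m_blocks > 0);
      --m_blocks;
      block = prev;
    }
  }

  Block *m_current = nullptr;
  std::size_t m_free = 0;
  std::size_t m_blocks = 0;
};
```

After pushing a 64-byte and a 96-byte block, ClearForReuse() leaves one block with 96 free bytes, whereas Clear() leaves none. Keeping the most recent block pays off when the same root is reused many times, because the next allocations do not have to go back to the system allocator.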

A number of helper functions are also provided to make allocation and management more convenient:

void *multi_alloc_root(MEM_ROOT *root, ...) {
  va_list args;
  char **ptr, *start, *res;
  size_t tot_length, length;
  DBUG_TRACE;

  va_start(args, root);
  tot_length = 0;
  while ((ptr = va_arg(args, char **))) {
    length = va_arg(args, uint);
    tot_length += ALIGN_SIZE(length);
  }
  va_end(args);

  if (!(start = static_cast<char *>(root->Alloc(tot_length))))
    return nullptr; /* purecov: inspected */

  va_start(args, root);
  res = start;
  while ((ptr = va_arg(args, char **))) {
    *ptr = res;
    length = va_arg(args, uint);
    res += ALIGN_SIZE(length);
  }
  va_end(args);
  return (void *)start;
}

char *strdup_root(MEM_ROOT *root, const char *str) {
  return strmake_root(root, str, strlen(str));
}

char *safe_strdup_root(MEM_ROOT *root, const char *str) {
  return str ? strdup_root(root, str) : nullptr;
}

void free_root(MEM_ROOT *root, myf flags) {
  if (root != nullptr) {
    if ((flags & MY_MARK_BLOCKS_FREE) || (flags & MY_KEEP_PREALLOC))
      root->ClearForReuse();
    else
      root->Clear();
  }
}

char *strmake_root(MEM_ROOT *root, const char *str, size_t len) {
  char *pos;
  if ((pos = static_cast<char *>(root->Alloc(len + 1)))) {
    if (len > 0) memcpy(pos, str, len);
    pos[len] = 0;
  }
  return pos;
}

void *memdup_root(MEM_ROOT *root, const void *str, size_t len) {
  char *pos;
  if ((pos = static_cast<char *>(root->Alloc(len)))) {
    memcpy(pos, str, len);
  }
  return pos;
}

char MEM_ROOT::s_dummy_target;

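These are mostly operations on the root, covering the various ways of creating and releasing memory on it. Newer versions add the Block-based operations, which is worth noting. More information is available at: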
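The idea behind multi_alloc_root (one allocation carved into several aligned pieces) and strmake_root (copy plus terminating zero) can be sketched without varargs. Everything here is hypothetical: `ToyArena`, `Request`, `MultiAlloc` and `StrMake` are illustrative names, and an `std::initializer_list` replaces the variadic argument list of the real function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <initializer_list>

// Hypothetical stand-in for MEM_ROOT: a fixed buffer with a bump pointer.
class ToyArena {
 public:
  void *Alloc(std::size_t length) {
    length = (length + 7) & ~static_cast<std::size_t>(7);  // Round up to 8.
    if (length > sizeof(m_buffer) - m_used) return nullptr;
    void *ret = m_buffer + m_used;
    m_used += length;
    return ret;
  }
  std::size_t used() const { return m_used; }

 private:
  alignas(8) char m_buffer[1024];
  std::size_t m_used = 0;
};

// One output pointer plus the number of bytes it should receive.
struct Request {
  void **out;
  std::size_t length;
};

inline std::size_t AlignUp8(std::size_t n) {
  return (n + 7) & ~static_cast<std::size_t>(7);
}

// Sketch of the multi_alloc_root idea: sum the aligned sizes, allocate
// once, then hand each request its slice of the single allocation.
inline void *MultiAlloc(ToyArena *arena,
                        std::initializer_list<Request> requests) {
  std::size_t total = 0;
  for (const Request &request : requests) total += AlignUp8(request.length);
  char *start = static_cast<char *>(arena->Alloc(total));
  if (start == nullptr) return nullptr;
  char *cursor = start;
  for (const Request &request : requests) {
    assert(request.out != nullptr);
    *request.out = cursor;
    cursor += AlignUp8(request.length);
  }
  return start;
}

// Sketch of the strmake_root idea: copy `len` bytes and zero-terminate.
inline char *StrMake(ToyArena *arena, const char *str, std::size_t len) {
  char *pos = static_cast<char *>(arena->Alloc(len + 1));
  if (pos != nullptr) {
    if (len > 0) std::memcpy(pos, str, len);
    pos[len] = '\0';
  }
  return pos;
}
```

Requesting 5 and 12 bytes in one call consumes 8 + 16 = 24 bytes of the arena, and the second pointer starts 8 bytes after the first. Either all pieces are allocated or none are, which keeps the caller's error handling simple.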
https://dev.mysql.com/doc/dev/mysql-server/latest/structMEM__ROOT.html#details

4. Summary

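Overall, MySQL's memory management is conventional, with no particular highlights. For frequent allocations of small and medium size it gives a clear speedup. If large blocks are requested often, each request at least as large as the planned block size goes straight to the underlying allocator, and the advantage disappears. That is how it always goes: every program is a balance of trade-offs, and trying to cover everything ends up covering nothing.
Focusing on what matters most and developing a distinct strength is the real challenge.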