Spark Storage Management Source Code Analysis Series: MemoryStore

小白数据猿

Category: Spark. Tags: big data, spark

Article link: https://blog.csdn.net/lidongmeng0213/article/details/109109817

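The memory pool (MemoryPool) does the concrete bookkeeping of storage memory, and the memory manager (MemoryManager) is the interface through which other components manage memory. MemoryStore, in turn, saves data blocks into the storage memory that has been acquired and provides methods for reading the saved data back from memory. When storage memory runs short, it is responsible for dropping cached data to disk and releasing the memory it occupied. Before saving data, MemoryStore calls the relevant acquire method of MemoryManager to check whether StorageMemoryPool has enough memory to allocate. If free memory is insufficient, cached blocks are evicted through BlockEvictionHandler.dropFromMemory to free space, and the acquire call returns false when enough memory still cannot be obtained. If free memory is sufficient, the block is saved to memory directly. This article first introduces MemoryEntry, which is closely related to MemoryStore, and then analyzes the main source code of MemoryStore in detail.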

MemoryEntry

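MemoryEntry is the abstract representation of a block in memory. It is defined as follows:

// A block held in memory is abstracted as the trait MemoryEntry
private sealed trait MemoryEntry[T] {
  def size: Long // size of this block
  def memoryMode: MemoryMode // memory mode in which this block is stored
  def classTag: ClassTag[T] // class tag of this block's element type
}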

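size is the size of the block, memoryMode says whether the block is stored on-heap or off-heap, and classTag is the class tag of the objects stored in the block. MemoryEntry has two implementations, one deserialized and one serialized:

// A MemoryEntry holding deserialized values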
private case class DeserializedMemoryEntry[T](
    value: Array[T],
    size: Long,
    classTag: ClassTag[T]) extends MemoryEntry[T] {
  val memoryMode: MemoryMode = MemoryMode.ON_HEAP
}

// A MemoryEntry holding serialized bytes
private case class SerializedMemoryEntry[T](
    buffer: ChunkedByteBuffer,
    memoryMode: MemoryMode,
    classTag: ClassTag[T]) extends MemoryEntry[T] {
  def size: Long = buffer.size
}

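As shown, the deserialized DeserializedMemoryEntry can only be stored in on-heap memory (ON_HEAP), and its data is an array of objects of type T. The serialized SerializedMemoryEntry can be stored either on-heap or off-heap; its data is wrapped in a ChunkedByteBuffer, and the length of that buffer is the size of the SerializedMemoryEntry.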
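
To make the two shapes concrete, here is a minimal standalone sketch of the same pattern. It is not Spark's code: Mode, Entry, DeserializedEntry, SerializedEntry and describe are hypothetical stand-ins, and a plain java.nio.ByteBuffer is assumed in place of ChunkedByteBuffer.

```scala
import java.nio.ByteBuffer
import scala.reflect.ClassTag

object MemoryEntrySketch {
  // Hypothetical stand-in for Spark's MemoryMode.
  sealed trait Mode
  case object OnHeap extends Mode
  case object OffHeap extends Mode

  sealed trait Entry[T] {
    def size: Long
    def mode: Mode
    def classTag: ClassTag[T]
  }

  // Deserialized entries hold live objects, so they are always on-heap.
  final case class DeserializedEntry[T](value: Array[T], size: Long, classTag: ClassTag[T])
      extends Entry[T] {
    val mode: Mode = OnHeap
  }

  // Serialized entries wrap raw bytes; their size is the byte count of the buffer.
  final case class SerializedEntry[T](buffer: ByteBuffer, mode: Mode, classTag: ClassTag[T])
      extends Entry[T] {
    def size: Long = buffer.remaining().toLong
  }

  // Because the trait is sealed, a match over both cases is exhaustive.
  def describe[T](e: Entry[T]): String = e match {
    case DeserializedEntry(v, s, _) => s"deserialized, ${v.length} elements, ~$s bytes"
    case SerializedEntry(_, m, _)   => s"serialized, $m, ${e.size} bytes"
  }

  def main(args: Array[String]): Unit = {
    val d = DeserializedEntry(Array(1, 2, 3), 48L, implicitly[ClassTag[Int]])
    val s = SerializedEntry[Int](ByteBuffer.allocateDirect(12), OffHeap, implicitly[ClassTag[Int]])
    println(describe(d))
    println(describe(s))
  }
}
```

The 48-byte size passed to the deserialized entry is an arbitrary illustrative estimate; only the serialized entry can report an exact size from its buffer.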

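ValuesHolder

When data is written to storage memory in iterator form, most of the insertion work is done by a ValuesHolder object. The ValuesHolder trait has two implementations, DeserializedValuesHolder and SerializedValuesHolder, which are briefly analyzed here.

DeserializedValuesHolder has two members: vector, a SizeTrackingVector that estimates the size of the elements it holds and grows automatically; and arrayValues, an array into which the contents of vector are moved once all data has been inserted, and which is then wrapped in a DeserializedMemoryEntry (whose data is an array of objects of type T). Most of the work is done by SizeTrackingVector.
SerializedValuesHolder is a helper for building a SerializedMemoryEntry. It wraps a compression stream and a serialization stream, serializes and compresses the data, writes it into a ChunkedByteBuffer, and finally wraps the result in a SerializedMemoryEntry that is recorded in MemoryStore.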
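
The division of labor between the two holders can be illustrated with a small sketch. Everything here is an assumption made for illustration: Holder and Builder are simplified stand-ins for the trait, a fixed bytesPerElement replaces SizeTrackingVector's sampling, and plain Java serialization into a byte array replaces Spark's serializer, compression and ChunkedByteBuffer.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import java.nio.ByteBuffer
import scala.collection.mutable.ArrayBuffer

object ValuesHolderSketch {
  trait Builder[R] { def preciseSize: Long; def build(): R }

  trait Holder[T, R] {
    def storeValue(value: T): Unit
    def estimatedSize(): Long
    def getBuilder(): Builder[R]
  }

  // Keeps live objects; the size is only an estimate (a fixed per-element guess here).
  final class DeserializedHolder[T](bytesPerElement: Long) extends Holder[T, Vector[T]] {
    private val buf = ArrayBuffer.empty[T]
    def storeValue(value: T): Unit = { buf += value }
    def estimatedSize(): Long = buf.length * bytesPerElement
    def getBuilder(): Builder[Vector[T]] = new Builder[Vector[T]] {
      val preciseSize: Long = estimatedSize()
      def build(): Vector[T] = buf.toVector
    }
  }

  // Serializes each value as it arrives; the size is the number of bytes written so far.
  final class SerializedHolder[T <: AnyRef] extends Holder[T, ByteBuffer] {
    private val bytes = new ByteArrayOutputStream()
    private val out = new ObjectOutputStream(bytes)
    def storeValue(value: T): Unit = out.writeObject(value)
    def estimatedSize(): Long = { out.flush(); bytes.size().toLong }
    def getBuilder(): Builder[ByteBuffer] = {
      out.close() // flushes any buffered bytes before the size is read
      new Builder[ByteBuffer] {
        val preciseSize: Long = bytes.size().toLong
        def build(): ByteBuffer = ByteBuffer.wrap(bytes.toByteArray)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val holder = new SerializedHolder[String]
    Seq("a", "b", "c").foreach(holder.storeValue)
    val builder = holder.getBuilder()
    println(s"serialized ${builder.preciseSize} bytes")
  }
}
```

The point of the shared interface is that the unrolling loop only needs storeValue, estimatedSize and getBuilder, and never has to know which representation it is filling.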

MemoryStore

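MemoryStore depends on MemoryManager. When a block is written, it must obtain on-heap or off-heap storage memory from MemoryManager for the block; when a block is removed, it must return the storage memory the block occupied to MemoryManager.

Constructor and member fields

Let us first look at MemoryStore's constructor and member fields: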

private[spark] class MemoryStore(
    conf: SparkConf,
    blockInfoManager: BlockInfoManager, // block metadata manager
    serializerManager: SerializerManager, // serialization
    memoryManager: MemoryManager, // allocates storage memory
    blockEvictionHandler: BlockEvictionHandler) // evicts blocks from memory to free space, dropping them to disk when the storage level allows
  extends Logging {
  // access-ordered map used as an LRU: eviction can start from the least recently used block
  private val entries = new LinkedHashMap[BlockId, MemoryEntry[_]](32, 0.75f, true)

  // maps a task attempt ID to the total on-heap memory used by all blocks that task attempt is unrolling
  private val onHeapUnrollMemoryMap = mutable.HashMap[Long, Long]()

  // maps a task attempt ID to the total off-heap memory used by all blocks that task attempt is unrolling
  private val offHeapUnrollMemoryMap = mutable.HashMap[Long, Long]()
}

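As shown, MemoryStore has eight fields:

conf: the Spark configuration.
blockInfoManager: manages block metadata.
serializerManager: handles serialization.
memoryManager: allocates and reclaims storage memory.
blockEvictionHandler: the trait for evicting blocks; BlockManager is its only implementation.
entries: a LinkedHashMap from BlockId to MemoryEntry. It is created in access order, so it behaves like an LRU structure, and eviction starts from the least recently accessed block.
onHeapUnrollMemoryMap: maps a task attempt ID to the total on-heap memory used by all blocks that task attempt is unrolling.
offHeapUnrollMemoryMap: maps a task attempt ID to the total off-heap memory used by all blocks that task attempt is unrolling.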
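
The LRU behavior of entries comes entirely from the third constructor argument of java.util.LinkedHashMap (accessOrder = true). A small standalone sketch, with made-up block names and sizes, shows how a read moves an entry to the end of the iteration order:

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}

object LruOrderSketch {
  // Returns the keys in iteration order after one entry has been read.
  def keysAfterAccess(): List[String] = {
    // Same constructor arguments as MemoryStore.entries: capacity, load factor, accessOrder.
    val entries = new JLinkedHashMap[String, Long](32, 0.75f, true)
    entries.put("rdd_0_0", 100L)
    entries.put("rdd_0_1", 200L)
    entries.put("rdd_1_0", 300L)
    entries.get("rdd_0_0") // a read counts as an access and moves the entry to the end

    val keys = List.newBuilder[String]
    val it = entries.keySet().iterator()
    while (it.hasNext) keys += it.next()
    keys.result()
  }

  def main(args: Array[String]): Unit =
    println(keysAfterAccess()) // List(rdd_0_1, rdd_1_0, rdd_0_0)
}
```

Iterating from the front therefore visits the least recently used block first, which is the order an eviction pass wants.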

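Writing bytes directly

Writing bytes directly is simple. The method first requests the required memory through MemoryManager, then calls the function _bytes passed in as a parameter to obtain the data as a ChunkedByteBuffer, creates the corresponding SerializedMemoryEntry, and puts that MemoryEntry into the entries map. Note that LinkedHashMap itself is not thread-safe, so every concurrent access to it must be guarded by a lock.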

def putBytes[T: ClassTag](
  blockId: BlockId,
  size: Long,
  memoryMode: MemoryMode,
  _bytes: () => ChunkedByteBuffer): Boolean = {
  require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")
  if (memoryManager.acquireStorageMemory(blockId, size, memoryMode)) { // acquire storage memory; this may borrow from execution memory
    // We acquired enough memory for the block, so go ahead and put it
    val bytes = _bytes() // produce the block's data as a ChunkedByteBuffer (usually by converting some other representation)
    assert(bytes.size == size)
    // wrap the bytes in a serialized entry
    val entry = new SerializedMemoryEntry[T](bytes, memoryMode, implicitly[ClassTag[T]])
    entries.synchronized { // record the block in entries, i.e. store it in memory
      entries.put(blockId, entry)
    }
    logInfo("Block %s stored as bytes in memory (estimated size %s, free %s)".format(
      blockId, Utils.bytesToString(size), Utils.bytesToString(maxMemory - blocksMemoryUsed)))
    true
  } else { // not enough memory
    false
  }
}
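
One detail worth noticing in putBytes is that the data arrives as a function rather than as a value, so the bytes are only materialized after memory has been granted. The following sketch illustrates that ordering with a toy store; ToyStore and its capacity counter are hypothetical and stand in for MemoryStore plus MemoryManager.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}

object PutBytesSketch {
  // Hypothetical toy store: a fixed capacity replaces the real MemoryManager.
  final class ToyStore(capacity: Long) {
    private var used = 0L
    private val entries = new JLinkedHashMap[String, Array[Byte]](32, 0.75f, true)

    private def acquire(size: Long): Boolean = synchronized {
      if (used + size <= capacity) { used += size; true } else false
    }

    def contains(id: String): Boolean = entries.synchronized { entries.containsKey(id) }

    def putBytes(id: String, size: Long, bytes: () => Array[Byte]): Boolean = {
      require(!contains(id), s"Block $id is already present")
      if (acquire(size)) {
        val data = bytes() // only now is the (possibly expensive) data produced
        assert(data.length == size)
        entries.synchronized { entries.put(id, data) }
        true
      } else {
        false // the thunk was never invoked
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val store = new ToyStore(10L)
    println(store.putBytes("a", 8L, () => new Array[Byte](8)))
    println(store.putBytes("b", 8L, () => sys.error("never evaluated")))
  }
}
```

In the second call the thunk would throw if it ran, but it never does because the acquire step fails first.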

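Writing iterator data

Iterator data means block data represented as an Iterator[T]. It is represented this way because the data of a single block may be too large to be placed in memory in one go. To avoid an OOM, the store walks the iterator, writes to memory as it goes, and periodically checks whether enough memory remains, much like turning the pages of a book. The word "unroll" describes this process vividly. Unroll memory and storage memory are in essence the same memory, only described differently at different stages of task execution. While the data of a partition of an HDFS file is being read into storage memory, this memory is called unroll memory; once all records have been read and stored successfully, it is renamed storage memory. Note that the concept of unroll memory exists only in Spark's storage module, not in the execution module. First, the complete call chain of a write: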

ShuffleMapTask/ResultTask.runTask -> RDD.iterator -> RDD.getOrCompute -> BlockManager.getOrElseUpdate
	-> BlockManager.doPutIterator -> MemoryStore.putIteratorAsBytes -> MemoryStore.putIterator 
	-> MemoryStore.reserveUnrollMemoryForThisTask ->  MemoryManager.acquireUnrollMemory

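As shown, when a task (a shuffle map task or a result task) runs, it calls RDD.iterator to obtain an iterator over the data of the given partition. In this process MemoryStore.putIterator walks all records of that partition, takes each value, and places it in contiguous memory. The concrete write path is analyzed below.

Writing in deserialized form

putIteratorAsValues is used when the storage level is deserialized, that is, when the data is kept directly on the JVM heap as Java objects. Keeping a large number of objects on the JVM heap has a cost: it occupies memory and increases GC pressure, which may cause frequent GCs. The benefit is just as clear: it saves the cost of serialization and deserialization, and reading objects straight from the heap is much faster than the alternatives (disk and direct memory). It is therefore a good choice when memory is plentiful and the amount of data to cache is not large. The method creates a DeserializedValuesHolder and calls putIterator to do the actual write, which is analyzed later. On success it returns the size of the data written to memory; on failure it returns a PartiallyUnrolledIterator, which the caller can use to write the block to disk through DiskStore.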

private[storage] def putIteratorAsValues[T](
  blockId: BlockId,
  values: Iterator[T],
  classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long] = { // deserialized data can only be written to on-heap memory

  // the holder uses a size tracker to estimate the data size by sampling; its vector stores the unrolled data
  val valuesHolder = new DeserializedValuesHolder[T](classTag)

  putIterator(blockId, values, classTag, MemoryMode.ON_HEAP, valuesHolder) match {
    case Right(storedSize) => Right(storedSize)
    case Left(unrollMemoryUsedByThisBlock) =>
      // the data unrolled so far
      val unrolledIterator = if (valuesHolder.vector != null) {
        valuesHolder.vector.iterator
      } else {
        valuesHolder.arrayValues.toIterator
      }

      Left(new PartiallyUnrolledIterator(
        this,
        MemoryMode.ON_HEAP,
        unrollMemoryUsedByThisBlock, // unroll memory reserved for this block
        unrolled = unrolledIterator, // iterator over the values already in memory
        rest = values)) // the values not yet read
  }
}
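
The Either returned here lets the caller recover every record even when unrolling stops halfway: the values already in memory are chained in front of the untouched remainder of the original iterator. A minimal sketch of that idea follows; the unroll function and its element-count budget are hypothetical simplifications of the byte-based accounting in the real code.

```scala
object PartialUnrollSketch {
  // Unroll at most `budget` elements. Right(all values) when the iterator fits,
  // otherwise Left(an iterator over the unrolled values followed by the rest).
  def unroll[T](values: Iterator[T], budget: Int): Either[Iterator[T], Vector[T]] = {
    val unrolled = Vector.newBuilder[T]
    var count = 0
    while (values.hasNext && count < budget) {
      unrolled += values.next()
      count += 1
    }
    val taken = unrolled.result()
    if (values.hasNext) Left(taken.iterator ++ values) // nothing is lost on failure
    else Right(taken)
  }

  def main(args: Array[String]): Unit = {
    println(unroll(Iterator(1, 2, 3), budget = 5))
    unroll(Iterator(1, 2, 3, 4), budget = 2) match {
      case Left(all)  => println(s"partial unroll, full data still readable: ${all.toList}")
      case Right(all) => println(s"fully unrolled: $all")
    }
  }
}
```

This is why a failed in-memory put does not force the partition to be recomputed: the Left side can be handed straight to a disk writer or back to the task.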

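Writing in serialized form

putIteratorAsBytes is structured much like putIteratorAsValues. The difference is that the serialized form is stored as a SerializedMemoryEntry, and the holder is a SerializedValuesHolder, created with a chunk size and a memory mode, which serializes the data as it is written. On success the method returns the size written; on failure it returns a PartiallySerializedBlock, which the caller can use to write the block to disk through DiskStore.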

private[storage] def putIteratorAsBytes[T](
  blockId: BlockId,
  values: Iterator[T],
  classTag: ClassTag[T],
  memoryMode: MemoryMode): Either[PartiallySerializedBlock[T], Long] = {

  require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")

  // Initial per-task memory to request for unrolling blocks (bytes).
  val initialMemoryThreshold = unrollMemoryThreshold
  val chunkSize = if (initialMemoryThreshold > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
    logWarning(s"Initial memory threshold of ${Utils.bytesToString(initialMemoryThreshold)} " +
               s"is too large to be set as chunk size. Chunk size has been capped to " +
               s"${Utils.bytesToString(ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH)}")
    ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH
  } else {
    initialMemoryThreshold.toInt
  }

  // use the serialized values holder, which can allocate either on-heap or off-heap
  val valuesHolder = new SerializedValuesHolder[T](blockId, chunkSize, classTag,
                                                   memoryMode, serializerManager)

  putIterator(blockId, values, classTag, memoryMode, valuesHolder) match {
    case Right(storedSize) => Right(storedSize)
    case Left(unrollMemoryUsedByThisBlock) =>
    Left(new PartiallySerializedBlock(
      this,
      serializerManager,
      blockId,
      valuesHolder.serializationStream,
      valuesHolder.redirectableStream,
      unrollMemoryUsedByThisBlock,
      memoryMode,
      valuesHolder.bbos,
      values,
      classTag))
  }
}

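Analysis of putIterator

As shown above, both the serialized and the deserialized write paths call putIterator to do the actual writing. The method is long, but its logic is fairly simple: it writes the data into the ValuesHolder one record at a time and periodically checks memory; if memory is insufficient it requests more from MemoryManager, each time enough to raise the threshold to 1.5 times the current data size. Finally, the data in the ValuesHolder is moved into its final form (an array for deserialized data). One last key step is to release the unroll memory and acquire storage memory in its place. The main steps are:

Call reserveUnrollMemoryForThisTask() to request the initial unroll memory, and record how much unroll memory the block has used.
Iterate over the block's data and put each element into the ValuesHolder.
At each check point (every 16 elements by default), if the estimated size of the unrolled data has reached the current unroll memory threshold, call reserveUnrollMemoryForThisTask() again to request more unroll memory, and update the threshold.
Once all data has been unrolled with keepUnrolling still true, the unroll has succeeded, and the data in the ValuesHolder is packaged into a MemoryEntry.
If the precise size of the entry exceeds the unroll memory reserved so far, request the difference. Then release the block's unroll memory and acquire storage memory of the entry's size, both under the MemoryManager lock so that the transfer is atomic.
If all of the above succeeds, put the mapping from BlockId to MemoryEntry into entries and return Right. Note that the return type is Either, which in Scala represents one of two disjoint results: by convention a failure (Left) or a success (Right).
If unroll memory runs out, or keepUnrolling is false after all data has been unrolled, the write has failed and the method returns Left with the amount of unroll memory reserved for the block. The callers wrap it in a PartiallyUnrolledIterator or a PartiallySerializedBlock, which represents a block that was not fully unrolled.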

 private def putIterator[T](
      blockId: BlockId,
      values: Iterator[T], // 需要写入的数据
      classTag: ClassTag[T],
      memoryMode: MemoryMode,
      valuesHolder: ValuesHolder[T]): Either[Long, Long] = {
    require(!contains(blockId), s"Block $blockId is already present in the MemoryStore")
 
    // number of elements unrolled so far
    var elementsUnrolled = 0
    // whether there is still enough memory to keep unrolling this block
    var keepUnrolling = true
    // initial memory requested before unrolling any block; configurable via spark.storage.unrollMemoryThreshold (default 1 MB)
    val initialMemoryThreshold = unrollMemoryThreshold
    // how often to check whether memory is sufficient; default 16, i.e. once every 16 elements
    val memoryCheckPeriod = conf.get(UNROLL_MEMORY_CHECK_PERIOD)
    // memory currently reserved by this task for unrolling this block
    var memoryThreshold = initialMemoryThreshold
    // growth factor applied when unroll memory runs short; default 1.5
    val memoryGrowthFactor = conf.get(UNROLL_MEMORY_GROWTH_FACTOR)
    // running total of unroll memory used by this block
    var unrollMemoryUsedByThisBlock = 0L

    // request enough memory to start unrolling: unrollMemoryThreshold, 1 MB by default
    keepUnrolling =
      reserveUnrollMemoryForThisTask(blockId, initialMemoryThreshold, memoryMode)

    if (!keepUnrolling) { // the initial memory could not be reserved; log a warning
      logWarning(s"Failed to reserve initial memory threshold of " +
        s"${Utils.bytesToString(initialMemoryThreshold)} for computing block $blockId in memory.")
    } else { // add the reserved memory to the block's unroll memory counter
      unrollMemoryUsedByThisBlock += initialMemoryThreshold
    }

    while (values.hasNext && keepUnrolling) { // more elements remain and memory has been sufficient so far
      valuesHolder.storeValue(values.next()) // store the next element in the holder
      if (elementsUnrolled % memoryCheckPeriod == 0) { // time for a periodic memory check
        val currentSize = valuesHolder.estimatedSize() // estimated size of the data unrolled so far
        // If our vector's size has exceeded the threshold, request more memory
        if (currentSize >= memoryThreshold) { // the unrolled data has reached the memory reserved for it
          // how much additional memory to request
          val amountToRequest = (currentSize * memoryGrowthFactor - memoryThreshold).toLong
          // try to reserve more memory
          keepUnrolling =
            reserveUnrollMemoryForThisTask(blockId, amountToRequest, memoryMode)
          if (keepUnrolling) { // on success, add it to the counter; on failure the loop exits at the next iteration
            unrollMemoryUsedByThisBlock += amountToRequest
          }
          // New threshold is currentSize * memoryGrowthFactor
          memoryThreshold += amountToRequest
        }
      }
      // one more element has been unrolled
      elementsUnrolled += 1
    }

    // unrolling copies data that is not contiguous in memory (e.g. an iterator over data read from a file) into contiguous storage memory
    if (keepUnrolling) { // every unroll memory request so far has succeeded
      val entryBuilder = valuesHolder.getBuilder() // builder for the block's MemoryEntry
      val size = entryBuilder.preciseSize // precise size of the entry
      if (size > unrollMemoryUsedByThisBlock) { // reserved memory is not enough; request the difference
        val amountToRequest = size - unrollMemoryUsedByThisBlock
        keepUnrolling = reserveUnrollMemoryForThisTask(blockId, amountToRequest, memoryMode)
        if (keepUnrolling) { // 申请成功
          unrollMemoryUsedByThisBlock += amountToRequest
        }
      }

      if (keepUnrolling) {
        val entry = entryBuilder.build()
        // Synchronize so that transfer is atomic
        memoryManager.synchronized { // convert the block's unroll memory into storage memory
          releaseUnrollMemoryForThisTask(memoryMode, unrollMemoryUsedByThisBlock) // release the unroll memory first
          val success = memoryManager.acquireStorageMemory(blockId, entry.size, memoryMode) // then acquire storage memory of the actual size
          assert(success, "transferring unroll memory to storage memory failed")
        }

        entries.synchronized { // add the mapping to entries
          entries.put(blockId, entry)
        }

        logInfo("Block %s stored as values in memory (estimated size %s, free %s)".format(blockId,
          Utils.bytesToString(entry.size), Utils.bytesToString(maxMemory - blocksMemoryUsed)))
        Right(entry.size)
      } else { // fully unrolled, but the estimate was below the precise size and the extra memory could not be reserved
        logUnrollFailureMessage(blockId, entryBuilder.preciseSize)
        Left(unrollMemoryUsedByThisBlock)
      }
    } else { // not enough memory to finish unrolling; the block is only partially unrolled
      logUnrollFailureMessage(blockId, valuesHolder.estimatedSize())
      Left(unrollMemoryUsedByThisBlock)
    }
  }
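
The growth arithmetic in the loop is easier to follow with numbers. The sketch below replays only the threshold bookkeeping, under two simplifying assumptions that are not in the original: every element has the same size, and every memory request is granted. The names simulate and Outcome are hypothetical.

```scala
object UnrollGrowthSketch {
  // requests: each amountToRequest issued during the loop
  // threshold: the final memoryThreshold
  // finalTopUp: the extra request needed when the final size exceeds what was reserved
  final case class Outcome(requests: List[Long], threshold: Long, finalTopUp: Long)

  def simulate(elementSize: Long, count: Int, initialThreshold: Long,
               growthFactor: Double, checkPeriod: Int): Outcome = {
    var currentSize = 0L
    var threshold = initialThreshold
    var reserved = initialThreshold
    val requests = List.newBuilder[Long]
    for (elementsUnrolled <- 0 until count) {
      currentSize += elementSize // the element is stored before the check
      if (elementsUnrolled % checkPeriod == 0 && currentSize >= threshold) {
        val amountToRequest = (currentSize * growthFactor - threshold).toLong
        requests += amountToRequest
        reserved += amountToRequest
        threshold += amountToRequest // the new threshold is currentSize * growthFactor
      }
    }
    Outcome(requests.result(), threshold, math.max(0L, currentSize - reserved))
  }

  def main(args: Array[String]): Unit =
    println(simulate(elementSize = 100L, count = 64, initialThreshold = 1000L,
      growthFactor = 1.5, checkPeriod = 16))
}
```

With 64 elements of 100 bytes and a 1000-byte initial threshold, the checks at elements 0 and 48 request nothing, the check at element 16 (1700 bytes stored) requests 1550 bytes, and the check at element 32 (3300 bytes stored) requests 2400 bytes. The data ends at 6400 bytes against 4950 reserved, so the final size check has to request the remaining 1450 bytes. Because checks are periodic, the data can overshoot the threshold between checks, which is exactly why that last comparison against the precise size exists.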

Reading block data

getBytes reads the data of a SerializedMemoryEntry; getValues reads the data of a DeserializedMemoryEntry.

/** Read the data of a serialized block. */
def getBytes(blockId: BlockId): Option[ChunkedByteBuffer] = {
  val entry = entries.synchronized { entries.get(blockId) }
  entry match {
    case null => None
    case e: DeserializedMemoryEntry[_] =>
      throw new IllegalArgumentException("should only call getBytes on serialized blocks")
    case SerializedMemoryEntry(bytes, _, _) => Some(bytes)
  }
}

// Read the block for blockId from memory, wrapped as an Iterator.
def getValues(blockId: BlockId): Option[Iterator[_]] = {
  val entry = entries.synchronized { entries.get(blockId) }
  entry match {
    case null => None
    case e: SerializedMemoryEntry[_] =>
      throw new IllegalArgumentException("should only call getValues on deserialized blocks")
    case DeserializedMemoryEntry(values, _, _) =>
      val x = Some(values)
      x.map(_.iterator)
  }
}

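Evicting cached blocks

Cached blocks may need to be evicted when either storage memory or execution memory runs short. The flow is as follows:

Iterate over the blocks in the entries map and pick those that can be evicted. A block is evictable if (1) it uses the same memory mode as the request, and (2) the block being stored is not an RDD block, or the candidate does not belong to the same RDD as the block being stored.
Take the write lock on each candidate, which guarantees that a block currently being read is not evicted, and record the IDs of the blocks to evict.
If the space that can be freed reaches the target, call the nested dropBlock() method to actually remove these blocks, which in the end calls BlockManager.dropFromMemory(). That method has two possible outcomes: either the block still exists and only its StorageLevel has changed (for example it was moved to disk), in which case only its write lock is released; or the block is gone entirely, in which case BlockInfoManager.removeBlock() is called to delete it. Finally, any selected blocks that were not processed are unlocked.
If the space that can be freed does not reach the target, no eviction takes place and the new block is not stored.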

  private[spark] def evictBlocksToFreeSpace(
      blockId: Option[BlockId],
      space: Long,
      memoryMode: MemoryMode): Long = {
    assert(space > 0)
    memoryManager.synchronized {
      var freedMemory = 0L
      val rddToAdd = blockId.flatMap(getRddId)
      val selectedBlocks = new ArrayBuffer[BlockId]
      // A block is evictable if (1) it uses the requested memory mode and (2) the block being stored is not an RDD block, or the candidate belongs to a different RDD.
      def blockIsEvictable(blockId: BlockId, entry: MemoryEntry[_]): Boolean = {
        entry.memoryMode == memoryMode && (rddToAdd.isEmpty || rddToAdd != getRddId(blockId))
      }
      entries.synchronized {
        // entries is kept in access order, so iteration starts from the least recently used block
        val iterator = entries.entrySet().iterator()
        while (freedMemory < space && iterator.hasNext) {
          val pair = iterator.next()
          val blockId = pair.getKey
          val entry = pair.getValue
          if (blockIsEvictable(blockId, entry)) { // check the eviction condition
            if (blockInfoManager.lockForWriting(blockId, blocking = false).isDefined) { // take the block's write lock without blocking
              selectedBlocks += blockId // block selected for eviction
              freedMemory += pair.getValue.size // memory this block occupies
            }
          }
        }
      }

      // drop a single block
      def dropBlock[T](blockId: BlockId, entry: MemoryEntry[T]): Unit = {
        val data = entry match {
          case DeserializedMemoryEntry(values, _, _) => Left(values)
          case SerializedMemoryEntry(buffer, _, _) => Right(buffer)
        }
        // this handler is implemented by BlockManager
        val newEffectiveStorageLevel =
          blockEvictionHandler.dropFromMemory(blockId, () => data)(entry.classTag)
        if (newEffectiveStorageLevel.isValid) {
          blockInfoManager.unlock(blockId) // the block still exists (e.g. on disk): just release the write lock
        } else {
          blockInfoManager.removeBlock(blockId) // the block is gone entirely: remove its metadata
        }
      }

      if (freedMemory >= space) { // the selected blocks free at least the requested space
        var lastSuccessfulBlock = -1
        try {
          logInfo(s"${selectedBlocks.size} blocks selected for dropping " +
            s"(${Utils.bytesToString(freedMemory)} bytes)")
          (0 until selectedBlocks.size).foreach { idx => // drop them one by one
            val blockId = selectedBlocks(idx)
            val entry = entries.synchronized {
              entries.get(blockId)
            } 
            if (entry != null) {
              dropBlock(blockId, entry)
              afterDropAction(blockId)
            }
            lastSuccessfulBlock = idx
          }
          logInfo(s"After dropping ${selectedBlocks.size} blocks, " +
            s"free memory is ${Utils.bytesToString(maxMemory - blocksMemoryUsed)}")
          freedMemory
        } finally { 
          if (lastSuccessfulBlock != selectedBlocks.size - 1) { 
            (lastSuccessfulBlock + 1 until selectedBlocks.size).foreach { idx =>
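              val blockId = selectedBlocks(idx) // blocks that were not dropped must have their write locks released
              blockInfoManager.unlock(blockId)
            }
          }
        }
      } else { // the requested space cannot be freed
        blockId.foreach { id =>
          logInfo(s"Will not store $id")
        }
        // release the write locks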
        selectedBlocks.foreach { id =>
          blockInfoManager.unlock(id)
        }
        0L
      }
    }
  }
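
The selection phase of the method can be reproduced in isolation. The sketch below keeps only the all-or-nothing choice of victims in LRU order; locking, memory modes and the actual drop are left out, and Block, selectVictims and the block names are hypothetical.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap}
import scala.collection.mutable.ArrayBuffer

object EvictionSketch {
  // rddId is None for non-RDD blocks (for example broadcast blocks).
  final case class Block(rddId: Option[Int], size: Long)

  // Walk the map from least to most recently used and pick victims until `space`
  // bytes are covered. Blocks of the incoming block's own RDD are never chosen.
  // If the target cannot be reached, nothing is evicted at all.
  def selectVictims(entries: JLinkedHashMap[String, Block],
                    incomingRdd: Option[Int],
                    space: Long): List[String] = {
    val selected = ArrayBuffer.empty[String]
    var freed = 0L
    val it = entries.entrySet().iterator()
    while (freed < space && it.hasNext) {
      val pair = it.next()
      val evictable = incomingRdd.isEmpty || incomingRdd != pair.getValue.rddId
      if (evictable) {
        selected += pair.getKey
        freed += pair.getValue.size
      }
    }
    if (freed >= space) selected.toList else Nil
  }

  def sampleEntries(): JLinkedHashMap[String, Block] = {
    val entries = new JLinkedHashMap[String, Block](32, 0.75f, true)
    entries.put("rdd_0_0", Block(Some(0), 100L))
    entries.put("rdd_1_0", Block(Some(1), 200L))
    entries.put("rdd_0_1", Block(Some(0), 300L))
    entries.put("broadcast_7", Block(None, 50L))
    entries
  }

  def main(args: Array[String]): Unit = {
    println(selectVictims(sampleEntries(), incomingRdd = Some(0), space = 220L))
    println(selectVictims(sampleEntries(), incomingRdd = Some(0), space = 1000L))
  }
}
```

Skipping blocks of the same RDD avoids a cycle in which caching one partition of an RDD keeps pushing out other partitions of that same RDD.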

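Reserving and returning memory

Reserving and returning memory is simple: the methods acquire or release the memory and then update onHeapUnrollMemoryMap/offHeapUnrollMemoryMap, which record how much unroll memory each task attempt ID is using.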

def reserveUnrollMemoryForThisTask(
  blockId: BlockId,
  memory: Long,
  memoryMode: MemoryMode): Boolean = {
  memoryManager.synchronized {
    // acquire unroll memory of the given memoryMode
    val success = memoryManager.acquireUnrollMemory(blockId, memory, memoryMode)
    if (success) {
      val taskAttemptId = currentTaskAttemptId()
      val unrollMemoryMap = memoryMode match {
        case MemoryMode.ON_HEAP => onHeapUnrollMemoryMap
        case MemoryMode.OFF_HEAP => offHeapUnrollMemoryMap
      }
      // record how much unroll memory the current task attempt now holds
      unrollMemoryMap(taskAttemptId) = unrollMemoryMap.getOrElse(taskAttemptId, 0L) + memory
    }
    success
  }
}

def releaseUnrollMemoryForThisTask(memoryMode: MemoryMode, memory: Long = Long.MaxValue): Unit = {
  val taskAttemptId = currentTaskAttemptId()
  memoryManager.synchronized { // release the memory
    val unrollMemoryMap = memoryMode match {
      case MemoryMode.ON_HEAP => onHeapUnrollMemoryMap
      case MemoryMode.OFF_HEAP => offHeapUnrollMemoryMap
    }
    if (unrollMemoryMap.contains(taskAttemptId)) {
      val memoryToRelease = math.min(memory, unrollMemoryMap(taskAttemptId))
      if (memoryToRelease > 0) {
        unrollMemoryMap(taskAttemptId) -= memoryToRelease
        memoryManager.releaseUnrollMemory(memoryToRelease, memoryMode)
      }
      if (unrollMemoryMap(taskAttemptId) == 0) {
        unrollMemoryMap.remove(taskAttemptId)
      }
    }
  }
}
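
The per-task bookkeeping can be tried out with a small standalone model. Tracker is a hypothetical class: a single free counter stands in for MemoryManager, and the task ID is passed explicitly instead of being read from the current task context.

```scala
import scala.collection.mutable

object UnrollBookkeepingSketch {
  final class Tracker(initialFree: Long) {
    private var freeBytes = initialFree
    private val unrollByTask = mutable.HashMap[Long, Long]()

    def free: Long = synchronized { freeBytes }
    def held(taskId: Long): Long = synchronized { unrollByTask.getOrElse(taskId, 0L) }

    // Reserve `memory` bytes for the task; the map is only updated when the grant succeeds.
    def reserve(taskId: Long, memory: Long): Boolean = synchronized {
      val success = memory <= freeBytes
      if (success) {
        freeBytes -= memory
        unrollByTask(taskId) = unrollByTask.getOrElse(taskId, 0L) + memory
      }
      success
    }

    // Release up to `memory` bytes; the default releases everything the task holds.
    def release(taskId: Long, memory: Long = Long.MaxValue): Unit = synchronized {
      unrollByTask.get(taskId).foreach { current =>
        val toRelease = math.min(memory, current)
        freeBytes += toRelease
        if (current == toRelease) unrollByTask.remove(taskId)
        else unrollByTask(taskId) = current - toRelease
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val tracker = new Tracker(100L)
    println(tracker.reserve(1L, 60L)) // true
    println(tracker.reserve(1L, 50L)) // false, only 40 bytes are left
    tracker.release(1L, 20L)
    println((tracker.held(1L), tracker.free)) // (40,60)
    tracker.release(1L)
    println((tracker.held(1L), tracker.free)) // (0,100)
  }
}
```

Capping the release at what the task actually holds is what makes the Long.MaxValue default safe: it means "release everything" without ever returning more memory than was reserved.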
