Spark Tuning Guide (4): Memory Management

潇洒-人生
Categories: Big Data, Spark

Article link: https://blog.csdn.net/qq_35744460/article/details/90341816


From the official documentation

Memory Management Overview

Memory usage in Spark largely falls under one of two categories: execution and storage. Execution memory refers to that used for computation in shuffles, joins, sorts and aggregations, while storage memory refers to that used for caching and propagating internal data across the cluster. In Spark, execution and storage share a unified region (M). When no execution memory is used, storage can acquire all the available memory and vice versa. Execution may evict storage if necessary, but only until total storage memory usage falls under a certain threshold (R). In other words, R describes a subregion within M where cached blocks are never evicted. Storage may not evict execution due to complexities in implementation.

This design ensures several desirable properties. First, applications that do not use caching can use the entire space for execution, obviating unnecessary disk spills. Second, applications that do use caching can reserve a minimum storage space (R) where their data blocks are immune to being evicted. Lastly, this approach provides reasonable out-of-the-box performance for a variety of workloads without requiring user expertise of how memory is divided internally.

Although there are two relevant configurations, the typical user should not need to adjust them as the default values are applicable to most workloads:

spark.memory.fraction expresses the size of M as a fraction of (JVM heap space - 300MB) (default 0.6). The rest of the space (40%) is reserved for user data structures, internal metadata in Spark, and safeguarding against OOM errors in the case of sparse and unusually large records.

spark.memory.storageFraction expresses the size of R as a fraction of M (default 0.5). R is the storage space within M where cached blocks are immune to being evicted by execution.

The value of spark.memory.fraction should be set so that this amount of heap space fits comfortably within the JVM's old or "tenured" generation. See the discussion of advanced GC tuning in the official tuning guide for details.
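To make the two fractions concrete, here is a minimal sketch that splits a heap into the reserved 300MB, the user area, M, and R. It only reproduces the arithmetic quoted above; `HeapBreakdownSketch`, `Breakdown`, and the 4 GiB heap are names and values made up for illustration and are not part of Spark.

```scala
object HeapBreakdownSketch {
  private val MiB: Long = 1024L * 1024
  // Fixed reservation quoted in the documentation text above.
  val ReservedBytes: Long = 300 * MiB

  /** Illustrative container, not a Spark type. */
  final case class Breakdown(reserved: Long, user: Long, unified: Long, protectedStorage: Long)

  def breakdown(
      heapBytes: Long,
      memoryFraction: Double = 0.6,   // spark.memory.fraction default
      storageFraction: Double = 0.5   // spark.memory.storageFraction default
  ): Breakdown = {
    val usable = heapBytes - ReservedBytes
    val unified = (usable * memoryFraction).toLong             // M
    val protectedStorage = (unified * storageFraction).toLong  // R
    Breakdown(ReservedBytes, usable - unified, unified, protectedStorage)
  }

  def toMiB(bytes: Long): Long = bytes / MiB

  def main(args: Array[String]): Unit = {
    val b = breakdown(4096 * MiB) // hypothetical 4 GiB heap
    println(
      s"reserved=${toMiB(b.reserved)} MiB user=${toMiB(b.user)} MiB " +
        s"M=${toMiB(b.unified)} MiB R=${toMiB(b.protectedStorage)} MiB")
  }
}
```

In a real JVM the heap size Spark sees comes from `Runtime.getRuntime.maxMemory`, which is usually somewhat smaller than the configured `-Xmx`, so actual numbers will be slightly lower than this sketch suggests.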

MemoryManager

/**
 * An abstract memory manager that enforces how memory is shared between execution and storage.
 *
 * In this context, execution memory refers to that used for computation in shuffles, joins,
 * sorts and aggregations, while storage memory refers to that used for caching and propagating
 * internal data across the cluster. There exists one MemoryManager per JVM.
 */
private[spark] abstract class MemoryManager(
    conf: SparkConf,
    numCores: Int,
    onHeapStorageMemory: Long,
    onHeapExecutionMemory: Long) extends Logging {

  /**
   * Total available on heap memory for storage, in bytes. This amount can vary over time,
   * depending on the MemoryManager implementation.
   * In this model, this is equivalent to the amount of memory not occupied by execution.
   */
  def maxOnHeapStorageMemory: Long

  /**
   * Total available off heap memory for storage, in bytes. This amount can vary over time,
   * depending on the MemoryManager implementation.
   */
  def maxOffHeapStorageMemory: Long

  /**
   * Execution memory currently in use, in bytes.
   */
  final def executionMemoryUsed: Long = synchronized {
    onHeapExecutionMemoryPool.memoryUsed + offHeapExecutionMemoryPool.memoryUsed
  }

  /**
   * Storage memory currently in use, in bytes.
   */
  final def storageMemoryUsed: Long = synchronized {
    onHeapStorageMemoryPool.memoryUsed + offHeapStorageMemoryPool.memoryUsed
  }

StaticMemoryManager

/**
 * A [[MemoryManager]] that statically partitions the heap space into disjoint regions.
 *
 * The sizes of the execution and storage regions are determined through
 * `spark.shuffle.memoryFraction` and `spark.storage.memoryFraction` respectively. The two
 * regions are cleanly separated such that neither usage can borrow memory from the other.
 */
private[spark] class StaticMemoryManager(
    conf: SparkConf,
    maxOnHeapExecutionMemory: Long,
    override val maxOnHeapStorageMemory: Long,
    numCores: Int)
  extends MemoryManager(
    conf,
    numCores,
    maxOnHeapStorageMemory,
    maxOnHeapExecutionMemory) {
  // The StaticMemoryManager does not support off-heap storage memory.

  /**
   * Return the total amount of memory available for the storage region, in bytes.
   */
  private def getMaxStorageMemory(conf: SparkConf): Long = {
    // Worked example: assume systemMaxMemory is 10 GB.
    val systemMaxMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
    val memoryFraction = conf.getDouble("spark.storage.memoryFraction", 0.6)
    val safetyFraction = conf.getDouble("spark.storage.safetyFraction", 0.9)
    // 10 GB * 0.6 * 0.9 = 5.4 GB, so storage is capped at 5.4 GB.
    (systemMaxMemory * memoryFraction * safetyFraction).toLong
  }

  private val MIN_MEMORY_BYTES = 32 * 1024 * 1024 // 32 MB

  /**
   * Return the total amount of memory available for the execution region, in bytes.
   */
  // With the same 10 GB example, execution is capped at 1.6 GB.
  private def getMaxExecutionMemory(conf: SparkConf): Long = {
    val systemMaxMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)

    if (systemMaxMemory < MIN_MEMORY_BYTES) {
      throw new IllegalArgumentException(s"System memory $systemMaxMemory must " +
        s"be at least $MIN_MEMORY_BYTES. Please increase heap size using the --driver-memory " +
        s"option or spark.driver.memory in Spark configuration.")
    }
    if (conf.contains("spark.executor.memory")) {
      val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
      if (executorMemory < MIN_MEMORY_BYTES) {
        throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
          s"$MIN_MEMORY_BYTES. Please increase executor memory using the " +
          s"--executor-memory option or spark.executor.memory in Spark configuration.")
      }
    }
    val memoryFraction = conf.getDouble("spark.shuffle.memoryFraction", 0.2)
    val safetyFraction = conf.getDouble("spark.shuffle.safetyFraction", 0.8)
    // 10 GB * 0.2 * 0.8 = 1.6 GB
    (systemMaxMemory * memoryFraction * safetyFraction).toLong
  }
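The two static limits above can be reproduced with a few lines of arithmetic. The sketch below uses the same 10 GB example and the default fractions from the listing; `StaticRegionSketch` and `regionBytes` are illustrative names, not Spark APIs.

```scala
object StaticRegionSketch {
  private val GiB: Double = 1024.0 * 1024 * 1024

  /** Same formula as getMaxStorageMemory / getMaxExecutionMemory in the listing above. */
  def regionBytes(systemMaxMemory: Long, memoryFraction: Double, safetyFraction: Double): Long =
    (systemMaxMemory * memoryFraction * safetyFraction).toLong

  def toGiB(bytes: Long): Double = bytes / GiB

  def main(args: Array[String]): Unit = {
    val heap = 10L * 1024 * 1024 * 1024           // the 10 GB example heap
    val storage = regionBytes(heap, 0.6, 0.9)     // spark.storage.* defaults
    val execution = regionBytes(heap, 0.2, 0.8)   // spark.shuffle.* defaults
    println(f"storage   = ${toGiB(storage)}%.2f GiB")
    println(f"execution = ${toGiB(execution)}%.2f GiB")
    // Whatever is left is outside both statically sized regions.
    println(f"remaining = ${toGiB(heap - storage - execution)}%.2f GiB")
  }
}
```

Because the regions are fixed, the roughly 3 GB that remains in this example can be used by neither region, and an idle storage region cannot help a shuffle that is spilling. That rigidity is what the unified manager in the next section addresses.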

UnifiedMemoryManager


/**
 * A [[MemoryManager]] that enforces a soft boundary between execution and storage such that
 * either side can borrow memory from the other.
 *
 * The region shared between execution and storage is a fraction of (the total heap space - 300MB)
 * configurable through `spark.memory.fraction` (default 0.6). The position of the boundary
 * within this space is further determined by `spark.memory.storageFraction` (default 0.5).
 * This means the size of the storage region is 0.6 * 0.5 = 0.3 of the heap space by default.
 *
 * Storage can borrow as much execution memory as is free until execution reclaims its space.
 * When this happens, cached blocks will be evicted from memory until sufficient borrowed
 * memory is released to satisfy the execution memory request.
 *
 * Similarly, execution can borrow as much storage memory as is free. However, execution
 * memory is *never* evicted by storage due to the complexities involved in implementing this.
 * The implication is that attempts to cache blocks may fail if execution has already eaten
 * up most of the storage space, in which case the new blocks will be evicted immediately
 * according to their respective storage levels.
 *
 * @param onHeapStorageRegionSize Size of the storage region, in bytes.
 *                          This region is not statically reserved; execution can borrow from
 *                          it if necessary. Cached blocks can be evicted only if actual
 *                          storage memory usage exceeds this region.
 */
private[spark] class UnifiedMemoryManager private[memory] (
    conf: SparkConf,
    val maxHeapMemory: Long,
    onHeapStorageRegionSize: Long,
    numCores: Int)
  extends MemoryManager(
    conf,
    numCores,
    onHeapStorageRegionSize,
    maxHeapMemory - onHeapStorageRegionSize) {
  /**
   * Try to acquire up to `numBytes` of execution memory for the current task and return the
   * number of bytes obtained, or 0 if none can be allocated.
   *
   * This call may block until there is enough free memory in some situations, to make sure each
   * task has a chance to ramp up to at least 1 / 2N of the total memory pool (where N is the # of
   * active tasks) before it is forced to spill. This can happen if the number of tasks increase
   * but an older task had a lot of memory already.
   */
  override private[memory] def acquireExecutionMemory(
      numBytes: Long,
      taskAttemptId: Long,
      memoryMode: MemoryMode): Long = synchronized {
    assertInvariants()
    assert(numBytes >= 0)
    val (executionPool, storagePool, storageRegionSize, maxMemory) = memoryMode match {
      case MemoryMode.ON_HEAP => (
        onHeapExecutionMemoryPool,
        onHeapStorageMemoryPool,
        onHeapStorageRegionSize,
        maxHeapMemory)
      case MemoryMode.OFF_HEAP => (
        offHeapExecutionMemoryPool,
        offHeapStorageMemoryPool,
        offHeapStorageMemory,
        maxOffHeapMemory)
    }

    /**
     * Grow the execution pool by evicting cached blocks, thereby shrinking the storage pool.
     *
     * When acquiring memory for a task, the execution pool may need to make multiple
     * attempts. Each attempt must be able to evict storage in case another task jumps in
     * and caches a large block between the attempts. This is called once per attempt.
     */
    def maybeGrowExecutionPool(extraMemoryNeeded: Long): Unit = {
      if (extraMemoryNeeded > 0) {
        // There is not enough free memory in the execution pool, so try to reclaim memory from
        // storage. We can reclaim any free memory from the storage pool. If the storage pool
        // has grown to become larger than `storageRegionSize`, we can evict blocks and reclaim
        // the memory that storage has borrowed from execution.
        val memoryReclaimableFromStorage = math.max(
          storagePool.memoryFree,
          storagePool.poolSize - storageRegionSize)
        if (memoryReclaimableFromStorage > 0) {
          // Only reclaim as much space as is necessary and available:
          val spaceToReclaim = storagePool.freeSpaceToShrinkPool(
            math.min(extraMemoryNeeded, memoryReclaimableFromStorage))
          storagePool.decrementPoolSize(spaceToReclaim)
          executionPool.incrementPoolSize(spaceToReclaim)
        }
      }
    }
    /**
     * The size the execution pool would have after evicting storage memory.
     *
     * The execution memory pool divides this quantity among the active tasks evenly to cap
     * the execution memory allocation for each task. It is important to keep this greater
     * than the execution pool size, which doesn't take into account potential memory that
     * could be freed by evicting storage. Otherwise we may hit SPARK-12155.
     *
     * Additionally, this quantity should be kept below `maxMemory` to arbitrate fairness
     * in execution memory allocation across tasks. Otherwise, a task may occupy more than
     * its fair share of execution memory, mistakenly thinking that other tasks can acquire
     * the portion of storage memory that cannot be evicted.
     */
    def computeMaxExecutionPoolSize(): Long = {
      maxMemory - math.min(storagePool.memoryUsed, storageRegionSize)
    }

    executionPool.acquireMemory(
      numBytes, taskAttemptId, maybeGrowExecutionPool, computeMaxExecutionPoolSize)
  }
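The borrowing rule in `maybeGrowExecutionPool` is easier to follow with numbers. The sketch below is a toy model that mirrors only the arithmetic of the listing: `BorrowSketch`, `Pool`, and `growExecution` are made-up names, and it assumes that evicting cached blocks always frees exactly the requested amount, which the real `freeSpaceToShrinkPool` does not guarantee.

```scala
object BorrowSketch {
  /** Toy pool: poolSize is the current capacity, memoryUsed is what is occupied. */
  final case class Pool(poolSize: Long, memoryUsed: Long) {
    def memoryFree: Long = poolSize - memoryUsed
  }

  /** Returns (execution, storage) after moving capacity from storage to execution. */
  def growExecution(
      execution: Pool,
      storage: Pool,
      storageRegionSize: Long,
      extraMemoryNeeded: Long): (Pool, Pool) = {
    // Reclaimable: free storage memory, or whatever storage holds beyond its region R.
    val reclaimable = math.max(storage.memoryFree, storage.poolSize - storageRegionSize)
    if (extraMemoryNeeded <= 0 || reclaimable <= 0) {
      (execution, storage)
    } else {
      val spaceToReclaim = math.min(extraMemoryNeeded, reclaimable)
      // Free space is handed over first; the rest requires evicting cached blocks.
      // Assumption: eviction always succeeds in this toy model.
      val evicted = math.max(0L, spaceToReclaim - storage.memoryFree)
      val newStorage = Pool(storage.poolSize - spaceToReclaim, storage.memoryUsed - evicted)
      val newExecution = execution.copy(poolSize = execution.poolSize + spaceToReclaim)
      (newExecution, newStorage)
    }
  }

  /** Same formula as computeMaxExecutionPoolSize in the listing above. */
  def maxExecutionPoolSize(maxMemory: Long, storage: Pool, storageRegionSize: Long): Long =
    maxMemory - math.min(storage.memoryUsed, storageRegionSize)

  def main(args: Array[String]): Unit = {
    // M = 1000 units, R = 500 units; storage has borrowed 200 units from execution.
    val (execution, storage) = growExecution(Pool(300, 300), Pool(700, 650), 500, 150)
    println(s"execution=$execution storage=$storage")
  }
}
```

With these numbers, storage gives up its 50 free units and evicts 100 more. If storage already sits at or below R with no free space, nothing is reclaimable and both pools are returned unchanged, which is the "cached blocks inside R are never evicted" guarantee from the overview.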



  // Set aside a fixed amount of memory for non-storage, non-execution purposes.
  // This serves a function similar to `spark.memory.fraction`, but guarantees that we reserve
  // sufficient memory for the system even for small heaps. E.g. if we have a 1GB JVM, then
  // the memory used for execution and storage will be (1024 - 300) * 0.6 = 434MB by default.
  // Reserved system memory: 300 MB.
  private val RESERVED_SYSTEM_MEMORY_BYTES = 300 * 1024 * 1024

  /**
   * Return the total amount of memory shared between execution and storage, in bytes.
   */
  private def getMaxMemory(conf: SparkConf): Long = {
    // Worked example: systemMemory = 10 GB.
    val systemMemory = conf.getLong("spark.testing.memory", Runtime.getRuntime.maxMemory)
    // reservedMemory = 300 MB by default.
    val reservedMemory = conf.getLong("spark.testing.reservedMemory",
      if (conf.contains("spark.testing")) 0 else RESERVED_SYSTEM_MEMORY_BYTES)
    // minSystemMemory = 300 MB * 1.5 = 450 MB.
    val minSystemMemory = (reservedMemory * 1.5).ceil.toLong
    if (systemMemory < minSystemMemory) {
      throw new IllegalArgumentException(s"System memory $systemMemory must " +
        s"be at least $minSystemMemory. Please increase heap size using the --driver-memory " +
        s"option or spark.driver.memory in Spark configuration.")
    }
    // SPARK-12759 Check executor memory to fail fast if memory is insufficient
    if (conf.contains("spark.executor.memory")) {
      val executorMemory = conf.getSizeAsBytes("spark.executor.memory")
      if (executorMemory < minSystemMemory) {
        throw new IllegalArgumentException(s"Executor memory $executorMemory must be at least " +
          s"$minSystemMemory. Please increase executor memory using the " +
          s"--executor-memory option or spark.executor.memory in Spark configuration.")
      }
    }
    // usableMemory = 10 GB - 300 MB
    val usableMemory = systemMemory - reservedMemory
    // Result: (10 GB - 300 MB) * 0.6
    val memoryFraction = conf.getDouble("spark.memory.fraction", 0.6)
    (usableMemory * memoryFraction).toLong
  }

  def apply(conf: SparkConf, numCores: Int): UnifiedMemoryManager = {
    // maxMemory = (10 GB - 300 MB) * 0.6
    val maxMemory = getMaxMemory(conf)
    new UnifiedMemoryManager(
      conf,
      maxHeapMemory = maxMemory,
      // Storage region R = (10 GB - 300 MB) * 0.6 * 0.5
      onHeapStorageRegionSize =
        (maxMemory * conf.getDouble("spark.memory.storageFraction", 0.5)).toLong,
      numCores = numCores)
  }
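Putting `getMaxMemory` and `apply` together, the following sketch computes both values for a given heap and includes the 450 MB fail-fast check. It is a simplified stand-in under stated assumptions: `MaxMemorySketch` and `sizes` are hypothetical names, the testing overrides are left out, and an `Either` replaces the exception so the behavior is easy to inspect.

```scala
object MaxMemorySketch {
  private val MiB: Long = 1024L * 1024
  val ReservedBytes: Long = 300 * MiB

  /** Right((maxMemory, storageRegionSize)) in bytes, or Left(message) if the heap is too small. */
  def sizes(
      systemMemory: Long,
      memoryFraction: Double = 0.6,   // spark.memory.fraction default
      storageFraction: Double = 0.5   // spark.memory.storageFraction default
  ): Either[String, (Long, Long)] = {
    val minSystemMemory = (ReservedBytes * 1.5).ceil.toLong // 450 MB
    if (systemMemory < minSystemMemory) {
      Left(s"System memory $systemMemory must be at least $minSystemMemory.")
    } else {
      val maxMemory = ((systemMemory - ReservedBytes) * memoryFraction).toLong
      val storageRegionSize = (maxMemory * storageFraction).toLong
      Right((maxMemory, storageRegionSize))
    }
  }

  def main(args: Array[String]): Unit = {
    sizes(10240 * MiB) match { // the 10 GB example heap
      case Right((m, r)) => println(s"M = ${m / MiB} MiB, R = ${r / MiB} MiB")
      case Left(message) => println(message)
    }
    println(sizes(400 * MiB)) // below the 450 MB minimum
  }
}
```

For the 10 GB example this gives roughly 5964 MiB for M and 2982 MiB for R, compared with the fixed 5.4 GB storage and 1.6 GB execution regions of the static manager.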

Spark unified memory management