Spark MemoryManager
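1.MemoryManager interface
1.1.Overview
In Spark, MemoryManager (an abstract class in the Spark source, referred to here as an interface) defines the common methods for unified management and allocation of Storage memory and Execution memory. It covers both on-heap and off-heap memory.
1.2.Relevant members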
// On-heap Storage memory pool
protected val onHeapStorageMemoryPool = new StorageMemoryPool(this, MemoryMode.ON_HEAP)
// Off-heap Storage memory pool
protected val offHeapStorageMemoryPool = new StorageMemoryPool(this, MemoryMode.OFF_HEAP)
// On-heap Execution memory pool
protected val onHeapExecutionMemoryPool = new ExecutionMemoryPool(this, MemoryMode.ON_HEAP)
// Off-heap Execution memory pool
protected val offHeapExecutionMemoryPool = new ExecutionMemoryPool(this, MemoryMode.OFF_HEAP)
// Set the size of onHeapStorageMemoryPool to onHeapStorageMemory
onHeapStorageMemoryPool.incrementPoolSize(onHeapStorageMemory)
// Set the size of onHeapExecutionMemoryPool to onHeapExecutionMemory
onHeapExecutionMemoryPool.incrementPoolSize(onHeapExecutionMemory)
// Read spark.memory.offHeap.size, the configured off-heap memory size (default 0)
protected[this] val maxOffHeapMemory = conf.get(MEMORY_OFFHEAP_SIZE)
// Off-heap Storage memory: the share of off-heap memory given by spark.memory.storageFraction
protected[this] val offHeapStorageMemory =
  (maxOffHeapMemory * conf.get(MEMORY_STORAGE_FRACTION)).toLong
// Set the size of offHeapExecutionMemoryPool to max off-heap memory minus off-heap Storage memory
offHeapExecutionMemoryPool.incrementPoolSize(maxOffHeapMemory - offHeapStorageMemory)
// Set the size of offHeapStorageMemoryPool to offHeapStorageMemory
offHeapStorageMemoryPool.incrementPoolSize(offHeapStorageMemory)
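The values of the on-heap sizes onHeapStorageMemory and onHeapExecutionMemory depend on the concrete class that implements MemoryManager. See the UnifiedMemoryManager implementation that follows.
The parameter spark.memory.offHeap.enabled specifies whether off-heap memory is used. It defaults to false, meaning off-heap memory is disabled.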
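The off-heap split described above can be illustrated with a small standalone sketch. It only mirrors the arithmetic in the snippet (Storage share first, the remainder to Execution). The object name OffHeapSplitSketch and the helper splitOffHeap are hypothetical and not part of Spark, and the 1 GiB size with a fraction of 0.5 are example inputs, not values read from a real configuration.

```scala
object OffHeapSplitSketch {
  // Hypothetical helper, not part of Spark: returns (storageBytes, executionBytes)
  // using the same arithmetic as the MemoryManager members shown above.
  def splitOffHeap(maxOffHeapMemory: Long, storageFraction: Double): (Long, Long) = {
    // Storage gets the configured fraction, truncated to a whole number of bytes
    val storage = (maxOffHeapMemory * storageFraction).toLong
    // Execution gets whatever is left, so the two pools always sum to the maximum
    val execution = maxOffHeapMemory - storage
    (storage, execution)
  }

  def main(args: Array[String]): Unit = {
    val oneGiB = 1024L * 1024L * 1024L
    // Example inputs only: 1 GiB of off-heap memory, half of it for Storage
    val (storage, execution) = splitOffHeap(oneGiB, 0.5)
    println(s"storage=$storage execution=$execution")
  }
}
```

Because Execution receives the remainder rather than its own rounded share, no bytes are lost to truncation.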
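1.3.The MemoryPool interface
MemoryPool (also an abstract class in the Spark source) defines the common methods of a memory pool. Both the Storage memory pool StorageMemoryPool and the Execution memory pool ExecutionMemoryPool extend it.
Its main methods are: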
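// Returns the current size of the pool
final def poolSize: Long
// Returns the amount of free memory in the pool (poolSize minus memoryUsed)
final def memoryFree: Long
// Expands the pool by delta bytes
final def incrementPoolSize(delta: Long): Unit
// Shrinks the pool by delta bytes
final def decrementPoolSize(delta: Long): Unit
// Returns the amount of memory currently used in the pool
def memoryUsed: Long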
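To make the relationship between these methods concrete, here is a minimal sketch of a pool with the same method names. It is a simplified stand-in written for illustration, not the Spark source: the class names SketchMemoryPool and CountingPool, the acquire and release methods, and the exact precondition checks are all assumptions.

```scala
// Simplified stand-in for a memory pool (an assumption, not the Spark implementation).
abstract class SketchMemoryPool(lock: Object) {
  private var _poolSize: Long = 0L

  // Current size of the pool in bytes
  final def poolSize: Long = lock.synchronized { _poolSize }

  // Free memory is the pool size minus what is currently in use
  final def memoryFree: Long = lock.synchronized { _poolSize - memoryUsed }

  // Grow the pool by delta bytes
  final def incrementPoolSize(delta: Long): Unit = lock.synchronized {
    require(delta >= 0, "delta must be non-negative")
    _poolSize += delta
  }

  // Shrink the pool by delta bytes, never below the memory already in use
  final def decrementPoolSize(delta: Long): Unit = lock.synchronized {
    require(delta >= 0, "delta must be non-negative")
    require(_poolSize - delta >= memoryUsed, "cannot shrink below memory in use")
    _poolSize -= delta
  }

  // Left abstract: each concrete pool tracks its own usage
  def memoryUsed: Long
}

// Hypothetical concrete pool that tracks usage with a single counter.
final class CountingPool(lock: Object) extends SketchMemoryPool(lock) {
  private var used: Long = 0L

  override def memoryUsed: Long = lock.synchronized { used }

  // Grants the request only if enough free memory remains
  def acquire(bytes: Long): Boolean = lock.synchronized {
    if (bytes >= 0 && bytes <= memoryFree) { used += bytes; true } else false
  }

  // Releases at most the amount currently in use
  def release(bytes: Long): Unit = lock.synchronized {
    used -= math.min(bytes, used)
  }
}

object MemoryPoolSketch {
  def main(args: Array[String]): Unit = {
    val pool = new CountingPool(new Object)
    pool.incrementPoolSize(100L)
    pool.acquire(30L)
    println(s"poolSize=${pool.poolSize} used=${pool.memoryUsed} free=${pool.memoryFree}")
  }
}
```

Keeping memoryUsed abstract while the size bookkeeping is final matches the split in the method list above: the size logic is shared, and only usage tracking differs between pool types.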
1.4.Memory management methods
// Acquire Storage memory
def