private def persist(newLevel: StorageLevel, allowOverride: Boolean): this.type = { ......
  if (storageLevel == StorageLevel.NONE) {
    sc.cleaner.foreach(_.registerRDDForCleanup(this)) // register this RDD for cleanup
    sc.persistRDD(this) // register and track this RDD as persisted
  }
  storageLevel = newLevel
  this
}
The first time persist is called on an RDD, ContextCleaner.registerRDDForCleanup registers the RDD for cleanup. SparkContext's persistRDD method then puts (rdd.id, rdd) into persistentRdds, a HashMap whose key is rdd.id and whose value is a time-stamped reference to the RDD; persistentRdds is used to track references to all RDDs that have been marked as persistent.
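As a quick illustration (a minimal usage sketch, assuming an existing SparkContext named `sc`): calling persist only records the storage level and registers the RDD; nothing is actually cached until an action runs.

```scala
import org.apache.spark.storage.StorageLevel

// Assumes `sc` is an existing SparkContext.
val rdd = sc.parallelize(1 to 100)

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY); either call
// only sets storageLevel and adds the RDD to persistentRdds.
rdd.persist(StorageLevel.MEMORY_AND_DISK)

// The first action is what actually materializes and stores the partitions.
rdd.count()
```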
The storage/caching phase of an RDD:
When an action is executed on the RDD, DAGScheduler.submitJob is called to submit the job, but the actual caching phase begins only when the stage's ShuffleMapTask or ResultTask runs. Inside ShuffleMapTask/ResultTask's runTask method, the RDD's iterator method is invoked. The source of RDD.iterator:
final def iterator(split: Partition, context: TaskContext): Iterator[T] = {
  if (storageLevel != StorageLevel.NONE) {
    SparkEnv.get.cacheManager.getOrCompute(this, split, context, storageLevel)
  } else {
    computeOrReadCheckpoint(split, context)
  }
}
When the storage level StorageLevel is not NONE, the RDD has previously been cached, so the data is fetched through the CacheManager (getOrCompute); otherwise the partition is either computed directly or read from a checkpoint (RDD.computeOrReadCheckpoint). The source of CacheManager.getOrCompute:
def getOrCompute[T](rdd: RDD[T], partition: Partition, context: TaskContext,
    storageLevel: StorageLevel): Iterator[T] = {
  val key = RDDBlockId(rdd.id, partition.index) // block id for this RDD partition
  blockManager.get(key) match { // look the block up in the BlockManager
    case Some(blockResult) =>
      val existingMetrics = context.taskMetrics
        .getInputMetricsForReadMethod(blockResult.readMethod)
      existingMetrics.incBytesRead(blockResult.bytes)
      val iter = blockResult.data.asInstanceOf[Iterator[T]]
      new InterruptibleIterator[T](context, iter) {
        override def next(): T = {
          existingMetrics.incRecordsRead(1)
          delegate.next()
        }
      }
    case None => // the RDD was persisted, but its data cannot be obtained
      // Acquire the partition lock, then try BlockManager.get once more.
      val storedValues = acquireLockForPartition[T](key)
      if (storedValues.isDefined) {
        return new InterruptibleIterator[T](context, storedValues.get)
      }
      try {
        // If acquireLockForPartition still yields nothing: read the partition
        // from the checkpoint directory if the RDD has been checkpointed;
        // otherwise recompute it from the parent RDD.
        val computedValues = rdd.computeOrReadCheckpoint(partition, context)
        if (context.isRunningLocally) {
          return computedValues
        }
        val updatedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]
        // The data was lost for some reason, so persist a fresh copy of the
        // values read from the checkpoint or recomputed.
        val cachedValues = putInBlockManager(key, computedValues, storageLevel, updatedBlocks)
        val metrics = context.taskMetrics
        val lastUpdatedBlocks = metrics.updatedBlocks.getOrElse(Seq[(BlockId, BlockStatus)]())
        metrics.updatedBlocks = Some(lastUpdatedBlocks ++ updatedBlocks.toSeq)
        new InterruptibleIterator(context, cachedValues)
      } finally { ...... }
  }
}
CacheManager's getOrCompute method is mainly about fetching data that has already been persisted (cached) in the BlockManager: if the persisted RDD data exists, it is returned directly (the Some(blockResult) case). If the RDD was persisted but its data can no longer be obtained, the data is re-read from the checkpoint directory or recomputed from the parent RDD (computeOrReadCheckpoint), and the regenerated data is persisted again. The source of RDD.computeOrReadCheckpoint:
private[spark] def computeOrReadCheckpoint(split: Partition, context: TaskContext): Iterator[T] = {
  if (isCheckpointedAndMaterialized) {
    firstParent[T].iterator(split, context)
  } else {
    compute(split, context)
  }
}
The computeOrReadCheckpoint method: if the RDD has been checkpointed and materialized, the partition is read back through the iterator of its first parent (which at this point is the checkpoint RDD); otherwise the partition is computed by the RDD's own compute method.
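For context (a minimal usage sketch, assuming a SparkContext `sc` and a writable checkpoint directory), checkpointing must be requested before the first action; once the checkpoint has materialized, computeOrReadCheckpoint takes the first branch:

```scala
// Assumes `sc` is an existing SparkContext and the path below is writable.
sc.setCheckpointDir("/tmp/spark-checkpoints")

val rdd = sc.parallelize(1 to 1000).map(_ * 2)
rdd.checkpoint() // mark for checkpointing; nothing is written yet

// The first action both computes the RDD and writes the checkpoint files.
rdd.count()

// From now on, iterator() on this RDD reads the checkpointed data
// (isCheckpointedAndMaterialized is true) instead of recomputing the lineage.
```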
When getOrCompute cannot obtain the data from the persisted copy, the values re-read from the checkpoint directory or recomputed must be persisted/cached again; this is done by CacheManager's putInBlockManager method. The source of CacheManager.putInBlockManager:
private def putInBlockManager[T](key: BlockId, values: Iterator[T], level: StorageLevel,
    updatedBlocks: ArrayBuffer[(BlockId, BlockStatus)],
    effectiveStorageLevel: Option[StorageLevel] = None): Iterator[T] = {
  val putLevel = effectiveStorageLevel.getOrElse(level)
  if (!putLevel.useMemory) {
    // This RDD is not cached in memory, so the computed values can be passed
    // to the BlockManager directly as an iterator rather than being
    // materialized in memory first.
    updatedBlocks ++= blockManager.putIterator(key, values, level, tellMaster = true, effectiveStorageLevel)
    blockManager.get(key) match {
      case Some(v) => v.data.asInstanceOf[Iterator[T]]
      case None =>
        throw new BlockException(key, s"Block manager failed to return cached value for $key!")
    }
  } else {
    // This RDD is cached in memory. The computed values cannot simply be handed
    // to the BlockManager as an iterator and read back later, because the
    // partition might be evicted from memory before it is re-read.
    blockManager.memoryStore.unrollSafely(key, values, updatedBlocks) match {
      case Left(arr) =>
        // The whole partition was unrolled: cache it in memory.
        updatedBlocks ++= blockManager.putArray(key, arr, level, tellMaster = true, effectiveStorageLevel)
        arr.iterator.asInstanceOf[Iterator[T]]
      case Right(it) =>
        // There is not enough memory to cache the partition.
        val returnValues = it.asInstanceOf[Iterator[T]]
        if (putLevel.useDisk) {
          val diskOnlyLevel = StorageLevel(useDisk = true, useMemory = false,
            useOffHeap = false, deserialized = false, putLevel.replication)
          putInBlockManager[T](key, returnValues, level, updatedBlocks, Some(diskOnlyLevel))
        } else {
          returnValues
        }
    }
  }
}
CacheManager's putInBlockManager method re-caches/persists the data read from the checkpoint directory or recomputed from the parent RDD into the BlockManager according to the StorageLevel; when there is not enough memory to unroll a partition, a MEMORY_AND_DISK partition is forced to disk (via the disk-only level in the recursive call).
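The user-visible consequence of this fallback can be sketched as follows (a minimal usage sketch, assuming an existing SparkContext `sc`):

```scala
import org.apache.spark.storage.StorageLevel

// Assumes `sc` is an existing SparkContext.
// With MEMORY_AND_DISK, partitions that cannot be fully unrolled in memory
// are spilled to disk (the diskOnlyLevel branch above) instead of being dropped.
val big = sc.parallelize(1 to 10000000)
big.persist(StorageLevel.MEMORY_AND_DISK)
big.count()

// With MEMORY_ONLY, a partition that does not fit in memory is simply not
// cached and will be recomputed from the lineage on the next action.
```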
Inside putInBlockManager, BlockManager's putIterator method is used to store the result values as an iterator into the BlockManager; putIterator in turn delegates to BlockManager's doPut method, which is where data is actually written into the BlockManager. The source of BlockManager.doPut:
private def doPut(blockId: BlockId, data: BlockValues, level: StorageLevel,
    tellMaster: Boolean = true,
    effectiveStorageLevel: Option[StorageLevel] = None): Seq[(BlockId, BlockStatus)] = { ......
  val updatedBlocks = new ArrayBuffer[(BlockId, BlockStatus)]
  val putBlockInfo = { // create a BlockInfo for the block about to be written
    val tinfo = new BlockInfo(level, tellMaster)
    val oldBlockOpt = blockInfo.putIfAbsent(blockId, tinfo)
    if (oldBlockOpt.isDefined) {
      if (oldBlockOpt.get.waitForReady()) { return updatedBlocks }
      oldBlockOpt.get
    } else {
      tinfo
    }
  } ......
  putBlockInfo.synchronized { // lock the BlockInfo to synchronize concurrent access
    var marked = false
    try {
      // Based on the storage level, pick the BlockStore:
      // MemoryStore, DiskStore, or ExternalBlockStore.
      val (returnValues, blockStore: BlockStore) = {
        if (putLevel.useMemory) { // MemoryStore
          (true, memoryStore)
        } else if (putLevel.useOffHeap) { // ExternalBlockStore
          (false, externalBlockStore)
        } else if (putLevel.useDisk) { // DiskStore
          (putLevel.replication > 1, diskStore)
        } else { ...... }
      }
      // Actually write the data into the BlockManager:
      // IteratorValues, ArrayValues, or ByteBufferValues.
      val result = data match {
        case IteratorValues(iterator) =>
          blockStore.putIterator(blockId, iterator, putLevel, returnValues)
        case ArrayValues(array) =>
          blockStore.putArray(blockId, array, putLevel, returnValues)
        case ByteBufferValues(bytes) =>
          bytes.rewind()
          blockStore.putBytes(blockId, bytes, putLevel)
      } ......
      if (putLevel.useMemory) { // track blocks that were dropped from memory
        result.droppedBlocks.foreach { updatedBlocks += _ }
      }
      val putBlockStatus = getCurrentBlockStatus(blockId, putBlockInfo) // current status of the block
      if (putBlockStatus.storageLevel != StorageLevel.NONE) { ......
        // Report the newly written block to the BlockManagerMasterEndpoint so
        // it can synchronize and maintain the block metadata.
        reportBlockStatus(blockId, putBlockInfo, putBlockStatus)
        updatedBlocks += ((blockId, putBlockStatus))
      }
    } finally { ...... }
  }
  if (putLevel.replication > 1) { // a _2 storage level: replicate the block to another node
    data match { ......
      // replicate() copies the block data to peer nodes
      replicate(blockId, bytesAfterPut, putLevel)
    }
  }
  BlockManager.dispose(bytesAfterPut) ......
}
CacheManager's putInBlockManager method calls BlockManager's get method to fetch the RDD data. The source of BlockManager.get:
def get(blockId: BlockId): Option[BlockResult] = {
  val local = getLocal(blockId) // try the local BlockManager first
  if (local.isDefined) {
    return local
  }
  val remote = getRemote(blockId) // only then fall back to remote BlockManagers
  if (remote.isDefined) {
    return remote
  }
  None
}
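The local-first, remote-second lookup can be sketched as a standalone pattern (a simplified, self-contained illustration, not the Spark implementation; the two maps are hypothetical stand-ins for the local store and remote block managers):

```scala
object GetSketch {
  // Hypothetical stand-ins for the local store and remote block managers.
  val localStore  = Map("rdd_0_0" -> "local-bytes")
  val remoteStore = Map("rdd_0_1" -> "remote-bytes")

  // Mirrors the shape of BlockManager.get: prefer a local hit,
  // then try the remote side, otherwise return None.
  def get(blockId: String): Option[String] =
    localStore.get(blockId).orElse(remoteStore.get(blockId))

  def main(args: Array[String]): Unit = {
    println(get("rdd_0_0")) // found locally
    println(get("rdd_0_1")) // fetched remotely
    println(get("rdd_0_2")) // neither: None
  }
}
```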