As the previous sections showed, Streaming continuously receives data while continuously generating jobs and submitting them to the cluster for execution.
Fault tolerance has two parts: fault tolerance of the data, and fault tolerance of the runtime computation.
As is well known, Streaming runs on top of Spark Core, and Spark Core's RDDs come with a very strong fault-tolerance mechanism, so Streaming can rely on Spark Core for runtime safety.
There is a prerequisite, however: the runtime fault-tolerance machinery can only do its job if the data itself is safe. So how is data safety guaranteed? Concretely, this means fault tolerance for the data received on the executors. Fault tolerance of the scheduling built on top of that safe data is mostly delegated to Spark Core, although Spark Streaming's own driver also contributes a part. Two mechanisms are involved:
- WAL (write-ahead log) on the executor
- Data replay at the source
For data fault tolerance, the simplest approach that comes to mind is to automatically keep several replicas of the data as it is received, and to fall back to another replica when the current one fails.
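In Spark Streaming this replication is exactly what the receiver's storage level provides. A minimal sketch, assuming a local socket source (host and port are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("replicated-receiver")
val ssc = new StreamingContext(conf, Seconds(10))
// the trailing _2 in the storage level asks for two replicas of every received
// block; MEMORY_AND_DISK_SER_2 is in fact the default for socketTextStream
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER_2)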
The other approach is a data source that supports replay, meaning that already-delivered data can be read again. For example, if the last 10 seconds of data were read but the compute stage then failed, those 10 seconds can simply be read once more.
Take SocketReceiver as the example again:
// SocketInputDStream.scala line 76
while (!isStopped && iterator.hasNext) {
  store(iterator.next)
}
store hands the record to the supervisor's pushSingle:
// Receiver.scala line 118
def store(dataItem: T) {
  supervisor.pushSingle(dataItem)
}
pushSingle then calls the BlockGenerator:
// ReceiverSupervisorImpl.scala line 118
def pushSingle(data: Any) {
  defaultBlockGenerator.addData(data)
}
addData appends the record to the current buffer:
// BlockGenerator.scala line 160 spark 1.6.0
def addData(data: Any): Unit = {
  if (state == Active) {
    waitToPush()
    synchronized {
      if (state == Active) {
        currentBuffer += data
      } else {
        throw new SparkException(
          "Cannot add data as BlockGenerator has not been started or has been stopped")
      }
    }
  } else {
    throw new SparkException(
      "Cannot add data as BlockGenerator has not been started or has been stopped")
  }
}
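The waitToPush() call above comes from RateLimiter, which BlockGenerator extends: it blocks the receiving thread whenever the configured maximum ingest rate is exceeded. A sketch of the related configuration (the values are illustrative):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.streaming.receiver.maxRate", "10000")    // records per second, per receiver
  .set("spark.streaming.backpressure.enabled", "true") // let batch feedback adjust the rate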
Meanwhile, the timer created when the BlockGenerator was instantiated periodically moves the buffered data into the pending-push queue blocksForPushing:
// BlockGenerator.scala line 234
var newBlock: Block = null
synchronized {
  if (currentBuffer.nonEmpty) {
    val newBlockBuffer = currentBuffer
    currentBuffer = new ArrayBuffer[Any]
    val blockId = StreamBlockId(receiverId, time - blockIntervalMs)
    listener.onGenerateBlock(blockId)
    newBlock = new Block(blockId, newBlockBuffer)
  }
}
if (newBlock != null) {
  // blocksForPushing is a bounded blocking queue: put blocks when the queue is full
  blocksForPushing.put(newBlock)
}
At the same time, the block-pushing thread created when the BlockGenerator was instantiated keeps taking blocks off the queue and notifies the listener, which pushes them toward the BlockManager:
// BlockGenerator.scala line 109
private val blockPushingThread = new Thread() { override def run() { keepPushingBlocks() } }
// BlockGenerator.scala line 166
while (areBlocksBeingGenerated) {
  Option(blocksForPushing.poll(10, TimeUnit.MILLISECONDS)) match {
    case Some(block) => pushBlock(block)
    case None =>
  }
}
// BlockGenerator.scala line 295
private def pushBlock(block: Block) {
  listener.onPushBlock(block.id, block.buffer)
  logInfo("Pushed block " + block.id)
}
// ReceiverSupervisorImpl.scala line 108
def onPushBlock(blockId: StreamBlockId, arrayBuffer: ArrayBuffer[_]) {
  pushArrayBuffer(arrayBuffer, None, Some(blockId))
}
// ReceiverSupervisorImpl.scala line 123
def pushArrayBuffer(
    arrayBuffer: ArrayBuffer[_],
    metadataOption: Option[Any],
    blockIdOption: Option[StreamBlockId]
  ) {
  pushAndReportBlock(ArrayBufferBlock(arrayBuffer), metadataOption, blockIdOption)
}
pushAndReportBlock stores the block through receivedBlockHandler, whose concrete implementation depends on configuration; assume here that it is the WAL-based one.
// ReceiverSupervisorImpl.scala line 150
def pushAndReportBlock(
    receivedBlock: ReceivedBlock,
    metadataOption: Option[Any],
    blockIdOption: Option[StreamBlockId]
  ) {
  val blockId = blockIdOption.getOrElse(nextBlockId)
  val time = System.currentTimeMillis
  val blockStoreResult = receivedBlockHandler.storeBlock(blockId, receivedBlock)
  logDebug(s"Pushed block $blockId in ${(System.currentTimeMillis - time)} ms")
  val numRecords = blockStoreResult.numRecords
  val blockInfo = ReceivedBlockInfo(streamId, numRecords, metadataOption, blockStoreResult)
  // report the stored block's metadata from the remote worker to the driver via RPC
  trackerEndpoint.askWithRetry[Boolean](AddBlock(blockInfo))
  logDebug(s"Reported block $blockId")
}
// ReceiverSupervisorImpl.scala line 53
private val receivedBlockHandler: ReceivedBlockHandler = {
  if (WriteAheadLogUtils.enableReceiverLog(env.conf)) {
    if (checkpointDirOption.isEmpty) {
      throw new SparkException(
        "Cannot enable receiver write-ahead log without checkpoint directory set. " +
        "Please use streamingContext.checkpoint() to set the checkpoint directory. " +
        "See documentation for more details.")
    }
    new WriteAheadLogBasedBlockHandler(env.blockManager, receiver.streamId,
      receiver.storageLevel, env.conf, hadoopConf, checkpointDirOption.get)
  } else {
    new BlockManagerBasedBlockHandler(env.blockManager, receiver.storageLevel)
  }
}
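WriteAheadLogUtils.enableReceiverLog reads spark.streaming.receiver.writeAheadLog.enable, so taking the WAL branch above requires configuration along these lines (the checkpoint path is a placeholder):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().set("spark.streaming.receiver.writeAheadLog.enable", "true")
val ssc = new StreamingContext(conf, Seconds(10))
ssc.checkpoint("hdfs:///tmp/streaming-checkpoint") // required, as the exception above enforces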
Now the WriteAheadLogBasedBlockHandler implementation.
As you can see, in this case the data is stored into the BlockManager and written to the WAL at the same time. (The receiver's default storage level is MEMORY_AND_DISK_SER_2; with the WAL enabled, the handler lowers the replication factor to 1, the effectiveStorageLevel below, since the log already provides durability.)
Finally it returns a BlockStoreResult carrying the block's metadata:
// ReceivedBlockHandler.scala line 166 (class WriteAheadLogBasedBlockHandler)
def storeBlock(blockId: StreamBlockId, block: ReceivedBlock): ReceivedBlockStoreResult = {
  var numRecords = None: Option[Long]
  // Serialize the block so that it can be inserted into both
  val serializedBlock = block match {
    // here it is an ArrayBufferBlock; see ReceiverSupervisorImpl.scala line 123
    case ArrayBufferBlock(arrayBuffer) =>
      numRecords = Some(arrayBuffer.size.toLong)
      // serialized by Spark Core's BlockManager
      blockManager.dataSerialize(blockId, arrayBuffer.iterator)
    case IteratorBlock(iterator) =>
      val countIterator = new CountingIterator(iterator)
      val serializedBlock = blockManager.dataSerialize(blockId, countIterator)
      numRecords = countIterator.count
      serializedBlock
    case ByteBufferBlock(byteBuffer) =>
      byteBuffer
    case _ =>
      throw new Exception(s"Could not push $blockId to block manager, unexpected block type")
  }
  // Store the block in Spark Core's BlockManager
  val storeInBlockManagerFuture = Future {
    val putResult =
      blockManager.putBytes(blockId, serializedBlock, effectiveStorageLevel, tellMaster = true)
    if (!putResult.map { _._1 }.contains(blockId)) {
      throw new SparkException(
        s"Could not store $blockId to block manager with storage level $storageLevel")
    }
  }
  // Store the block in the write-ahead log; note that the data itself is written here
  val storeInWriteAheadLogFuture = Future {
    writeAheadLog.write(serializedBlock, clock.getTimeMillis())
  }
  // Combine the futures, wait for both to complete, and return the write ahead log record handle
  val combinedFuture = storeInBlockManagerFuture.zip(storeInWriteAheadLogFuture).map(_._2)
  val walRecordHandle = Await.result(combinedFuture, blockStoreTimeout)
  WriteAheadLogBasedStoreResult(blockId, numRecords, walRecordHandle)
}
The ReceiverTracker on the driver receives the AddBlock message:
// ReceiverTracker.scala line 495
case AddBlock(receivedBlockInfo) =>
  if (WriteAheadLogUtils.isBatchingEnabled(ssc.conf, isDriver = true)) {
    walBatchingThreadPool.execute(new Runnable {
      override def run(): Unit = Utils.tryLogNonFatalError {
        if (active) {
          // add the receivedBlockInfo metadata asynchronously
          context.reply(addBlock(receivedBlockInfo))
        } else {
          throw new IllegalStateException("ReceiverTracker RpcEndpoint shut down.")
        }
      }
    })
  } else {
    context.reply(addBlock(receivedBlockInfo))
  }
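Whether these AddBlock writes are batched on the driver is governed by spark.streaming.driver.writeAheadLog.allowBatching (in Spark 1.6 batching is on by default on the driver):

import org.apache.spark.SparkConf
val conf = new SparkConf().set("spark.streaming.driver.writeAheadLog.allowBatching", "true")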
// ReceiverTracker.scala line 320
private def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  receivedBlockTracker.addBlock(receivedBlockInfo)
}
// ReceivedBlockTracker.scala line 84
/** Add received block. This event will get written to the write ahead log (if enabled). */
def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  try {
    // write the metadata to the WAL
    val writeResult = writeToLog(BlockAdditionEvent(receivedBlockInfo))
    if (writeResult) {
      synchronized {
        getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
      }
      logDebug(s"Stream ${receivedBlockInfo.streamId} received " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId}")
    } else {
      logDebug(s"Failed to acknowledge stream ${receivedBlockInfo.streamId} receiving " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId} in the Write Ahead Log.")
    }
    writeResult
  } catch {
    case NonFatal(e) =>
      logError(s"Error adding block $receivedBlockInfo", e)
      false
  }
}
When recovery is needed, the state is rebuilt on the driver, according to whether a checkpoint exists:
// ReceivedBlockTracker.scala
private def recoverPastEvents(): Unit = synchronized {
  // Insert the recovered block information
  def insertAddedBlock(receivedBlockInfo: ReceivedBlockInfo) {
    logTrace(s"Recovery: Inserting added block $receivedBlockInfo")
    receivedBlockInfo.setBlockIdInvalid()
    getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
  }

  // Insert the recovered block-to-batch allocations and clear the queue of received blocks
  // (when the blocks were originally allocated to the batch, the queue must have been cleared).
  def insertAllocatedBatch(batchTime: Time, allocatedBlocks: AllocatedBlocks) {
    logTrace(s"Recovery: Inserting allocated batch for time $batchTime to " +
      s"${allocatedBlocks.streamIdToAllocatedBlocks}")
    streamIdToUnallocatedBlockQueues.values.foreach { _.clear() }
    timeToAllocatedBlocks.put(batchTime, allocatedBlocks)
    lastAllocatedBatchTime = batchTime
  }

  // Cleanup the batch allocations
  def cleanupBatches(batchTimes: Seq[Time]) {
    logTrace(s"Recovery: Cleaning up batches $batchTimes")
    timeToAllocatedBlocks --= batchTimes
  }

  writeAheadLogOption.foreach { writeAheadLog =>
    logInfo(s"Recovering from write ahead logs in ${checkpointDirOption.get}")
    writeAheadLog.readAll().asScala.foreach { byteBuffer =>
      logTrace("Recovering record " + byteBuffer)
      Utils.deserialize[ReceivedBlockTrackerLogEvent](
        byteBuffer.array, Thread.currentThread().getContextClassLoader) match {
        case BlockAdditionEvent(receivedBlockInfo) =>
          insertAddedBlock(receivedBlockInfo)
        case BatchAllocationEvent(time, allocatedBlocks) =>
          insertAllocatedBatch(time, allocatedBlocks)
        case BatchCleanupEvent(batchTimes) =>
          cleanupBatches(batchTimes)
      }
    }
  }
}
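From the application's point of view, this recovery is triggered through StreamingContext.getOrCreate, which rebuilds the driver state from the checkpoint when one exists. A minimal sketch (the path and batch interval are placeholders):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

def createContext(): StreamingContext = {
  val ssc = new StreamingContext(new SparkConf(), Seconds(10))
  ssc.checkpoint("hdfs:///tmp/streaming-checkpoint")
  ssc // DStream setup would go here
}
// reads the checkpoint if present, otherwise calls createContext()
val ssc = StreamingContext.getOrCreate("hdfs:///tmp/streaming-checkpoint", createContext _)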
That is the WAL approach. BlockManagerBasedBlockHandler, by contrast, relies entirely on the BlockManager.
The other approach is data replay, which at present is used in combination with Kafka. In this scenario Kafka is no longer just a message queue; it largely acts as a data store, and its built-in replication saves Streaming the fault-tolerance cost of the data-receiving stage.
As long as Streaming has not acknowledged (committed the offsets of) the data it received, Kafka considers those messages unconsumed; if processing fails, the recovered job can resume from the unacknowledged data.
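A minimal sketch of this replay-based approach with the direct Kafka API (Spark 1.6 with the Kafka 0.8 integration; the broker address and topic are placeholders, and ssc is an existing StreamingContext):

import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
// no receiver and no WAL: offsets are tracked by the streaming job itself,
// and after a failure the same offset ranges are simply read again from Kafka
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("events"))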