BroadcastManager is used to store serialized information such as configuration, RDDs, Jobs, and ShuffleDependency locally; for fault tolerance, it may also replicate this information to other nodes. BroadcastManager is created as follows:
//org.apache.spark.SparkEnv
val broadcastManager = new BroadcastManager(isDriver, conf, securityManager)
Besides the three member properties defined by its constructor, BroadcastManager has three more internal members:
//Whether BroadcastManager has finished initializing
private var initialized = false
//The broadcast factory instance
private var broadcastFactory: BroadcastFactory = null
//The broadcast ID for the next broadcast object, of type AtomicLong
private val nextBroadcastId = new AtomicLong(0)
BroadcastManager calls its own initialize method during construction; once initialize completes, BroadcastManager is ready for use. The implementation of BroadcastManager's initialize method is as follows:
//org.apache.spark.broadcast.BroadcastManager
private def initialize() {
  synchronized {
    if (!initialized) {
      broadcastFactory = new TorrentBroadcastFactory
      broadcastFactory.initialize(isDriver, conf, securityManager)
      initialized = true
    }
  }
}
As the code shows, initialize first checks whether BroadcastManager has already been initialized, guaranteeing that initialization happens only once. It then creates a TorrentBroadcastFactory as BroadcastManager's broadcast factory instance, calls the factory's initialize method to initialize it, and finally marks BroadcastManager itself as initialized.
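This synchronized-plus-flag pattern for one-time initialization can be sketched in isolation (OnceInit and initCount are hypothetical names introduced here for illustration, not part of Spark):

```scala
// Hypothetical sketch of BroadcastManager's one-time initialization pattern:
// a synchronized block plus a boolean flag makes initialize() idempotent.
class OnceInit {
  private var initialized = false
  var initCount = 0 // counts how many times the real initialization work ran

  def initialize(): Unit = synchronized {
    if (!initialized) {
      initCount += 1 // stands in for creating and initializing the factory
      initialized = true
    }
  }
}

val m = new OnceInit
m.initialize()
m.initialize() // the second call is a no-op
```

The synchronized block also makes the check-then-set safe when multiple threads race to initialize.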
BroadcastManager provides three methods:
//org.apache.spark.broadcast.BroadcastManager
def stop() {
  broadcastFactory.stop()
}

def newBroadcast[T: ClassTag](value_ : T, isLocal: Boolean): Broadcast[T] = {
  broadcastFactory.newBroadcast[T](value_, isLocal, nextBroadcastId.getAndIncrement())
}

def unbroadcast(id: Long, removeFromDriver: Boolean, blocking: Boolean) {
  broadcastFactory.unbroadcast(id, removeFromDriver, blocking)
}
As the code shows, each of BroadcastManager's three methods delegates to the corresponding method of TorrentBroadcastFactory. The three methods provided by TorrentBroadcastFactory are implemented as follows:
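The per-broadcast ID that newBroadcast passes to the factory comes from nextBroadcastId.getAndIncrement(), which hands out 0, 1, 2, ... atomically. A minimal sketch of this ID scheme:

```scala
import java.util.concurrent.atomic.AtomicLong

// Each call returns the current value and atomically bumps the counter,
// so concurrent callers can never receive the same broadcast ID.
val nextBroadcastId = new AtomicLong(0)

val id0 = nextBroadcastId.getAndIncrement() // 0
val id1 = nextBroadcastId.getAndIncrement() // 1
```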
//org.apache.spark.broadcast.TorrentBroadcastFactory
override def newBroadcast[T: ClassTag](value_ : T, isLocal: Boolean, id: Long): Broadcast[T] = {
  new TorrentBroadcast[T](value_, id)
}
override def stop() { }
override def unbroadcast(id: Long, removeFromDriver: Boolean, blocking: Boolean) {
  TorrentBroadcast.unpersist(id, removeFromDriver, blocking)
}
The code shows that TorrentBroadcastFactory's newBroadcast method creates a TorrentBroadcast instance, whose purpose is to broadcast the value held by the TorrentBroadcast. On the surface this merely invokes a constructor, but its effect goes well beyond that. A TorrentBroadcast object has the following properties:
//org.apache.spark.broadcast.TorrentBroadcast
@transient private lazy val _value: T = readBroadcastBlock()
@transient private var compressionCodec: Option[CompressionCodec] = _
@transient private var blockSize: Int = _
private val broadcastId = BroadcastBlockId(id)
private val numBlocks: Int = writeBlocks(obj)
- _value: the value of the broadcast block, read from the Executor or Driver. _value is the broadcast object obtained by calling readBroadcastBlock; because _value is a lazy val, readBroadcastBlock is not called when the TorrentBroadcast instance is constructed, but only when _value is actually needed.
- compressionCodec: the compression codec used for the broadcast object. It is enabled by setting spark.broadcast.compress to true, which is the default.
- blockSize: the size of each block. It is a read-only property, configurable via spark.broadcast.blockSize and defaulting to 4MB.
- broadcastId: the broadcast ID; it is in fact the case class BroadcastBlockId, whose code is:
//org.apache.spark.storage.BlockId
case class BroadcastBlockId(broadcastId: Long, field: String = "") extends BlockId {
  override def name: String = "broadcast_" + broadcastId + (if (field == "") "" else "_" + field)
}
- numBlocks: the number of blocks that make up the broadcast variable. numBlocks is obtained by calling writeBlocks; because numBlocks is an immutable val, writeBlocks is invoked when the TorrentBroadcast instance is constructed, writing the broadcast object into the storage system.
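The naming scheme of BroadcastBlockId can be checked with a standalone copy of its naming logic (reproduced here outside Spark, without the BlockId parent, purely for illustration):

```scala
// Standalone copy of BroadcastBlockId's name logic, for illustration only.
case class BroadcastBlockId(broadcastId: Long, field: String = "") {
  def name: String = "broadcast_" + broadcastId + (if (field == "") "" else "_" + field)
}

val whole = BroadcastBlockId(5).name           // the whole broadcast object
val piece = BroadcastBlockId(5, "piece0").name // one piece of broadcast 5
```

The empty-field form names the full broadcast object, while the "pieceN" forms name the individual serialized chunks written by writeBlocks below.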
1 Writing the Broadcast Object
As noted above, writeBlocks is called when a TorrentBroadcast instance is constructed. Its implementation is as follows:
//org.apache.spark.broadcast.TorrentBroadcast
private def writeBlocks(value: T): Int = {
  import StorageLevel._
  val blockManager = SparkEnv.get.blockManager
  if (!blockManager.putSingle(broadcastId, value, MEMORY_AND_DISK, tellMaster = false)) {
    throw new SparkException(s"Failed to store $broadcastId in BlockManager")
  }
  val blocks =
    TorrentBroadcast.blockifyObject(value, blockSize, SparkEnv.get.serializer, compressionCodec)
  blocks.zipWithIndex.foreach { case (block, i) =>
    val pieceId = BroadcastBlockId(id, "piece" + i)
    val bytes = new ChunkedByteBuffer(block.duplicate())
    if (!blockManager.putBytes(pieceId, bytes, MEMORY_AND_DISK_SER, tellMaster = true)) {
      throw new SparkException(s"Failed to store $pieceId of $broadcastId in local BlockManager")
    }
  }
  blocks.length
}
writeBlocks executes the following steps:
- 1) Get the BlockManager component of the current SparkEnv.
- 2) Call BlockManager's putSingle method to write the broadcast object into the local storage system. When Spark runs in local mode, the broadcast object is written into the Driver's local storage so that tasks can also run on the Driver. Because the _replication property of the MEMORY_AND_DISK StorageLevel is fixed at 1, the broadcast object is written only into the local storage of the Driver or Executor.
- 3) Call TorrentBroadcast's blockifyObject method to split the object into a series of blocks. The size of each block is determined by blockSize; the object is serialized with the JavaSerializer component of the current SparkEnv and compressed with TorrentBroadcast's own compressionCodec.
- 4) For each block: generate a piece BroadcastBlockId (pieces are distinguished by BroadcastBlockId's field property, e.g. piece0, piece1, ...), then call BlockManager's putBytes method to write the serialized piece into the Driver's local storage system. Because the _replication property of the MEMORY_AND_DISK_SER StorageLevel is also fixed at 1, each piece is written only into the local storage of the Driver or Executor.
- 5) Return the number of blocks.
Based on the analysis above, the figure below illustrates the write path of a broadcast object.
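The splitting in step 3 can be sketched as follows; blockify is a hypothetical stand-in for TorrentBroadcast.blockifyObject that omits serialization and compression and works directly on bytes:

```scala
// Hypothetical stand-in for blockifyObject: split a serialized payload into
// fixed-size chunks. The last chunk may be smaller than blockSize.
def blockify(payload: Array[Byte], blockSize: Int): Seq[Array[Byte]] =
  payload.grouped(blockSize).toSeq

val payload = Array.fill[Byte](10 * 1024)(1) // a 10 KB payload
val blocks  = blockify(payload, 4 * 1024)    // 4 KB blocks (Spark's default is 4 MB)
// blocks.length == 3: two full 4 KB pieces plus a 2 KB remainder
```

Each resulting chunk then gets its own "pieceN" BroadcastBlockId and is stored separately, which is what lets executors later fetch different pieces from different peers.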
2 Reading the Broadcast Object
As mentioned earlier, readBroadcastBlock is called to obtain the value of a TorrentBroadcast instance's _value property only when that value is needed. Its implementation is as follows:
//org.apache.spark.broadcast.TorrentBroadcast
private def readBroadcastBlock(): T = Utils.tryOrIOException {
  TorrentBroadcast.synchronized {
    setConf(SparkEnv.get.conf)
    val blockManager = SparkEnv.get.blockManager
    blockManager.getLocalValues(broadcastId).map(_.data.next()) match {
      case Some(x) =>
        releaseLock(broadcastId)
        x.asInstanceOf[T]
      case None =>
        logInfo("Started reading broadcast variable " + id)
        val startTimeMs = System.currentTimeMillis()
        val blocks = readBlocks().flatMap(_.getChunks())
        logInfo("Reading broadcast variable " + id + " took" + Utils.getUsedTimeMs(startTimeMs))
        val obj = TorrentBroadcast.unBlockifyObject[T](
          blocks, SparkEnv.get.serializer, compressionCodec)
        val storageLevel = StorageLevel.MEMORY_AND_DISK
        if (!blockManager.putSingle(broadcastId, obj, storageLevel, tellMaster = false)) {
          throw new SparkException(s"Failed to store $broadcastId in BlockManager")
        }
        obj
    }
  }
}
readBroadcastBlock executes the following steps:
- 1) Get the BlockManager component of the current SparkEnv.
- 2) Call BlockManager's getLocalValues method to fetch the broadcast object from the local storage system, i.e. the object previously written there via BlockManager's putSingle method.
- 3) If the broadcast object is found locally, call releaseLock to release the lock on the block and return the object. (This lock ensures that while a block is in use by a running task, it cannot be used by another task; once the task completes, the lock should be released.)
- 4) If the broadcast object is not found locally, the data must have been written into the storage system in serialized form via BlockManager's putBytes method. In that case, first call readBlocks to fetch the broadcast pieces from the storage systems of the Driver or Executors, then call TorrentBroadcast's unBlockifyObject method to reassemble the pieces into the original broadcast object, and finally call BlockManager's putSingle method again to write the broadcast object into the local storage system, so that other tasks on the current Executor need not fetch it again.
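The local-hit versus rebuild-and-cache flow above can be sketched with an in-memory map standing in for the local storage system (all names here are hypothetical, introduced for illustration):

```scala
import scala.collection.mutable

// In-memory stand-in for the local block store.
val localStore = mutable.Map.empty[Long, Any]
var rebuilds = 0 // counts how often the expensive fetch-and-unblockify path ran

def readBroadcast[T](id: Long)(rebuild: => T): T =
  localStore.get(id) match {
    case Some(v) => v.asInstanceOf[T] // local hit: return the cached object
    case None =>
      rebuilds += 1
      val obj = rebuild               // fetch pieces and reassemble (simulated)
      localStore(id) = obj            // putSingle equivalent: cache for later tasks
      obj
  }

val first  = readBroadcast(0L) { "payload" } // rebuilds once
val second = readBroadcast(0L) { "payload" } // served from localStore
```

The by-name rebuild parameter mirrors the laziness of the real path: the pieces are fetched only when the local lookup misses.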
The readBlocks method called above fetches blocks from the storage systems of the Driver or Executors. Its implementation is as follows:
//org.apache.spark.broadcast.TorrentBroadcast
private def readBlocks(): Array[ChunkedByteBuffer] = {
  val blocks = new Array[ChunkedByteBuffer](numBlocks)
  val bm = SparkEnv.get.blockManager
  for (pid <- Random.shuffle(Seq.range(0, numBlocks))) {
    val pieceId = BroadcastBlockId(id, "piece" + pid)
    logDebug(s"Reading piece $pieceId of $broadcastId")
    bm.getLocalBytes(pieceId) match {
      case Some(block) =>
        blocks(pid) = block
        releaseLock(pieceId)
      case None =>
        bm.getRemoteBytes(pieceId) match {
          case Some(b) =>
            if (!bm.putBytes(pieceId, b, StorageLevel.MEMORY_AND_DISK_SER, tellMaster = true)) {
              throw new SparkException(
                s"Failed to store $pieceId of $broadcastId in local BlockManager")
            }
            blocks(pid) = b
          case None =>
            throw new SparkException(s"Failed to get $pieceId of $broadcastId")
        }
    }
  }
  blocks
}
readBlocks executes the following steps:
- 1) Create the array blocks to hold each broadcast piece, and get the BlockManager component of the current SparkEnv.
- 2) Randomly shuffle the piece indices to avoid "hot spots" when fetching pieces, improving performance; then perform steps 3 and 4 for each shuffled piece in turn.
- 3) Call BlockManager's getLocalBytes method to fetch the serialized piece from the local storage system. If it is found locally, put it into blocks and call releaseLock to release the lock on that piece.
- 4) If it is not found locally, call BlockManager's getRemoteBytes method to fetch the piece from a remote storage system, then call BlockManager's putBytes method to write it into the local storage system so that other tasks on the current Executor need not fetch it again, and finally put the piece into blocks.
- 5) Return all the pieces in blocks.
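The shuffling in step 2 is just scala.util.Random.shuffle over the piece indices; every executor fetches the same set of pieces, but each in its own random order, so no single piece (or the node serving it) becomes a hot spot:

```scala
import scala.util.Random

val numBlocks = 5
// A random permutation of 0 until numBlocks, as used by readBlocks.
val fetchOrder = Random.shuffle(Seq.range(0, numBlocks))
// The order varies per run, but the set of indices is always the same.
```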
3 Unpersisting the Broadcast Object
The unpersist method of TorrentBroadcast is implemented as follows:
//org.apache.spark.broadcast.TorrentBroadcast
def unpersist(id: Long, removeFromDriver: Boolean, blocking: Boolean): Unit = {
  logDebug(s"Unpersisting TorrentBroadcast $id")
  SparkEnv.get.blockManager.master.removeBroadcast(id, removeFromDriver, blocking)
}
As the code shows, TorrentBroadcast's unpersist method actually delegates to the removeBroadcast method of BlockManagerMaster, a subcomponent reached through BlockManager, to unpersist the broadcast object.