1. Heartbeat Connection
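Spark Standalone uses a classic Master/Slave architecture in which the slaves are the cluster's Workers. After a Worker starts, it registers with the Master; once registration succeeds, it periodically sends heartbeats to the Master to report its status. The Master, in turn, keeps checking whether any registered Worker has timed out.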
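The bookkeeping behind this protocol can be sketched in a few lines. The following is a minimal, simplified model and not Spark's actual implementation: the class `HeartbeatRegistry`, its method names, and the explicit `nowMs` parameter are all hypothetical and chosen for illustration only.

```scala
import scala.collection.mutable

// Hypothetical, simplified model of the Master's bookkeeping; not Spark's real classes.
final class HeartbeatRegistry(timeoutMs: Long) {
  // workerId -> timestamp (ms) of the most recent heartbeat
  private val lastHeartbeat = mutable.Map.empty[String, Long]

  // A newly registered worker counts as having just sent a heartbeat.
  def register(workerId: String, nowMs: Long): Unit =
    lastHeartbeat(workerId) = nowMs

  // Records a heartbeat; returns false if the worker is not registered.
  def heartbeat(workerId: String, nowMs: Long): Boolean =
    if (lastHeartbeat.contains(workerId)) {
      lastHeartbeat(workerId) = nowMs
      true
    } else {
      false
    }

  // Removes and returns every worker whose last heartbeat is older than the timeout.
  def removeTimedOut(nowMs: Long): Set[String] = {
    val dead = lastHeartbeat.collect {
      case (id, ts) if nowMs - ts > timeoutMs => id
    }.toSet
    lastHeartbeat --= dead
    dead
  }
}
```

Passing the current time in explicitly, instead of calling `System.currentTimeMillis()` inside the class, keeps the sketch deterministic and easy to exercise with fixed timestamps.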
First, the Worker sends its heartbeat messages:
Worker # handleRegisterResponse:
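case RegisteredWorker(masterRef, masterWebUiUrl, masterAddress) =>
  ...
  // After a successful registration, schedule a periodic task that
  // sends SendHeartbeat to the Worker itself every HEARTBEAT_MILLIS.
  forwordMessageScheduler.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = Utils.tryLogNonFatalError {
      self.send(SendHeartbeat)
    }
  }, 0, HEARTBEAT_MILLIS, TimeUnit.MILLISECONDS)
  ...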
Worker # receive:
case SendHeartbeat =>
  // Forward the heartbeat to the Master only while connected to it.
  if (connected) { sendToMaster(Heartbeat(workerId, self)) }
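The scheduling pattern used by the Worker can be tried in isolation with the standard `java.util.concurrent` API. The sketch below is a standalone illustration and not Spark code: the object `HeartbeatSketch`, the method `sendBeats`, and the five-second safety limit are assumptions introduced only for this example.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

// Hypothetical helper, not part of Spark.
object HeartbeatSketch {
  // Runs `beat` immediately and then once every `periodMs` milliseconds.
  // Returns true once at least `count` beats have fired, or false if that
  // does not happen within 5 seconds. The scheduler is always shut down.
  def sendBeats(count: Int, periodMs: Long)(beat: () => Unit): Boolean = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    val remaining = new CountDownLatch(count)
    scheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = {
        beat()
        remaining.countDown()
      }
    }, 0, periodMs, TimeUnit.MILLISECONDS)
    try remaining.await(5, TimeUnit.SECONDS)
    finally scheduler.shutdownNow()
  }
}
```

As in the Worker code, the initial delay is 0, so the first beat fires right away. Because the task keeps running until the scheduler is shut down, `beat` may be invoked slightly more than `count` times.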
The Master receives the heartbeat and updates the time of that Worker's last heartbeat.
Master # receive:
case Heartbeat(workerId, worker) =>
  idToWorker.get(workerId) match {
    case Some(workerInfo) =>
      workerInfo.lastHeartbeat = System.currentTimeMillis()
    case None =>
      if (workers.map(_.id).contains(workerId)) {
        logWarning(s"Got heartbeat from unregistered worker $workerId." +
" Asking it to