Spark Master-Worker Process Communication
1、Project Significance
1. Gain a deeper understanding of the communication mechanism between Spark's Master and Worker;
2. Naming stays consistent with the Spark source code (e.g., the communication message classes use the same names);
3. Deepen understanding of the master-slave heartbeat detection mechanism (HeartBeat), which helps with later secondary development of the Spark source code.
2、Requirements Analysis
1. A Worker registers with the Master; the Master completes the registration and replies to the Worker that registration succeeded;
2. The Worker sends heartbeats on a schedule, and the Master receives them;
3. After receiving a Worker's heartbeat, the Master updates that Worker's last-heartbeat timestamp;
4. The Master starts a scheduled task that periodically checks which registered Workers have stopped sending heartbeats and removes them from the HashMap;
5. Deploy Master and Worker in a distributed fashion (on Linux) -> package the Maven project -> upload to Linux.
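Before the full Akka implementation below, requirements 3 and 4 can be sketched without any remoting. This is a minimal, hypothetical bookkeeping object (the names `HeartbeatTable`, `onHeartBeat`, `removeTimedOut`, and `timeoutMs` are illustrative, not part of the project):

```scala
import scala.collection.mutable

// Akka-free sketch of requirements 3 and 4: keep a last-heartbeat
// timestamp per worker, and periodically drop stale entries.
object HeartbeatTable {

  // worker id -> last heartbeat time in milliseconds
  val lastBeat = mutable.Map[String, Long]()

  // Requirement 3: on a heartbeat, refresh that worker's timestamp
  def onHeartBeat(workerId: String, now: Long): Unit =
    lastBeat(workerId) = now

  // Requirement 4: collect the stale ids first, then remove them,
  // mirroring the filter-then-remove pattern used in SparkMaster below
  def removeTimedOut(now: Long, timeoutMs: Long): Unit = {
    val stale = lastBeat.collect { case (id, t) if now - t > timeoutMs => id }
    stale.foreach(lastBeat.remove)
  }

  def main(args: Array[String]): Unit = {
    onHeartBeat("worker-1", now = 0L)
    onHeartBeat("worker-2", now = 5000L)
    removeTimedOut(now = 7000L, timeoutMs = 6000L) // worker-1 is 7000 ms stale
    assert(lastBeat.keySet == Set("worker-2"))      // only worker-2 survives
  }
}
```

The full project simply wraps this bookkeeping in actor messages: the heartbeat update runs inside a `HeartBeat` handler and the cleanup runs on a scheduler tick.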
3、Program Framework Diagram
(framework diagram image omitted)
Sample code
Master sample code:
package com.lj.akka.spark.master
import akka.actor.{Actor, ActorSystem, Props}
import com.lj.akka.spark.common._
import com.typesafe.config.ConfigFactory
import scala.collection.mutable
import scala.concurrent.duration.DurationInt
/**
* @author Administrator
* @create 2020-03-24
*/
class SparkMaster extends Actor {
  // Collect Worker info into a HashMap
  val workers = mutable.Map[String, WorkerInfo]()

  override def receive: Receive = {
    case "spark master start" => {
      println("spark master started......")
      // As soon as the Master starts, begin checking Worker heartbeats and remove any that time out
      self ! StartTimeOutWorker
    }
    // Receive a Worker's registration info
    case RegisterWorkerInfo(worker_id, worker_cpu, worker_ram) => {
      println("Master received a Worker's registration info......")
      // If workers does not yet contain this worker, add it
      if (!workers.contains(worker_id)) {
        // Create a WorkerInfo object
        val workerInfo = new WorkerInfo(worker_id, worker_cpu, worker_ram)
        workers += ((worker_id, workerInfo))
        println("Master's workers:" + workers)
        // Reply to the worker that registration succeeded
        sender() ! RegisteredWorkerInfo
      }
    }
    // Receive a Worker's heartbeat
    case HeartBeat(worker_id) => {
      // Update the worker's last-heartbeat time.
      // Use workers.get instead of workers(worker_id): a worker that has already
      // been removed for timing out may still send heartbeats, and a direct
      // lookup would then throw a NoSuchElementException.
      workers.get(worker_id) match {
        case Some(workerInfo) =>
          workerInfo.lastHeartBeatTime = System.currentTimeMillis()
          println(s"Master received heartbeat from Worker id:${worker_id} and updated its heartbeat time...")
        case None =>
          println(s"Master received heartbeat from unregistered Worker id:${worker_id}, ignoring...")
      }
    }
    // Trigger the Master's heartbeat-check mechanism
    case StartTimeOutWorker => {
      import context.dispatcher
      context.system.scheduler.schedule(0 millis, 9000 millis, self, RemoveTimeOutWorker)
    }
    case RemoveTimeOutWorker => {
      // 1. Get the WorkerInfo objects of all workers
      val workerInfos = workers.values
      // 2. Get the current time
      val now_time = System.currentTimeMillis()
      // 3. Remove all workers with no heartbeat for more than 6 seconds
      workerInfos.filter(workerInfo => (now_time - workerInfo.lastHeartBeatTime) > 6000)
        .foreach(workerInfo => workers.remove(workerInfo.id))
      println("Currently " + workers.size + " Worker(s) alive!")
    }
  }
}
object SparkMaster {
  def main(args: Array[String]): Unit = {
    // val master_host = "127.0.0.1"
    // val master_port = 13336
    // val master_name = "SparkMaster01"

    // Read the settings from command-line arguments
    if (args.length != 3) {
      println("Usage: SparkMaster <master_host> <master_port> <master_name>")
      sys.exit()
    }
    val master_host = args(0)
    val master_port = args(1)
    val master_name = args(2)

    // Step 1: create the config
    val config = ConfigFactory.parseString(
      s"""
         |akka.actor.provider="akka.remote.RemoteActorRefProvider"
         |akka.remote.netty.tcp.hostname=$master_host
         |akka.remote.netty.tcp.port=$master_port
         |akka.actor.warn-about-java-serializer-usage=false
      """.stripMargin)
    // Step 2: create the ActorSystem
    val sparkMasterSystem = ActorSystem("SparkMasterSystem", config)
    // Step 3: create the ActorRef
    val sparkMasterActorRef = sparkMasterSystem.actorOf(Props[SparkMaster], master_name)
    // Start the Actor
    sparkMasterActorRef ! "spark master start"
  }
}
Worker sample code:
package com.lj.akka.spark.worker
import akka.actor.{Actor, ActorSelection, ActorSystem, Props}
import com.lj.akka.spark.common.{HeartBeat, RegisterWorkerInfo, RegisteredWorkerInfo, SendHeartBeat}
import com.typesafe.config.ConfigFactory
import scala.concurrent.duration.DurationInt
/**
* @author Administrator
* @create 2020-03-24
*/
class SparkWorker(master_host: String, master_port: Int, master_actorRef_name: String) extends Actor {
  var masterProxy: ActorSelection = _
  val worker_id = java.util.UUID.randomUUID.toString
  val worker_cpu = 16
  val worker_ram = 128 * 1024

  override def preStart(): Unit = {
    masterProxy = context.actorSelection(
      s"akka.tcp://SparkMasterSystem@${master_host}:${master_port}/user/${master_actorRef_name}")
  }

  override def receive: Receive = {
    case "spark worker start" => {
      println("spark worker started......")
      // Once the worker starts, it registers with the Master
      masterProxy ! RegisterWorkerInfo(worker_id, worker_cpu, worker_ram)
    }
    // Receive the Master's registration reply
    case RegisteredWorkerInfo => {
      println(s"worker id:${worker_id} registered with the Master successfully.")
      // After successful registration, set up a timer that periodically sends SendHeartBeat to self
      import context.dispatcher
      /**
       * Notes:
       * 1. 0: run immediately, with no initial delay
       * 2. 3000: run every 3 seconds
       * 3. self: send the message to this actor itself
       * 4. SendHeartBeat: the message to send
       */
      context.system.scheduler.schedule(0 millis, 3000 millis, self, SendHeartBeat)
    }
    // Receive the timer's periodic message to self
    case SendHeartBeat => {
      // Send a heartbeat to the Master
      println("Worker id:" + worker_id + " sending heartbeat to Master...")
      masterProxy ! HeartBeat(worker_id)
    }
  }
}
object SparkWorker {
  def main(args: Array[String]): Unit = {
    // val master_host = "127.0.0.1"
    // val master_port = 13336
    // val master_actorRef_name = "SparkMaster01"
    //
    // val worker_host = "127.0.0.1"
    // val worker_port = 13331
    // val worker_name = "SparkWorker01"

    // Read the settings from command-line arguments
    if (args.length != 6) {
      println("Usage: SparkWorker <master_host> <master_port> <master_actorRef_name> <worker_host> <worker_port> <worker_name>")
      sys.exit()
    }
    val master_host = args(0)
    val master_port = args(1)
    val master_actorRef_name = args(2)
    val worker_host = args(3)
    val worker_port = args(4)
    val worker_name = args(5)

    // Step 1: create the config
    val config = ConfigFactory.parseString(
      s"""
         |akka.actor.provider="akka.remote.RemoteActorRefProvider"
         |akka.remote.netty.tcp.hostname=$worker_host
         |akka.remote.netty.tcp.port=$worker_port
         |akka.actor.warn-about-java-serializer-usage=false
      """.stripMargin)
    // Step 2: create the ActorSystem
    val sparkWorkerSystem = ActorSystem("SparkWorkerSystem", config)
    // Step 3: create the ActorRef
    val sparkWorkerActorRef = sparkWorkerSystem
      .actorOf(Props(new SparkWorker(master_host, master_port.toInt, master_actorRef_name)), worker_name)
    // Start the Actor
    sparkWorkerActorRef ! "spark worker start"
  }
}
Protocol sample code:
package com.lj.akka.spark.common

/**
 * @author Administrator
 * @create 2020-03-24
 */

// A Worker sends its registration info to the Master
case class RegisterWorkerInfo(id: String, cpu: Int, ram: Int)

// WorkerInfo is what the Master stores in its HashMap
class WorkerInfo(val id: String, val cpu: Int, val ram: Int) {
  var lastHeartBeatTime: Long = System.currentTimeMillis()
}

// When a Worker registers successfully, the Master replies with RegisteredWorkerInfo
case object RegisteredWorkerInfo

// Message a Worker's timer periodically sends to the Worker itself
case object SendHeartBeat

// Protocol message a Worker sends to the Master on each timer tick
case class HeartBeat(worker_id: String)

// Message the Master sends itself to start the timed-out-worker check
case object StartTimeOutWorker

// Message the Master sends itself to remove Workers whose heartbeats have timed out
case object RemoveTimeOutWorker
Run result screenshots (images omitted)
4、Packaging the Maven Project (SparkMaster and SparkWorker are packaged the same way)
Step 1: specify the main class to run in the pom file.
Step 2: open the Maven tool window -> double-click package to build the project -> when it finishes, find the Jar in the generated target directory.
Step 3: run the Jar (to test on Windows: cmd -> java -jar <jar> <args>).
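One common way to do Step 1 is with the maven-assembly-plugin. The fragment below is a sketch only: the plugin choice and jar name are assumptions, and `mainClass` must match the class being packaged (here the Master; use `com.lj.akka.spark.worker.SparkWorker` for the Worker jar).

```xml
<!-- Hypothetical pom.xml fragment: build a runnable jar with dependencies.
     Plugin choice is an assumption; mainClass must match the packaged class. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>com.lj.akka.spark.master.SparkMaster</mainClass>
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Then run it with, for example (the jar name is a placeholder), `java -jar master-jar-with-dependencies.jar 127.0.0.1 13336 SparkMaster01`, matching the three arguments that SparkMaster.main expects.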
A review of earlier material to reinforce the fundamentals!
Learning material from: teacher Han Shunping, Beijing Shangguigu (Atguigu) — Shangguigu Big Data: Scala
Make a little progress every day!!!