Hive on Spark Source Code Analysis (5): RemoteDriver

Hive on Spark Source Code Analysis (1): SparkTask
Hive on Spark Source Code Analysis (2): SparkSession and HiveSparkClient
Hive on Spark Source Code Analysis (3): SparkClient and SparkClientImpl (Part 1)
Hive on Spark Source Code Analysis (4): SparkClient and SparkClientImpl (Part 2)
Hive on Spark Source Code Analysis (5): RemoteDriver
Hive on Spark Source Code Analysis (6): RemoteSparkJobMonitor and JobHandle


RemoteDriver exchanges job-related messages with SparkClient and submits jobs to the Spark cluster. SparkClientImpl launches a RemoteDriver in a new process by invoking RemoteDriver.main.
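As a quick recap of the previous article, SparkClientImpl assembles the child process's command line using exactly the flags that RemoteDriver parses in its constructor (shown below). The following is only an illustrative sketch of that argument list; the variable names and the surrounding assembly code are assumptions, not the actual SparkClientImpl source:

    // Sketch only: the argument shape that RemoteDriver expects on its command line.
    List<String> argv = Lists.newArrayList();
    argv.add("--remote-host");
    argv.add(serverAddress);               // address of the RpcServer in the Hive client process
    argv.add("--remote-port");
    argv.add(String.valueOf(serverPort));  // port of the RpcServer
    argv.add("--client-id");
    argv.add(clientId);                    // used for SASL authentication
    argv.add("--secret");
    argv.add(secret);
    for (Map.Entry<String, String> e : conf.entrySet()) {
      argv.add("--conf");                  // each Spark conf entry is passed as key=value
      argv.add(e.getKey() + "=" + e.getValue());
    }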

The main method

 
 
    public static void main(String[] args) throws Exception {
      new RemoteDriver(args).run();
    }


The run method blocks until the driver is shut down, then shuts down the executor threads and deletes the local temporary directory:

 
 
    private void run() throws InterruptedException {
      synchronized (shutdownLock) {
        while (running) {
          shutdownLock.wait();
        }
      }
      executor.shutdownNow();
      try {
        FileUtils.deleteDirectory(localTmpDir);
      } catch (IOException e) {
        LOG.warn("Failed to delete local tmp dir: " + localTmpDir, e);
      }
    }


Next, let's look at RemoteDriver's private constructor. Parsing the arguments, initializing the environment, and submitting jobs are all done in this private constructor. The first step is to parse the arguments that SparkClient passes to RemoteDriver and assign them to the corresponding SparkConf entries:

  
  
    SparkConf conf = new SparkConf();
    String serverAddress = null;
    int serverPort = -1;
    for (int idx = 0; idx < args.length; idx += 2) {
      String key = args[idx];
      if (key.equals("--remote-host")) {
        serverAddress = getArg(args, idx);
      } else if (key.equals("--remote-port")) {
        serverPort = Integer.parseInt(getArg(args, idx));
      } else if (key.equals("--client-id")) {
        conf.set(SparkClientFactory.CONF_CLIENT_ID, getArg(args, idx));
      } else if (key.equals("--secret")) {
        conf.set(SparkClientFactory.CONF_KEY_SECRET, getArg(args, idx));
      } else if (key.equals("--conf")) {
        String[] val = getArg(args, idx).split("[=]", 2);
        conf.set(val[0], val[1]);
      } else {
        throw new IllegalArgumentException("Invalid command line: "
          + Joiner.on(" ").join(args));
      }
    }
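The getArg helper used above simply returns the value that follows each flag. A minimal sketch of what it presumably looks like, reconstructed from how it is called rather than copied from the source:

    // Sketch: return args[keyIdx + 1], failing fast if the value is missing.
    private static String getArg(String[] args, int keyIdx) {
      int valIdx = keyIdx + 1;
      if (args.length <= valIdx) {
        throw new IllegalArgumentException("Invalid command line: "
          + Joiner.on(" ").join(args));
      }
      return args[valIdx];
    }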

The rest of the constructor is walked through with inline comments:
   
   
    // Thread pool used to run submitted jobs
    executor = Executors.newCachedThreadPool();
    LOG.info("Connecting to: {}:{}", serverAddress, serverPort);
    // Copy the configuration RemoteDriver needs into mapConf
    Map<String, String> mapConf = Maps.newHashMap();
    for (Tuple2<String, String> e : conf.getAll()) {
      mapConf.put(e._1(), e._2());
      LOG.debug("Remote Driver configured with: " + e._1() + "=" + e._2());
    }
    // clientId and secret are used to authenticate the connection to the RpcServer.
    // Authentication is based on SASL; the details are not covered here.
    // The SASL step is implemented as a handler in the pipeline.
    String clientId = mapConf.get(SparkClientFactory.CONF_CLIENT_ID);
    Preconditions.checkArgument(clientId != null, "No client ID provided.");
    String secret = mapConf.get(SparkClientFactory.CONF_KEY_SECRET);
    Preconditions.checkArgument(secret != null, "No secret provided.");
    // Read hive.spark.client.rpc.threads, falling back to the default of 8
    int threadCount = new RpcConfiguration(mapConf).getRpcThreadCount();
    this.egroup = new NioEventLoopGroup(
        threadCount,
        new ThreadFactoryBuilder()
            .setNameFormat("Driver-RPC-Handler-%d")
            .setDaemon(true)
            .build());
    // protocol is actually a handler, similar to ClientProtocol
    this.protocol = new DriverProtocol();
    // The RPC library takes care of timing out this.
    // createClient returns a Promise<Rpc>; calling get() on that Future yields an Rpc object.
    // Internally it creates a Bootstrap and connects to the RpcServer.
    this.clientRpc = Rpc.createClient(mapConf, egroup, serverAddress, serverPort,
        clientId, secret, protocol).get();
    this.running = true;
The way createClient handles timing out while connecting to the RpcServer is the same as the timeout handling in SparkClientImpl; only the configuration parameter that sets the timeout differs.
Let's look at the implementation of createClient. It first builds the RPC configuration and reads the value of hive.spark.client.connect.timeout into connectTimeoutMs:
    
    
    public static Promise<Rpc> createClient(
        Map<String, String> config,
        final NioEventLoopGroup eloop,
        String host,
        int port,
        final String clientId,
        final String secret,
        final RpcDispatcher dispatcher) throws Exception {
      final RpcConfiguration rpcConf = new RpcConfiguration(config);
      int connectTimeoutMs = (int) rpcConf.getConnectTimeoutMs();

It then connects to the ServerBootstrap. As analyzed in the earlier articles, the Hive client process creates the RpcServer (that is, a ServerBootstrap), and the Bootstrap here connects to it:
     
     
    final ChannelFuture cf = new Bootstrap()
        .group(eloop)
        .handler(new ChannelInboundHandlerAdapter() { })
        .channel(NioSocketChannel.class)
        .option(ChannelOption.SO_KEEPALIVE, true)
        .option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectTimeoutMs)
        .connect(host, port);

The timeout handling below is the same as that of registerClient, which we analyzed earlier:
      
      
    // Set up a timeout to undo everything.
    final Runnable timeoutTask = new Runnable() {
      @Override
      public void run() {
        promise.setFailure(new TimeoutException("Timed out waiting for RPC server connection."));
      }
    };
    final ScheduledFuture<?> timeoutFuture = eloop.schedule(timeoutTask,
        rpcConf.getServerConnectTimeoutMs(), TimeUnit.MILLISECONDS);
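Conceptually, the promise is completed and the scheduled timeout cancelled once the connection and the SASL handshake succeed. The snippet below is only a simplified illustration of that pattern, not the actual body of Rpc.createClient:

    // Sketch only: illustrates how the connect future interacts with the timeout.
    cf.addListener(new ChannelFutureListener() {
      @Override
      public void operationComplete(ChannelFuture future) {
        if (!future.isSuccess()) {
          // Connection failed: fail the promise and cancel the pending timeout task.
          promise.setFailure(future.cause());
          timeoutFuture.cancel(true);
        }
        // On success, the SASL handler added to the pipeline finishes the handshake,
        // completes the promise with the Rpc object and cancels timeoutFuture.
      }
    });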

Now let's return to RemoteDriver. Next, a listener is added to clientRpc so that the driver is shut down when the RPC channel is closed:
    
    
    this.clientRpc.addListener(new Rpc.Listener() {
      @Override
      public void rpcClosed(Rpc rpc) {
        LOG.warn("Shutting down driver because RPC channel was closed.");
        shutdown(null);
      }
    });

Then the SparkContext is created and a JobContextImpl is instantiated to hold runtime information about job execution:
     
     
    try {
      JavaSparkContext sc = new JavaSparkContext(conf);
      sc.sc().addSparkListener(new ClientListener());
      synchronized (jcLock) {
        // Initialize the job context, which holds runtime information about the job execution context.
        jc = new JobContextImpl(sc, localTmpDir);
        jcLock.notifyAll();
      }
    } catch (Exception e) {
      LOG.error("Failed to start SparkContext: " + e, e);
      // If an exception is thrown while initializing sc/jc, send the error and shut everything down.
      shutdown(e);
      synchronized (jcLock) {
        jcLock.notifyAll();
      }
      throw e;
    }

Jobs submitted before the SparkContext is ready are held in a waiting queue, jobQueue, so the final step is to submit everything in that queue:
    
    
      synchronized (jcLock) {
        // Submit all jobs in the waiting queue
        for (Iterator<JobWrapper<?>> it = jobQueue.iterator(); it.hasNext();) {
          it.next().submit();
        }
      }
    }

That completes the construction of RemoteDriver. Let's now look at the shutdown and submit methods used above.
The shutdown method stops the RemoteDriver process. Its parameter is a Throwable: a normal shutdown passes null, otherwise an error is passed in. Before performing any shutdown work it first checks whether the RemoteDriver is currently running; if it is not, nothing needs to be done.
    
    
    private synchronized void shutdown(Throwable error) {
      if (running) {
        if (error == null) {
          LOG.info("Shutting down remote driver.");
        } else {
          LOG.error("Shutting down remote driver due to error: " + error, error);
        }
        running = false;
        for (JobWrapper<?> job : activeJobs.values()) {
          cancelJob(job);
        }
        if (error != null) {
          protocol.sendError(error);
        }
        if (jc != null) {
          jc.stop();
        }
        clientRpc.close();
        egroup.shutdownGracefully();
        synchronized (shutdownLock) {
          shutdownLock.notifyAll();
        }
      }
    }
The code is straightforward: it cancels any jobs still left in the activeJobs map, stops the job context, closes the RPC connection, and shuts down the NioEventLoopGroup.

The submit method is responsible for submitting a job for execution. If the job context is not ready at submission time, the job is first added to the waiting queue; otherwise JobWrapper's submit method is called directly:
   
   
    private void submit(JobWrapper<?> job) {
      synchronized (jcLock) {
        if (jc != null) {
          job.submit();
        } else {
          LOG.info("SparkContext not yet up, queueing job request.");
          jobQueue.add(job);
        }
      }
    }

Apart from a few simple private fields, RemoteDriver defines three important inner classes: JobWrapper, ClientListener and DriverProtocol. As the walkthrough of the constructor above shows, these three inner classes are used throughout RemoteDriver's implementation, so let's look at what each of them does and how it is implemented.

1. JobWrapper

JobWrapper implements the Callable&lt;Void&gt; interface; combined with its name, it is easy to guess that this class wraps a job so that it can be submitted for execution. We will go straight into JobWrapper's methods; its fields are introduced as they appear.
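For orientation, here is an approximate picture of JobWrapper's fields, reconstructed from how the methods below use them; the names and types are therefore inferred rather than copied verbatim:

    private class JobWrapper<T extends Serializable> implements Callable<Void> {
      private final BaseProtocol.JobRequest<T> req;  // the JobRequest message being executed
      private final List<JavaFutureAction<?>> jobs;  // async actions of the Spark jobs spawned by req.job
      private final AtomicInteger jobEndReceived;    // number of JobEnd events received so far
      private int completed;                         // number of Spark jobs whose futures have completed
      private SparkCounters sparkCounters;           // counters registered via jc.monitor()
      private Set<Integer> cachedRDDIds;             // RDDs cached during execution, released afterwards
      private Integer sparkJobId;                    // id of the first monitored Spark job
      private Future<?> future;                      // future returned by executor.submit(this)
      // ... methods discussed below ...
    }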

As a Callable task, its core method is call; let's look at JobWrapper's implementation of it.

The first step is to send a JobStarted message through protocol:

   
   
    @Override
    public Void call() throws Exception {
      protocol.jobStarted(req.id);

Then the call method of the job wrapped by the JobWrapper is executed:
    
    
      T result = req.job.call(jc);

The futures are then used to wait for the spawned Spark jobs to complete:
     
     
      for (JavaFutureAction<?> future : jobs) {
        future.get();
        completed++;
        LOG.debug("Client job {}: {} of {} Spark jobs finished.",
            req.id, completed, jobs.size());
      }

Next, if the Spark job is not null, jobEndReceived is used to wait until all TaskEnd/JobEnd events have been processed, because jobEndReceived is only updated by jobDone, which is called from the JobEnd event handler:
     
     
      if (sparkJobId != null) {
        SparkJobInfo sparkJobInfo = jc.sc().statusTracker().getJobInfo(sparkJobId);
        if (sparkJobInfo != null && sparkJobInfo.stageIds() != null &&
            sparkJobInfo.stageIds().length > 0) {
          synchronized (jobEndReceived) {
            while (jobEndReceived.get() < jobs.size()) {
              jobEndReceived.wait();
            }
          }
        }
      }

Then a snapshot of the SparkCounters is taken and jobFinished is called, with the result set to the result of job.call above and the error set to null:
      
      
      SparkCounters counters = null;
      if (sparkCounters != null) {
        counters = sparkCounters.snapshot();
      }
      protocol.jobFinished(req.id, result, null, counters);

If an error occurs along the way, the JobFinished message is still sent, except that this time the result is null and the error is the caught Throwable t:
      
      
    } catch (Throwable t) {
      // Catch throwables in a best-effort to report job status back to the client. It's
      // re-thrown so that the executor can destroy the affected thread (or the JVM can
      // die or whatever would happen if the throwable bubbled up).
      LOG.info("Failed to run job " + req.id, t);
      protocol.jobFinished(req.id, null, t,
          sparkCounters != null ? sparkCounters.snapshot() : null);
      throw new ExecutionException(t);

Finally, the job is removed from the active job map activeJobs and all cached RDDs are released:
       
       
    } finally {
      jc.setMonitorCb(null);
      activeJobs.remove(req.id);
      releaseCache();
    }
    return null;

That is the whole call method. Now let's look at the other methods in JobWrapper.

The submit method is the one called from outside to submit the task; internally it hands the JobWrapper to the executor to run on a separate thread:

   
   
    void submit() {
      this.future = executor.submit(this);
    }

The jobDone method updates jobEndReceived:

   
   
    void jobDone() {
      synchronized (jobEndReceived) {
        jobEndReceived.incrementAndGet();
        jobEndReceived.notifyAll();
      }
    }

When the job is done, releaseCache releases the cached RDDs:

   
   
    void releaseCache() {
      if (cachedRDDIds != null) {
        for (Integer cachedRDDId: cachedRDDIds) {
          jc.sc().sc().unpersistRDD(cachedRDDId, false);
        }
      }
    }


2. ClientListener

ClientListener extends JavaSparkListener and listens to events from the Spark scheduler. It overrides three event-handling methods of its parent class.

When a job starts, onJobStart is triggered and the job's stage ids and job id are stored in the stageToJobId hashmap:

   
   
    @Override
    public void onJobStart(SparkListenerJobStart jobStart) {
      synchronized (stageToJobId) {
        for (int i = 0; i < jobStart.stageIds().length(); i++) {
          stageToJobId.put((Integer) jobStart.stageIds().apply(i), jobStart.jobId());
        }
      }
    }

When a job ends, onJobEnd is triggered. First all &lt;stageId, jobId&gt; entries belonging to that job are removed from stageToJobId:
    
    
    @Override
    public void onJobEnd(SparkListenerJobEnd jobEnd) {
      synchronized (stageToJobId) {
        for (Iterator<Map.Entry<Integer, Integer>> it = stageToJobId.entrySet().iterator();
            it.hasNext();) {
          Map.Entry<Integer, Integer> e = it.next();
          if (e.getValue() == jobEnd.jobId()) {
            it.remove();
          }
        }
      }

Then the client id for this job is looked up, and with that clientId the corresponding JobWrapper in activeJobs has its jobDone method called to update its jobEndReceived:
     
     
      String clientId = getClientId(jobEnd.jobId());
      if (clientId != null) {
        activeJobs.get(clientId).jobDone();
      }
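getClientId maps a Spark job id back to the client job id by scanning the futures held by each active JobWrapper. A minimal sketch of that lookup, reconstructed from how it is used (details may differ from the actual source):

    // Sketch: find the client job whose monitored Spark jobs include the given jobId.
    private String getClientId(Integer jobId) {
      for (Map.Entry<String, JobWrapper<?>> e : activeJobs.entrySet()) {
        for (JavaFutureAction<?> future : e.getValue().jobs) {
          if (future.jobIds().contains(jobId)) {
            return e.getKey();
          }
        }
      }
      return null;
    }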

When a task ends, onTaskEnd is triggered and the corresponding task metrics are sent through protocol:
     
     
    @Override
    public void onTaskEnd(SparkListenerTaskEnd taskEnd) {
      if (taskEnd.reason() instanceof org.apache.spark.Success$
          && !taskEnd.taskInfo().speculative()) {
        Metrics metrics = new Metrics(taskEnd.taskMetrics());
        Integer jobId;
        synchronized (stageToJobId) {
          jobId = stageToJobId.get(taskEnd.stageId());
        }
        // TODO: implement implicit AsyncRDDActions conversion instead of jc.monitor()?
        // TODO: how to handle stage failures?
        String clientId = getClientId(jobId);
        if (clientId != null) {
          protocol.sendMetrics(clientId, jobId, taskEnd.stageId(),
              taskEnd.taskInfo().taskId(), metrics);
        }
      }
    }

3. DriverProtocol

RemoteDriver defines one main handler: DriverProtocol extends BaseProtocol. Its implementation is broadly similar to ClientProtocol, and these two are exactly the components through which SparkClient and RemoteDriver talk to each other. DriverProtocol defines a set of message-sending methods, which again work by passing different message types to Rpc's call method; these message types match exactly the ones that ClientProtocol's handle methods can process:

   
   
    private class DriverProtocol extends BaseProtocol {

      // Send an Error message
      void sendError(Throwable error) {
        LOG.debug("Send error to Client: {}", Throwables.getStackTraceAsString(error));
        clientRpc.call(new Error(error));
      }

      // Send a JobResult message
      <T extends Serializable> void jobFinished(String jobId, T result,
          Throwable error, SparkCounters counters) {
        LOG.debug("Send job({}) result to Client.", jobId);
        clientRpc.call(new JobResult(jobId, result, error, counters));
      }

      // Send a JobStarted message
      void jobStarted(String jobId) {
        clientRpc.call(new JobStarted(jobId));
      }

      // Send a JobSubmitted message
      void jobSubmitted(String jobId, int sparkJobId) {
        LOG.debug("Send job({}/{}) submitted to Client.", jobId, sparkJobId);
        clientRpc.call(new JobSubmitted(jobId, sparkJobId));
      }

      // Send a JobMetrics message
      void sendMetrics(String jobId, int sparkJobId, int stageId, long taskId, Metrics metrics) {
        LOG.debug("Send task({}/{}/{}/{}) metric to Client.", jobId, sparkJobId, stageId, taskId);
        clientRpc.call(new JobMetrics(jobId, sparkJobId, stageId, taskId, metrics));
      }

Conversely, the message types handled by DriverProtocol's handle methods are exactly the ones sent by SparkClient's message-sending methods.
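These messages are plain serializable beans defined in BaseProtocol. As a rough idea of their shape, here are simplified versions of two of them, with field names inferred from how the handlers below access them (treat this as a sketch, not the real definitions):

    // Sketch only: simplified message beans; the actual classes live in BaseProtocol.
    private static class CancelJob implements Serializable {
      final String id;                        // client job id to cancel
      CancelJob(String id) { this.id = id; }
    }

    private static class JobRequest<T extends Serializable> implements Serializable {
      final String id;                        // client job id
      final Job<T> job;                       // the job to run on the driver
      JobRequest(String id, Job<T> job) {
        this.id = id;
        this.job = job;
      }
    }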

When handling a CancelJob message, the driver looks up the job in activeJobs and tries to cancel it:

   
   
    private void handle(ChannelHandlerContext ctx, CancelJob msg) {
      JobWrapper<?> job = activeJobs.get(msg.id);
      if (job == null || !cancelJob(job)) {
        LOG.info("Requested to cancel an already finished job.");
      }
    }
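cancelJob itself (also used by shutdown above) cancels both the Spark jobs monitored by the wrapper and the wrapper's own executor future. A minimal sketch of what it presumably does, not copied verbatim from the source:

    // Sketch: cancel the wrapped Spark jobs and the JobWrapper's executor future.
    private boolean cancelJob(JobWrapper<?> job) {
      boolean cancelled = false;
      for (JavaFutureAction<?> action : job.jobs) {
        cancelled |= action.cancel(true);
      }
      return cancelled | (job.future != null && job.future.cancel(true));
    }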


When handling an EndSession message, RemoteDriver's shutdown method is called directly to shut the driver down:

   
   
    private void handle(ChannelHandlerContext ctx, EndSession msg) {
      LOG.debug("Shutting down due to EndSession request.");
      shutdown(null);
    }

When handling a JobRequest message, the message is wrapped in a JobWrapper, which is added to the active job map activeJobs and then submitted. The case where the job context is not ready at submission time is handled inside submit, so no extra handling is needed here:
    
    
    private void handle(ChannelHandlerContext ctx, JobRequest msg) {
      LOG.info("Received job request {}", msg.id);
      JobWrapper<?> wrapper = new JobWrapper<Serializable>(msg);
      activeJobs.put(msg.id, wrapper);
      submit(wrapper);
    }

When handling a SyncJobRequest message, the handler first checks whether the job context is ready, and waits for it if not:
     
     
    private Object handle(ChannelHandlerContext ctx, SyncJobRequest msg) throws Exception {
      // In case the job context is not up yet, let's wait, since this is supposed to be a
      // "synchronous" RPC.
      LOG.debug("liban: DriverProtocol received SyncJobRequest msg. waiting for jc to be up");
      if (jc == null) {
        synchronized (jcLock) {
          while (jc == null) {
            jcLock.wait();
            if (!running) {
              throw new IllegalStateException("Remote context is shutting down.");
            }
          }
        }
      }

Then a MonitorCallback is set and the job's call method is invoked directly to execute the task:
      
      
      jc.setMonitorCb(new MonitorCallback() {
        @Override
        public void call(JavaFutureAction<?> future,
            SparkCounters sparkCounters, Set<Integer> cachedRDDIds) {
          throw new IllegalStateException(
              "JobContext.monitor() is not available for synchronous jobs.");
        }
      });
      try {
        LOG.debug("liban: type of job in SyncJobRequest msg: " + msg.job.getClass().getSimpleName());
        return msg.job.call(jc);
      } finally {
        jc.setMonitorCb(null);
      }

Notice that although both messages wrap a job, JobRequest and SyncJobRequest are handled differently. The reason is that they wrap different kinds of jobs. Following the call chain, SyncJobRequest is used to wrap jobs of type GetAppIDJob, GetStageInfoJob, AddJarJob, AddFileJob, GetExecutorCountJob, GetDefaultParallelismJob and GetJobInfo. These jobs finish quickly, so the methods that send SyncJobRequest are invoked through SparkClient's run method, and the job is executed directly on the thread that handles the message.
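To make the contrast concrete, such a lightweight synchronous job is essentially a one-liner against the JobContext. A sketch along the lines of GetAppIDJob (simplified, not the exact source):

    // Sketch: a quick synchronous job that just reads the Spark application id.
    private static class GetAppIDJob implements Job<String> {
      @Override
      public String call(JobContext jc) throws Exception {
        return jc.sc().sc().applicationId();
      }
    }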
JobRequest, on the other hand, wraps only one kind of job: JobStatusJob. Since the task is ultimately executed by calling the job's call method, let's look at this job's call method directly:
   
   
    @Override
    public Serializable call(JobContext jc) throws Exception {
      JobConf localJobConf = KryoSerializer.deserializeJobConf(jobConfBytes);

      // Add jar to current thread class loader dynamically, and add jar paths to JobConf as Spark
      // may need to load classes from this jar in other threads.
      Map<String, Long> addedJars = jc.getAddedJars();
      if (addedJars != null && !addedJars.isEmpty()) {
        List<String> localAddedJars = SparkClientUtilities.addToClassPath(addedJars,
            localJobConf, jc.getLocalTmpDir());
        localJobConf.set(Utilities.HIVE_ADDED_JARS, StringUtils.join(localAddedJars, ";"));
      }

      // Deserialize the local scratch dir path and the SparkWork
      Path localScratchDir = KryoSerializer.deserialize(scratchDirBytes, Path.class);
      SparkWork localSparkWork = KryoSerializer.deserialize(sparkWorkBytes, SparkWork.class);
      logConfigurations(localJobConf);

      // Set up the SparkCounters
      SparkCounters sparkCounters = new SparkCounters(jc.sc());
      Map<String, List<String>> prefixes = localSparkWork.getRequiredCounterPrefix();
      if (prefixes != null) {
        for (String group : prefixes.keySet()) {
          for (String counterName : prefixes.get(group)) {
            sparkCounters.createCounter(group, counterName);
          }
        }
      }
      // Build a SparkReporter on top of the SparkCounters
      SparkReporter sparkReporter = new SparkReporter(sparkCounters);

      // Generate Spark plan
      SparkPlanGenerator gen =
          new SparkPlanGenerator(jc.sc(), null, localJobConf, localScratchDir, sparkReporter);
      SparkPlan plan = gen.generate(localSparkWork);

      // Execute generated plan.
      JavaPairRDD<HiveKey, BytesWritable> finalRDD = plan.generateGraph();
      // We use Spark RDD async action to submit job as it's the only way to get jobId now.
      JavaFutureAction<Void> future = finalRDD.foreachAsync(HiveVoidFunction.getInstance());
      jc.monitor(future, sparkCounters, plan.getCachedRDDIds());
      return null;
    }

As we can see, this method is what ultimately triggers execution of the whole SparkWork DAG; in other words, JobStatusJob is the job type that carries the body of the SparkWork. Once the preceding addjar, addfile and similar operations have completed, the client sends a JobRequest message to submit a JobStatusJob for execution on the Spark cluster.
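On the client side this corresponds, roughly, to RemoteHiveSparkClient serializing the JobConf, the scratch dir and the SparkWork and handing a JobStatusJob to SparkClient.submit. The sketch below illustrates that step; the constructor arguments and variable names are assumptions rather than the exact source:

    // Sketch only: how the client side hands the JobStatusJob to the remote driver.
    byte[] jobConfBytes = KryoSerializer.serializeJobConf(jobConf);
    byte[] scratchDirBytes = KryoSerializer.serialize(emptyScratchDir);
    byte[] sparkWorkBytes = KryoSerializer.serialize(sparkWork);

    JobStatusJob job = new JobStatusJob(jobConfBytes, scratchDirBytes, sparkWorkBytes);
    // SparkClient.submit wraps the job in a JobRequest message and sends it to RemoteDriver,
    // returning a JobHandle that RemoteSparkJobMonitor later polls.
    JobHandle<Serializable> jobHandle = remoteClient.submit(job);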

At this point all the important components involved in job submission have been analyzed, and the whole execution chain, from opening a session to the SparkWork finally being turned into RDDs and executed, should now be clear.


