What we analyzed earlier is really just the prelude to Hadoop job submission; the actual submission code lives in the MR program's main method, which RunJar invokes dynamically at the end, as explained before. What we want to do next is go one step further than RunJar and make job submission achievable directly from code, much like the Hadoop Eclipse Plugin lets you take an MR class containing a Mapper and Reducer and "Run on Hadoop" directly.
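To give a rough idea of what "submitting from code" looks like, the fragment below sets a few well-known cluster properties on the Configuration and points the Job at a pre-built jar, so the program can be launched from an IDE instead of through hadoop jar/RunJar. This is only a sketch: the host names, port numbers and jar path are placeholders of my own, not values from the cluster discussed here.

// Minimal sketch of submitting to a remote cluster from an IDE (hosts, ports and paths are placeholders).
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://namenode-host:8020");       // where HDFS lives
conf.set("mapreduce.framework.name", "yarn");                 // submit to YARN instead of the local runner
conf.set("yarn.resourcemanager.address", "rm-host:8032");     // ResourceManager address

Job job = Job.getInstance(conf, "word count");
// Outside RunJar there is no jar on the classpath to discover, so point at a built jar explicitly.
job.setJar("/path/to/wordcount.jar");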
Generally speaking, every MR program contains a similar block of job submission code. Here is the one from WordCount as an example:
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length != 2) {
  System.err.println("Usage: wordcount <in> <out>");
  System.exit(2);
}
Job job = new Job(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
The first step is to build a Configuration object and parse the command-line arguments. Next, a Job object is constructed for the submission, and the job jar, the Mapper and Reducer classes, the output key and value classes, and the job's input and output paths are set. Finally the job is submitted and the program waits for it to finish. These are only the most basic settings; many more are supported, which I will not go through one by one here. See the API documentation for details.
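To give a feel for the "many more settings", the fragment below shows a few other commonly used Job options. It is only an illustrative sketch and not part of the original WordCount code; the values are arbitrary examples.

// A few additional, commonly used settings (illustrative only).
job.setNumReduceTasks(4);                        // number of reduce tasks
job.setInputFormatClass(TextInputFormat.class);  // how input files are read and split
job.setMapOutputKeyClass(Text.class);            // map output types, if they differ
job.setMapOutputValueClass(IntWritable.class);   //   from the final output types
job.setPartitionerClass(HashPartitioner.class);  // how map output is partitioned across reducers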
Code analysis usually proceeds step by step from the beginning, but our focus is on what happens during submission, so for now we will not worry about how the earlier settings affect the job and jump straight to the submission step. Whenever a question forces us to look back at the earlier code, I will analyze it then.
public boolean waitForCompletion(boolean verbose
                                 ) throws IOException, InterruptedException,
                                          ClassNotFoundException {
  if (state == JobState.DEFINE) {
    submit();
  }
  if (verbose) {
    monitorAndPrintJob();
  } else {
    // get the completion poll interval from the client.
    int completionPollIntervalMillis =
      Job.getCompletionPollInterval(cluster.getConf());
    while (!isComplete()) {
      try {
        Thread.sleep(completionPollIntervalMillis);
      } catch (InterruptedException ie) {
      }
    }
  }
  return isSuccessful();
}
When job.waitForCompletion is called, if the Job is still in the DEFINE state it immediately calls submit(). After that it either calls monitorAndPrintJob() to track the progress of the Job and its Tasks, or enters a loop of its own, polling at a fixed interval to check whether the submitted Job has finished. Once it has, the loop exits and isSuccessful() returns the final status.
In other words, the actual submission is done by the submit() method; if the argument passed in is true, job progress is printed as the job runs, otherwise the method simply waits for the job to end.
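For comparison, here is a small sketch of the two ways client code can drive this: the blocking waitForCompletion(true) call used by WordCount, and a manual submit() followed by polling isComplete(). The 5-second interval is an arbitrary choice for the example, not Hadoop's default, and exception handling is omitted.

// Variant 1: submit and block, printing progress (what WordCount does).
boolean ok = job.waitForCompletion(true);

// Variant 2 (sketch): submit asynchronously and poll by hand.
job.submit();
while (!job.isComplete()) {
  Thread.sleep(5000);  // arbitrary poll interval for this example
  System.out.println("map " + (job.mapProgress() * 100) + "% "
      + "reduce " + (job.reduceProgress() * 100) + "%");
}
boolean succeeded = job.isSuccessful();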
/**
 * Submit the job to the cluster and return immediately.
 * @throws IOException
 */
public void submit()
       throws IOException, InterruptedException, ClassNotFoundException {
  ensureState(JobState.DEFINE);
  setUseNewAPI();
  connect();
  final JobSubmitter submitter =
      getJobSubmitter(cluster.getFileSystem(), cluster.getClient());
  status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
    public JobStatus run() throws IOException, InterruptedException,
    ClassNotFoundException {
      return submitter.submitJobInternal(Job.this, cluster);
    }
  });
  state = JobState.RUNNING;
  LOG.info("The url to track the job: " + getTrackingURL());
}
Inside submit() there is one more layer: it delegates to the submitJobInternal method of the Job's internal JobSubmitter object, and that is where the real work begins.
/**
 * Internal method for submitting jobs to the system.
 *
 * <p>The job submission process involves:
 * <ol>
 *   <li>
 *   Checking the input and output specifications of the job.
 *   </li>
 *   <li>
 *   Computing the {@link InputSplit}s for the job.
 *   </li>
 *   <li>
 *   Setup the requisite accounting information for the
 *   {@link DistributedCache} of the job, if necessary.
 *   </li>
 *   <li>
 *   Copying the job's jar and configuration to the map-reduce system
 *   directory on the distributed file-system.
 *   </li>
 *   <li>
 *   Submitting the job to the <code>JobTracker</code> and optionally
 *   monitoring it's status.
 *   </li>
 * </ol></p>
 * @param job the configuration to submit
 * @param cluster the handle to the Cluster
 * @throws ClassNotFoundException
 * @throws InterruptedException
 * @throws IOException
 */
JobStatus submitJobInternal(Job job, Cluster cluster)
    throws ClassNotFoundException, InterruptedException, IOException {

  // validate the jobs output specs
  checkSpecs(job);

  Configuration conf = job.getConfiguration();
  addMRFrameworkToDistributedCache(conf);

  Path jobStagingArea = JobSubmissionFiles.getStagingDir(cluster, conf);
  // configure the command line options correctly on the submitting dfs
  InetAddress ip = InetAddress.getLocalHost();
  if (ip != null) {
    submitHostAddress = ip.getHostAddress();
    submitHostName = ip.getHostName();
    conf.set(MRJobConfig.JOB_SUBMITHOST, submitHostName);
    conf.set(MRJobConfig.JOB_SUBMITHOSTADDR, submitHostAddress);
  }
  JobID jobId = submitClient.getNewJobID();
  job.setJobID(jobId);
  Path submitJobDir = new Path(jobStagingArea, jobId.toString());
  JobStatus status = null;
  try {
    conf.set(MRJobConfig.USER_NAME,
        UserGroupInformation.getCurrentUser().getShortUserName());
    conf.set("hadoop.http.filter.initializers",
        "org.apache.hadoop.yarn.server.webproxy.amfilter.AmFilterInitializer");
    conf.set(MRJobConfig.MAPREDUCE_JOB_DIR, submitJobDir.toString());
    LOG.debug("Configuring job " + jobId + " with " + submitJobDir
        + " as the submit dir");
    // get delegation token for the dir
    TokenCache.obtainTokensForNamenodes(job.getCredentials(),
        new Path[] { submitJobDir }, conf);

    populateTokenCache(conf, job.getCredentials());

    // generate a secret to authenticate shuffle transfers
    if (TokenCache.getShuffleSecretKey(job.getCredentials()) == null) {
      KeyGenerator keyGen;
      try {
        keyGen = KeyGenerator.getInstance(SHUFFLE_KEYGEN_ALGORITHM);
        keyGen.init(SHUFFLE_KEY_LENGTH);
      } catch (NoSuchAlgorithmException e) {
        throw new IOException("Error generating shuffle secret key", e);
      }
      SecretKey shuffleKey = keyGen.generateKey();
      TokenCache.setShuffleSecretKey(shuffleKey.getEncoded(),
          job.getCredentials());
    }

    copyAndConfigureFiles(job, submitJobDir);

    Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);

    // Create the splits for the job
    LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
    int maps = writeSplits(job, submitJobDir);
    conf.setInt(MRJobConfig.NUM_MAPS, maps);
    LOG.info("number of splits:" + maps);

    // write "queue admins of the queue to which job is being submitted"
    // to job file.
    String queue = conf.get(MRJobConfig.QUEUE_NAME,
        JobConf.DEFAULT_QUEUE_NAME);
    AccessControlList acl = submitClient.getQueueAdmins(queue);
    conf.set(toFullPropertyName(queue,
        QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());

    // removing jobtoken referrals before copying the jobconf to HDFS
    // as the tasks don't need this setting, actually they may break
    // because of it if present as the referral will point to a
    // different job.
    TokenCache.cleanUpTokenReferral(conf);

    if (conf.getBoolean(
        MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
        MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
      // Add HDFS tracking ids
      ArrayList<String> trackingIds = new ArrayList<String>();
      for (Token<? extends TokenIdentifier> t :
          job.getCredentials().getAllTokens()) {
        trackingIds.add(t.decodeIdentifier().getTrackingId());
      }
      conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
          trackingIds.toArray(new String[trackingIds.size()]));
    }

    // Write job file to submit dir
    writeConf(conf, submitJobFile);

    //
    // Now, actually submit the job (using the submit name)
    //
    printTokens(jobId, job.getCredentials());
    status = submitClient.submitJob(
        jobId, submitJobDir.toString(), job.getCredentials());
    if (status != null) {
      return status;
    } else {
      throw new IOException("Could not launch job");
    }
  } finally {
    if (status == null) {
      LOG.info("Cleaning up the staging area " + submitJobDir);
      if (jtFs != null && submitJobDir != null)
        jtFs.delete(submitJobDir, true);
    }
  }
}
The submitJobInternal() method performs the following main steps:
· Check the job's input and output specifications, obtain the configuration and the address of the submitting host, generate the JobID, determine the required working (staging) directory (which is also where the MRAppMaster later picks up the job files), and set the information needed during execution;
· Copy the required jar and configuration files to the designated working directory on HDFS so that every node can access them (the sketch after this list shows the typical layout);
· Compute the number of input splits, which determines the number of map tasks;
· Call submitJob() of the YARNRunner class to submit the job, passing along the required parameters (e.g. the JobID);
· Wait for submitJob() to return the job's status and, if the submission did not succeed, clean up the working directory.
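To make the "working directory" concrete, the listing below shows what the submit directory typically looks like on HDFS under a default YARN setup (staging root /tmp/hadoop-yarn/staging). The exact paths depend on yarn.app.mapreduce.am.staging-dir and the submitting user, and the job ID shown is made up, so treat this as an assumed example rather than output from this cluster.

/tmp/hadoop-yarn/staging/<user>/.staging/job_1400000000000_0001/
    job.jar            // the job's jar, copied by copyAndConfigureFiles()
    job.xml            // the serialized Configuration, written by writeConf()
    job.split          // the serialized InputSplits, written by writeSplits()
    job.splitmetainfo  // split meta info (locations/offsets) read by the MRAppMaster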
Now let us go through this process step by step. The first thing submitJobInternal does is obtain the staging area, generate the jobId and derive the submitJobDir directory from them; this directory holds everything the job submission needs. It then sets the paths of dependent libraries, configuration files and so on, and performs the security-related setup. After that the job's input files are split, i.e. the split step, which is carried out mainly through
int maps = writeSplits(job, submitJobDir);
This splitting is purely logical; the stored data is not physically cut. Once splitting is done, each split corresponds to one map task, and the split metadata (job.split and its meta info) is written into the submitJobDir directory, to be handed to the ResourceManager together with the job and run on YARN.
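As a rough sketch of how the number of logical splits, and hence of map tasks, can be influenced from the client side: for FileInputFormat-based jobs the split size is derived from the HDFS block size together with the configured minimum and maximum split sizes, which can be set as below. The values are arbitrary examples, not recommendations from this article.

// Influence the logical split size for FileInputFormat-based inputs (values are examples only).
FileInputFormat.setMinInputSplitSize(job, 64L * 1024 * 1024);   // at least 64 MB per split
FileInputFormat.setMaxInputSplitSize(job, 256L * 1024 * 1024);  // at most 256 MB per split
// Roughly: splitSize = max(minSize, min(maxSize, blockSize)), so a 1 GB input file
// would produce on the order of 1 GB / splitSize map tasks.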
In the next post I will focus on how writeSplits(job, submitJobDir) executes, that is, how the job's input is split and how the split files are generated.