The previous short post (https://blog.csdn.net/qq_26437925/article/details/78458471) walked through the simplest possible MapReduce program. This post instead follows the source code to analyze the Job submission and MapReduce execution process, in order to build a more solid and deeper understanding of the code.
Job Submission
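For orientation, the call path begins in the client driver: building a Job and calling waitForCompletion() invokes Job.submit(), which eventually calls JobSubmitter.submitJobInternal(), the method analyzed below. Here is a minimal, compilable driver sketch (MinimalDriver is just a placeholder name; it uses the identity Mapper/Reducer instead of the word-count classes from the previous post):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MinimalDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "minimal job");
    job.setJarByClass(MinimalDriver.class);
    // Identity Mapper/Reducer just to have a complete, runnable job;
    // a real program plugs in its own classes here.
    job.setMapperClass(Mapper.class);
    job.setReducerClass(Reducer.class);
    job.setOutputKeyClass(LongWritable.class);  // TextInputFormat key type (byte offset)
    job.setOutputValueClass(Text.class);        // TextInputFormat value type (the line)
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // waitForCompletion() -> Job.submit() -> JobSubmitter.submitJobInternal()
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

With that entry point established, JobSubmitter.submitJobInternal() does the actual submission work; its javadoc and the key steps of its body follow: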
/**
* Internal method for submitting jobs to the system.
*
* <p>The job submission process involves:
* <ol>
* <li>
* Checking the input and output specifications of the job.
* </li>
* <li>
* Computing the {@link InputSplit}s for the job.
* </li>
* <li>
* Setup the requisite accounting information for the
* {@link DistributedCache} of the job, if necessary.
* </li>
* <li>
* Copying the job's jar and configuration to the map-reduce system
* directory on the distributed file-system.
* </li>
* <li>
* Submitting the job to the <code>JobTracker</code> and optionally
* monitoring it's status.
* </li>
* </ol></p>
* @param job the configuration to submit
* @param cluster the handle to the Cluster
* @throws ClassNotFoundException
* @throws InterruptedException
* @throws IOException
*/
JobStatus submitJobInternal(Job job, Cluster cluster)
    throws ClassNotFoundException, InterruptedException, IOException {
// (abridged: only the key steps of the method body are shown below)
// upload the job-related files, configuration and jars to the submit directory
copyAndConfigureFiles(job, submitJobDir);
// get the path of the job configuration file (job.xml) under the submit directory
Path submitJobFile = JobSubmissionFiles.getJobConfPath(submitJobDir);
// compute the splits of the input files and record the split/configuration info in the job
// Create the splits for the job
LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
int maps = writeSplits(job, submitJobDir);
conf.setInt(MRJobConfig.NUM_MAPS, maps);
LOG.info("number of splits:" + maps);
int maxMaps = conf.getInt(MRJobConfig.JOB_MAX_MAP,
MRJobConfig.DEFAULT_JOB_MAX_MAP);
if (maxMaps >= 0 && maxMaps < maps) {
throw new IllegalArgumentException("The number of map tasks " + maps +
" exceeded limit " + maxMaps);
}
// set the resource queue the job will be submitted to
// write "queue admins of the queue to which job is being submitted"
// to job file.
String queue = conf.get(MRJobConfig.QUEUE_NAME,
JobConf.DEFAULT_QUEUE_NAME);
AccessControlList acl = submitClient.getQueueAdmins(queue);
conf.set(toFullPropertyName(queue,
QueueACL.ADMINISTER_JOBS.getAclName()), acl.getAclString());
// removing jobtoken referrals before copying the jobconf to HDFS
// as the tasks don't need this setting, actually they may break
// because of it if present as the referral will point to a
// different job.
TokenCache.cleanUpTokenReferral(conf);
if (conf.getBoolean(
MRJobConfig.JOB_TOKEN_TRACKING_IDS_ENABLED,
MRJobConfig.DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED)) {
// Add HDFS tracking ids
ArrayList<String> trackingIds = new ArrayList<String>();
for (Token<? extends TokenIdentifier> t :
job.getCredentials().getAllTokens()) {
trackingIds.add(t.decodeIdentifier().getTrackingId());
}
conf.setStrings(MRJobConfig.JOB_TOKEN_TRACKING_IDS,
trackingIds.toArray(new String[trackingIds.size()]));
}
// Set reservation info if it exists
ReservationId reservationId = job.getReservationId();
if (reservationId != null) {
conf.set(MRJobConfig.RESERVATION_ID, reservationId.toString());
}
// Write job file to submit dir
writeConf(conf, submitJobFile);
// submitClient.submitJob() performs the actual job submission (submitClient is a ClientProtocol implementation such as YARNRunner or LocalJobRunner)
//
// Now, actually submit the job (using the submit name)
//
printTokens(jobId, job.getCredentials());
status = submitClient.submitJob(
jobId, submitJobDir.toString(), job.getCredentials());
if (status != null) {
return status;
} else {
throw new IOException("Could not launch job");
}
} finally {
if (status == null) {
LOG.info("Cleaning up the staging area " + submitJobDir);
if (jtFs != null && submitJobDir != null)
jtFs.delete(submitJobDir, true);
}
}
- Input files: writeNewSplits computes the splits, and hence how many map tasks there will be
@SuppressWarnings("unchecked")
private <T extends InputSplit>
int writeNewSplits(JobContext job, Path jobSubmitDir) throws IOException,
InterruptedException, ClassNotFoundException {
Configuration conf = job.getConfiguration();
// get the input format class; default: TextInputFormat
InputFormat<?, ?> input =
ReflectionUtils.newInstance(job.getInputFormatClass(), conf);
// compute the input splits
List<InputSplit> splits = input.getSplits(job);
T[] array = (T[]) splits.toArray(new InputSplit[splits.size()]);
// sort the splits so that the largest ones are processed first
// sort the splits into order based on size, so that the biggest
// go first
Arrays.sort(array, new SplitComparator());
// write the split meta-info files to the job submit directory
JobSplitWriter.createSplitFiles(jobSubmitDir, conf,
jobSubmitDir.getFileSystem(conf), array);
return array.length;
}
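As the comment above notes, the input format defaults to TextInputFormat (key = byte offset of a line, value = the line itself). It can be swapped on the Job; here is a minimal sketch, using KeyValueTextInputFormat purely as an example of an alternative (the wrapper class is only for illustration):

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;

public class InputFormatConfig {
  // Overriding the input format changes which getSplits()/createRecordReader()
  // implementation writeNewSplits() and the map tasks end up using.
  static void useKeyValueInput(Job job) {
    job.setInputFormatClass(KeyValueTextInputFormat.class);
  }
}
```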
Architecture of MapReduce job execution (a brief sketch of MapReduce 1.0)
- Client
A user-written MapReduce program is submitted to the JobTracker through the Client; the user can also check the job's running status through interfaces the Client provides. Internally, Hadoop represents a MapReduce program as a "job" (Job). One MapReduce program may correspond to several jobs, and each job is broken down into a number of Map/Reduce tasks (Task).
- JobTracker
The JobTracker is mainly responsible for resource monitoring and job scheduling. It monitors the health of all TaskTrackers and jobs, and when it detects a failure it moves the affected tasks to other nodes. It also tracks task progress, resource usage and other metrics, and passes this information to the task scheduler, which hands idle resources to suitable tasks. In Hadoop the task scheduler is a pluggable module, so users can implement their own scheduler as needed.
- TaskTracker
The TaskTracker periodically reports its node's resource usage and task progress to the JobTracker via heartbeats, and it receives and carries out commands sent back by the JobTracker (start a new task, kill a task, and so on). The TaskTracker partitions its node's resources into equal-sized "slots"; a slot represents computing resources (CPU, memory, etc.). A Task can only run once it has been granted a slot, and the job of the Hadoop scheduler is to assign idle slots on the TaskTrackers to Tasks. Slots come in two kinds, Map slots and Reduce slots, used by Map Tasks and Reduce Tasks respectively, and the TaskTracker limits task concurrency through the (configurable) number of slots.
- Task
Tasks come in two kinds, Map Task and Reduce Task, both launched by the TaskTracker. As described in the previous section, HDFS stores data in fixed-size blocks, whereas the unit of processing for MapReduce is the split.
Map task input: splits (how many map tasks are there?)
The number of mappers can be controlled by configuring the block size and the split min/max sizes (note that even if mapred.map.tasks or mapreduce.job.maps is explicitly set on the conf, the program will not necessarily run the expected number of mappers); a configuration sketch follows below.
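As a rough illustration, the split-size knobs can be set per job through FileInputFormat, which writes the mapreduce.input.fileinputformat.split.minsize / split.maxsize properties; the sizes below are arbitrary example values (the wrapper class is only for illustration):

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class SplitSizeConfig {
  // splitSize = max(minSize, min(maxSize, blockSize)): raising minSize above the
  // block size yields fewer, larger splits (fewer mappers); lowering maxSize below
  // the block size yields more, smaller splits (more mappers).
  static void tuneSplits(Job job) {
    FileInputFormat.setMinInputSplitSize(job, 128L * 1024 * 1024); // 128 MB (example)
    FileInputFormat.setMaxInputSplitSize(job, 512L * 1024 * 1024); // 512 MB (example)
  }
}
```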
- input.getSplits(job) (org.apache.hadoop.mapreduce.lib.input.FileInputFormat)
/**
* Generate the list of files and make them into FileSplits.
* @param job the job context
* @throws IOException
*/
public List<InputSplit> getSplits(JobContext job) throws IOException {
StopWatch sw = new StopWatch().start();
// min/max split sizes, both configurable
long minSize = Math.max(getFormatMinSplitSize(), getMinSplitSize(job));
long maxSize = getMaxSplitSize(job);
// list the input files of the job and generate splits from them
// generate splits
List<InputSplit> splits = new ArrayList<InputSplit>();
List<FileStatus> files = listStatus(job);
boolean ignoreDirs = !getInputDirRecursive(job)
&& job.getConfiguration().getBoolean(INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, false);
for (FileStatus file: files) {
if (ignoreDirs && file.isDirectory()) {
continue;
}
Path path = file.getPath();
long length = file.getLen();
if (length != 0) {
// get all block locations of the input file
BlockLocation[] blkLocations;
if (file instanceof LocatedFileStatus) {
blkLocations = ((LocatedFileStatus) file).getBlockLocations();
} else {
FileSystem fs = path.getFileSystem(job.getConfiguration());
blkLocations = fs.getFileBlockLocations(file, 0, length);
}
if (isSplitable(job, path)) {
// compute splitSize: Math.max(minSize, Math.min(maxSize, blockSize))
long blockSize = file.getBlockSize();
long splitSize = computeSplitSize(blockSize, minSize, maxSize);
long bytesRemaining = length;
while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
int blkIndex = getBlockIndex(blkLocations, length-bytesRemaining);
splits.add(makeSplit(path, length-bytesRemaining, splitSize,
blkLocations[blkIndex].getHosts(),
blkLocations[blkIndex].getCachedHosts()));
bytesRemaining -= splitSize;
}
if (bytesRemaining != 0) {
int blkIndex = getBlockIndex(blkLocations, length-bytesRemaining);
splits.add(makeSplit(path, length-bytesRemaining, bytesRemaining,
blkLocations[blkIndex].getHosts(),
blkLocations[blkIndex].getCachedHosts()));
}
} else {
// not splitable
if (LOG.isDebugEnabled()) {
// Log only if the file is big enough to be splitted
if (length > Math.min(file.getBlockSize(), minSize)) {
LOG.debug("File is not splittable so no parallelization "
+ "is possible: " + file.getPath());
}
}
splits.add(makeSplit(path, 0, length, blkLocations[0].getHosts(),
blkLocations[0].getCachedHosts()));
}
} else {
//Create empty hosts array for zero length files
splits.add(makeSplit(path, 0, length, new String[0]));
}
}
// Save the number of input files for metrics/loadgen
job.getConfiguration().setLong(NUM_INPUT_FILES, files.size());
sw.stop();
if (LOG.isDebugEnabled()) {
LOG.debug("Total # of splits generated by getSplits: " + splits.size()
+ ", TimeTaken: " + sw.now(TimeUnit.MILLISECONDS));
}
return splits;
}
How is the split size determined?
splitSize = max(minSize, min(maxSize, blockSize))
e.g.:
min | max | block | split |
---|---|---|---|
1M | 100M | 64M | 64M |
128M | 512M | 64M | 128M |
1M | 32M | 64M | 32M |
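Written out as a tiny self-contained sketch, the helper below mirrors FileInputFormat.computeSplitSize() and reproduces the table above:

```java
public class SplitSizeDemo {
  // Same formula as FileInputFormat.computeSplitSize()
  static long computeSplitSize(long blockSize, long minSize, long maxSize) {
    return Math.max(minSize, Math.min(maxSize, blockSize));
  }

  public static void main(String[] args) {
    long MB = 1024 * 1024L;
    System.out.println(computeSplitSize(64 * MB, 1 * MB, 100 * MB) / MB);   // 64
    System.out.println(computeSplitSize(64 * MB, 128 * MB, 512 * MB) / MB); // 128
    System.out.println(computeSplitSize(64 * MB, 1 * MB, 32 * MB) / MB);    // 32
  }
}
```

Also note the SPLIT_SLOP factor (1.1) in getSplits() above: a chunk is only carved off while the remaining bytes exceed 1.1 × splitSize, so, for example, a 130 MB file with a 128 MB split size still becomes a single 130 MB split rather than a 128 MB split plus a 2 MB split.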
// compute the splits of the input files and record the split/configuration info in the job
// Create the splits for the job
LOG.debug("Creating splits at " + jtFs.makeQualified(submitJobDir));
int maps = writeSplits(job, submitJobDir);
conf.setInt(MRJobConfig.NUM_MAPS, maps);
LOG.info("number of splits:" + maps);
int maxMaps = conf.getInt(MRJobConfig.JOB_MAX_MAP,
MRJobConfig.DEFAULT_JOB_MAX_MAP);
if (maxMaps >= 0 && maxMaps < maps) {
throw new IllegalArgumentException("The number of map tasks " + maps +
" exceeded limit " + maxMaps);
}
The number of splits determines the number of Map Tasks, because each split is handed to exactly one Map Task.
MapTask run
The flow of MapTask's run() method:
@Override
public void run(final JobConf job, final TaskUmbilicalProtocol umbilical)
throws IOException, ClassNotFoundException, InterruptedException {
this.umbilical = umbilical;
if (isMapTask()) {
// If there are no reducers then there won't be any sort. Hence the map
// phase will govern the entire attempt's progress.
if (conf.getNumReduceTasks() == 0) {
mapPhase = getProgress().addPhase("map", 1.0f);
} else {
// If there are reducers then the entire attempt's progress will be
// split between the map phase (67%) and the sort phase (33%).
mapPhase = getProgress().addPhase("map", 0.667f);
sortPhase = getProgress().addPhase("sort", 0.333f);
}
}
TaskReporter reporter = startReporter(umbilical);
boolean useNewApi = job.getUseNewMapper();
initialize(job, getJobID(), reporter, useNewApi);
// check if it is a cleanupJobTask
if (jobCleanup) {
runJobCleanupTask(umbilical, reporter);
return;
}
if (jobSetup) {
runJobSetupTask(umbilical, reporter);
return;
}
if (taskCleanup) {
runTaskCleanupTask(umbilical, reporter);
return;
}
if (useNewApi) {
runNewMapper(job, splitMetaInfo, umbilical, reporter);
} else {
runOldMapper(job, splitMetaInfo, umbilical, reporter);
}
done(umbilical, reporter);
}
- MapTask first checks whether the job has any reduce tasks. If there are no reducers, there is only a map phase, and the whole job can finish as soon as the map phase does (see the map-only sketch below).
- If there are reduce tasks, the map side also includes a sort: the map phase accounts for 66.7% of the attempt's progress and the sort phase for 33.3%. (Side note: sorting the map output makes the subsequent reduce-side reads simpler and cuts down the number of I/O operations.)
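A quick sketch of the first (map-only) case: the driver only needs to set the reduce count to zero (the wrapper class is only for illustration):

```java
import org.apache.hadoop.mapreduce.Job;

public class MapOnlyConfig {
  // With zero reducers there is no sort/shuffle phase; map output goes straight
  // to the OutputFormat, and the "map" phase covers 100% of the attempt's progress.
  static void makeMapOnly(Job job) {
    job.setNumReduceTasks(0);
  }
}
```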
@SuppressWarnings("unchecked")
private <INKEY,INVALUE,OUTKEY,OUTVALUE>
void runNewMapper(final JobConf job,
final TaskSplitIndex splitIndex,
final TaskUmbilicalProtocol umbilical,
TaskReporter reporter
) throws IOException, ClassNotFoundException,
InterruptedException {
// make a task context so we can get the classes
org.apache.hadoop.mapreduce.TaskAttemptContext taskContext =
new org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl(job,
getTaskID(),
reporter);
// construct the mapper via reflection from the user-written Mapper class
// each mapper processes one split; the job holds the meta-info for the whole array of splits
// make a mapper
org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE> mapper =
(org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE>)
ReflectionUtils.newInstance(taskContext.getMapperClass(), job);
// make the input format
org.apache.hadoop.mapreduce.InputFormat<INKEY,INVALUE> inputFormat =
(org.apache.hadoop.mapreduce.InputFormat<INKEY,INVALUE>)
ReflectionUtils.newInstance(taskContext.getInputFormatClass(), job);
// rebuild the input split
org.apache.hadoop.mapreduce.InputSplit split = null;
split = getSplitDetails(new Path(splitIndex.getSplitLocation()),
splitIndex.getStartOffset());
LOG.info("Processing split: " + split);
// use the split info to read the underlying block(s) and turn them into individual records
org.apache.hadoop.mapreduce.RecordReader<INKEY,INVALUE> input =
new NewTrackingRecordReader<INKEY,INVALUE>
(split, inputFormat, reporter, taskContext);
job.setBoolean(JobContext.SKIP_RECORDS, isSkipping());
org.apache.hadoop.mapreduce.RecordWriter output = null;
// get an output object
if