hadoop-0.20.205.0 MapReduce流程分析

Hadoop JobTracker的启动

org.apache.hadoop.mapred.JobTracker类实现了Hadoop MapReduce模型的中JobTracker的功能,主要负责任务的接受,初始化,调度以及对TaskTracker的监控。

在Hadoop中JobTracker单独作为一个JVM运行,其main函数就是启动JobTracker的入口函数。

  public static void main(String argv[]
                          ) throws IOException, InterruptedException {
    StringUtils.startupShutdownMessage(JobTracker.class, argv, LOG);
    
    try {
      if(argv.length == 0) {
        // 创建一个JobTracker实例
        JobTracker tracker = startTracker(new JobConf());
        // JobTracker提供服务
        tracker.offerService();
      }
      else {
        if ("-dumpConfiguration".equals(argv[0]) && argv.length == 1) {
          dumpConfiguration(new PrintWriter(System.out));
        }
        else {
          System.out.println("usage: JobTracker [-dumpConfiguration]");
          System.exit(-1);
        }
      }
    } catch (Throwable e) {
      LOG.fatal(StringUtils.stringifyException(e));
      System.exit(-1);
    }
  }
startTracker(new JobConf())函数是一个静态函数,通过调用JobTracker的构造函数生成一个JobTracker实例,名为result。
然后,对result进行一系列的初始化,包括创建TaskScheduler,启动RPC server,启动内置的jetty服务器,检查是否需要重启JobTracker等。
  public static JobTracker startTracker(JobConf conf
                                        ) throws IOException,
                                                 InterruptedException {
    return startTracker(conf, generateNewIdentifier());
  }
  
  public static JobTracker startTracker(JobConf conf, String identifier) 
  throws IOException, InterruptedException {
    DefaultMetricsSystem.initialize("JobTracker");
    JobTracker result = null;
    while (true) {
      try {
        // 创建JobTracker实例
        result = new JobTracker(conf, identifier);
        // 设置TaskScheduler的Manager为JobTracker,
        // 从此处可以看出TaskScheduler和JobTracker是相互包含的,即你中有我、我中有你。
        result.taskScheduler.setTaskTrackerManager(result);

       .....

    return result;
  }
在JobTracker.offerService()中,又调用了任务调度器taskScheduler对象的start()方法,用来启动任务调度器。该对象是JobTracker的一个数据成员,类型为TaskScheduler。该类型的提供了一系列接口,使得JobTracker可以对所有提交的job进行初始化以及调度。但是该类型实际上是一个抽象类型,其真正的实现类型是我们在配置文件中指定的“mapred.jobtracker.taskScheduler”,默认是JobQueueTaskScheduler类。
  /**
   * Run forever
   */
  public void offerService() throws InterruptedException, IOException {
    // Prepare for recovery. This is done irrespective of the status of restart
    .....

    // 启动任务调度器
    taskScheduler.start();
    
    .....
  }

Hadoop TaskScheduler的启动

下面我们看一下默认的任务调度器JobQueueTaskScheduler的start():
  @Override
  public synchronized void start() throws IOException {
    super.start();
    // 注册一个JobQueueInProgressListener监听器
    // JobQueueInProgress类以一定方式(默认是先进先出(FIFO))维持一个JobInProgress的队列,
    // 并且监听各个JobInProgress实例在生命周期中的变化
    taskTrackerManager.addJobInProgressListener(jobQueueJobInProgressListener);
    // 注册一个EagerTaskInitializationListener监听器
    // EagerInitializationListener类不断监听jobInitQueue,
    // 一旦发现有新的job被提交(即有新的JobInProgress实例被加入),
    // 则立即调用该实例的initTasks方法,对job进行初始化。
    eagerTaskInitializationListener.setTaskTrackerManager(taskTrackerManager);
    eagerTaskInitializationListener.start();
    taskTrackerManager.addJobInProgressListener(eagerTaskInitializationListener);
  }
  /
  //  Used to init new jobs that have just been created
  /
  class JobInitManager implements Runnable {
   
    public void run() {
      JobInProgress job = null;
      while (true) {
        try {
          synchronized (jobInitQueue) {
            while (jobInitQueue.isEmpty()) {
              jobInitQueue.wait();
            }
            job = jobInitQueue.remove(0);
          }
          threadPool.execute(new InitJob(job));
        } catch (InterruptedException t) {
          LOG.info("JobInitManagerThread interrupted.");
          break;
        } 
      }
      LOG.info("Shutting down thread pool");
      threadPool.shutdownNow();
    }
  }
  
  class InitJob implements Runnable {
  
    private JobInProgress job;
    
    public InitJob(JobInProgress job) {
      this.job = job;
    }
    
    public void run() {
      // 调用TaskTrackerManager(即JobTracker)的initJob(JobInProgress job)方法
      // 在JobTracker的initJob中又是通过调用JobInProgress.initTasks()方法完成job的初始化的
      ttm.initJob(job);
    }
  }

Hadoop Job的初始化

下面我看一下Job的初始化过程:
  /**
   * Construct the splits, etc.  This is invoked from an async
   * thread so that split-computation doesn't block anyone.
   */
  public synchronized void initTasks() 
  throws IOException, KillInterruptedException, UnknownHostException {
    .....
    
    //
    // read input splits and create a map per a split
    //
    // 从JobTracker获取Job输文件的split元信息(TaskSplitMetaInfo),
    // 包括分片位置(splitLocation)、偏移量(startOffset)、输入数据长度(inputDataLength)、location
    TaskSplitMetaInfo[] splits = createSplits(jobId);
    if (numMapTasks != splits.length) {
      throw new IOException("Number of maps in JobConf doesn't match number of " +
      		"recieved splits for job " + jobId + "! " +
      		"numMapTasks=" + numMapTasks + ", #splits=" + splits.length);
    }
    // map task的个数就是input split个数,即一个map处理一个input split
    numMapTasks = splits.length;
    //
    // Create Map Tasks
    //
    maps = new TaskInProgress[numMapTasks];
    for(int i=0; i < numMapTasks; ++i) {
      inputLength += splits[i].getInputDataLength();
      maps[i] = new TaskInProgress(jobId, jobFile, 
                                   splits[i], 
                                   jobtracker, conf, this, i, numSlotsPerMap);
    }
    LOG.info("Input size for job " + jobId + " = " + inputLength
        + ". Number of splits = " + splits.length);

    // Set localityWaitFactor before creating cache
    localityWaitFactor = conf.getFloat(LOCALITY_WAIT_FACTOR, DEFAULT_LOCALITY_WAIT_FACTOR);

    // 对于map task,将其放入nonRunningMapCache,是一个Map<Node,
    // List<TaskInProgress>>,也即对于map task来讲,其将会被分配到其input split所在的Node上。
    // 在此,Node代表一个datanode或者机架或者数据中心。
    // nonRunningMapCache将在JobTracker向TaskTracker分配map task的时候使用。
    if (numMapTasks > 0) { 
      nonRunningMapCache = createCache(splits, maxLevel);
    }
        
    // set the launch time
    this.launchTime = jobtracker.getClock().getTime();

    //
    // Create reduce tasks
    //
    // nonRunningReduces将在JobTracker向TaskTracker分配reduce task的时候使用。
    this.reduces = new TaskInProgress[numReduceTasks];
    for (int i = 0; i < numReduceTasks; i++) {
      reduces[i] = new TaskInProgress(jobId, jobFile, 
                                      numMapTasks, i, 
                                      jobtracker, conf, this, numSlotsPerReduce);
      nonRunningReduces.add(reduces[i]);
    }

    .....
    
    // create cleanup two cleanup tips, one map and one reduce.
    // 创建两个cleanup taskinprogress,一个用来清理map task,一个用来清理reduce task
    cleanup = new TaskInProgress[2];

    // cleanup map tip. This map doesn't use any splits. Just assign an empty
    // split.
    TaskSplitMetaInfo emptySplit = JobSplit.EMPTY_TASK_SPLIT;
    cleanup[0] = new TaskInProgress(jobId, jobFile, emptySplit, 
            jobtracker, conf, this, numMapTasks, 1);
    cleanup[0].setJobCleanupTask();

    // cleanup reduce tip.
    cleanup[1] = new TaskInProgress(jobId, jobFile, numMapTasks,
                       numReduceTasks, jobtracker, conf, this, 1);
    cleanup[1].setJobCleanupTask();

    // create two setup tips, one map and one reduce.
    // 创建两个setip taskinprogress, 一个用来初始化map task, 一个用来初始化reduce task
    setup = new TaskInProgress[2];

    // setup map tip. This map doesn't use any split. Just assign an empty
    // split.
    setup[0] = new TaskInProgress(jobId, jobFile, emptySplit, 
            jobtracker, conf, this, numMapTasks + 1, 1);
    setup[0].setJobSetupTask();

    // setup reduce tip.
    setup[1] = new TaskInProgress(jobId, jobFile, numMapTasks,
                       numReduceTasks + 1, jobtracker, conf, this, 1);
    setup[1].setJobSetupTask();

    .....
    // 初始化完成    
    tasksInited = true;
    .....
  }

Hadoop TaskTracker的启动

org.apache.hadoop.mapred.TaskTracker类实现了MapReduce模型中TaskTracker的功能。在Hadoop中,TaskTracker作为一个单独的JVM来运行的,其main函数就是TaskTracker的入口函数。当运行start-all.sh时,通过SSH运行该函数来启动TaskTracker的。
  /**
   * Start the TaskTracker, point toward the indicated JobTracker
   */
  public static void main(String argv[]) throws Exception {
    StringUtils.startupShutdownMessage(TaskTracker.class, argv, LOG);
    if (argv.length != 0) {
      System.out.println("usage: TaskTracker");
      System.exit(-1);
    }
    try {
      JobConf conf=new JobConf();
      // enable the server to track time spent waiting on locks
      ReflectionUtils.setContentionTracing
        (conf.getBoolean("tasktracker.contention.tracking", false));
      DefaultMetricsSystem.initialize("TaskTracker");
      // 创建TaskTracker实例
      TaskTracker tt = new TaskTracker(conf);
      MBeans.register("TaskTracker", "TaskTrackerInfo", tt);
      // 启动Tasktracker的服务
      tt.run();
    } catch (Throwable e) {
      LOG.error("Can not start task tracker because "+
                StringUtils.stringifyException(e));
      System.exit(-1);
    }
  }
其中TaskTracker的run()又调用了offerService(),该函数每隔一定的时间就想JobTracker发送heartbeat,报告当前TaskTracker的状态:
  /**
   * Main service loop.  Will stay in this loop forever.
   */
  State offerService() throws Exception {
    long lastHeartbeat = 0;

    while (running && !shuttingDown) {
      try {
        long now = System.currentTimeMillis();

        long waitTime = heartbeatInterval - (now - lastHeartbeat);
        if (waitTime > 0) {
          // 睡眠一段时间(waitTime),或者直到有空闲slot
          synchronized (finishedCount) {
            if (finishedCount.get() == 0) {
              finishedCount.wait(waitTime);
            }
            finishedCount.set(0);
          }
        }

        .....

        // Send the heartbeat and process the jobtracker's directives
        // 向JobTracker发送heartbeat,得到JobTracker的回复heartbeatResponse
        HeartbeatResponse heartbeatResponse = transmitHeartBeat(now);

        lastHeartbeat = System.currentTimeMillis();
        
        .....

        //从HeartbeatResponse中获取TaskTracker的所需要做的事情(actions)
        TaskTrackerAction[] actions = heartbeatResponse.getActions();

        .....

        if (actions != null){ 
          for(TaskTrackerAction action: actions) {
            // 如果是启动一个新task,则就其加入到mapLaunch或者是reduceLaunch中
            // mapLaunch和reduceLaunch是一个TaskLauncher,其包括了一个List<TaskInProgress>成员
            if (action instanceof LaunchTaskAction) {
              addToTaskQueue((LaunchTaskAction)action);
            } else if (action instanceof CommitTaskAction) {
              CommitTaskAction commitAction = (CommitTaskAction)action;
              if (!commitResponses.contains(commitAction.getTaskID())) {
                LOG.info("Received commit task action for " + 
                          commitAction.getTaskID());
                commitResponses.add(commitAction.getTaskID());
              }
            } else {
              tasksToCleanup.put(action);
            }
          }
        }
        
    .....

    return State.NORMAL;
  }

TaskTracker向JobTracker发送Heartbeat

  /**
   * Build and transmit the heart beat to the JobTracker
   * @param now current time
   * @return false if the tracker was unknown
   * @throws IOException
   */
  HeartbeatResponse transmitHeartBeat(long now) throws IOException {
    // Send Counters in the status once every COUNTER_UPDATE_INTERVAL
    boolean sendCounters;
    if (now > (previousUpdate + COUNTER_UPDATE_INTERVAL)) {
      sendCounters = true;
      previousUpdate = now;
    }
    else {
      sendCounters = false;
    }

    // 
    // Check if the last heartbeat got through... 
    // if so then build the heartbeat information for the JobTracker;
    // else resend the previous status information.
    //
    if (status == null) {
      synchronized (this) {
        status = new TaskTrackerStatus(taskTrackerName, localHostname, 
                                       httpPort, 
                                       cloneAndResetRunningTaskStatuses(
                                         sendCounters), 
                                       failures, 
                                       maxMapSlots,
                                       maxReduceSlots); 
      }
    } else {
      LOG.info("Resending 'status' to '" + jobTrackAddr.getHostName() +
               "' with reponseId '" + heartbeatResponseId);
    }
      
    //
    // Check if we should ask for a new Task
    //
    boolean askForNewTask;
    long localMinSpaceStart;
    // 当满足下面的条件的时候,此TaskTracker请求JobTracker为其分配一个新的Task来运行:
    // 1)当前TaskTracker正在运行的map task的个数小于可以运行的map task的最大个数
    // 2)当前TaskTracker正在运行的reduce task的个数小于可以运行的reduce task的最大个数
    synchronized (this) {
      askForNewTask = 
        ((status.countOccupiedMapSlots() < maxMapSlots || 
          status.countOccupiedReduceSlots() < maxReduceSlots) && acceptNewTasks); 
      localMinSpaceStart = minSpaceStart;
    }
    
    .....

    // 向TaskTracker发送heartbeat,RPC调用,调用的是JobTracker的heartbeat方法
    HeartbeatResponse heartbeatResponse = jobClient.heartbeat(status, 
                                                              justStarted,
                                                              justInited,
                                                              askForNewTask, 
                                                              heartbeatResponseId);
      
    .....

    return heartbeatResponse;
  }

JobTracker向TaskTracker分配任务

TaskTracker每隔一定的时间,就通过RPC调用(JobTracker.heartbeat())向JobTracker发送heartbeat报告当前TaskTracker的状态。
JobTracker根据TaskTracker的状态来决定返回给TaskTracker执行的actions
  /**
   * The periodic heartbeat mechanism between the {@link TaskTracker} and
   * the {@link JobTracker}.
   * 
   * The {@link JobTracker} processes the status information sent by the 
   * {@link TaskTracker} and responds with instructions to start/stop 
   * tasks or jobs, and also 'reset' instructions during contingencies. 
   */
  public synchronized HeartbeatResponse heartbeat(TaskTrackerStatus status, 
                                                  boolean restarted,
                                                  boolean initialContact,
                                                  boolean acceptNewTasks, 
                                                  short responseId) 
    throws IOException {
    .....

    // First check if the last heartbeat response got through
    String trackerName = status.getTrackerName();
    
    .....

    HeartbeatResponse prevHeartbeatResponse =
      trackerToHeartbeatResponseMap.get(trackerName);
    boolean addRestartInfo = false;

    .....
      
    // Initialize the response to be sent for the heartbeat
    HeartbeatResponse response = new HeartbeatResponse(newResponseId, null);
    List<TaskTrackerAction> actions = new ArrayList<TaskTrackerAction>();

    boolean isBlacklisted = faultyTrackers.isBlacklisted(status.getHost());
    // Check for new tasks to be executed on the tasktracker
    if (recoveryManager.shouldSchedule() && acceptNewTasks && !isBlacklisted) {
      TaskTrackerStatus taskTrackerStatus = getTaskTrackerStatus(trackerName);
      if (taskTrackerStatus == null) {
        LOG.warn("Unknown task tracker polling; ignoring: " + trackerName);
      } else {
        List<Task> tasks = getSetupAndCleanupTasks(taskTrackerStatus);
        if (tasks == null ) {
          // 调用任务调度器TaskScheduler的assignTasks向该TaskTracker分配task
          tasks = taskScheduler.assignTasks(taskTrackers.get(trackerName));
        }
        if (tasks != null) {
          for (Task task : tasks) {
            // 将分配给改TaskTracker的tasks,加入到actions列表中
            expireLaunchingTasks.addNewTask(task.getTaskID());
            if(LOG.isDebugEnabled()) {
              LOG.debug(trackerName + " -> LaunchTask: " + task.getTaskID());
            }
            actions.add(new LaunchTaskAction(task));
          }
        }
      }
    }
      
    .....

    // calculate next heartbeat interval and put in heartbeat response
    int nextInterval = getNextHeartbeatInterval();
    response.setHeartbeatInterval(nextInterval);
    response.setActions(actions.toArray(new TaskTrackerAction[actions.size()]));
   
    .....
        
    return response;
  }

任务调度器分配任务——TaskScheduler.assignTasks()

目前在Hadoop中比较常用的任务调度器有三种:JobQueueTaskScheduler,FairScheduler,CapacityScheduler。
下面我们看一下Hadoop的默认任务调度——JobQueueTaskScheduler是如何分配任务的。
  @Override
  public synchronized List<Task> assignTasks(TaskTracker taskTracker)
      throws IOException {
    TaskTrackerStatus taskTrackerStatus = taskTracker.getStatus(); 
    ClusterStatus clusterStatus = taskTrackerManager.getClusterStatus();
    final int numTaskTrackers = clusterStatus.getTaskTrackers();
    final int clusterMapCapacity = clusterStatus.getMaxMapTasks();
    final int clusterReduceCapacity = clusterStatus.getMaxReduceTasks();

    Collection<JobInProgress> jobQueue =
      jobQueueJobInProgressListener.getJobQueue();

    //
    // Get map + reduce counts for the current tracker.
    //
    final int trackerMapCapacity = taskTrackerStatus.getMaxMapSlots();
    final int trackerReduceCapacity = taskTrackerStatus.getMaxReduceSlots();
    final int trackerRunningMaps = taskTrackerStatus.countMapTasks();
    final int trackerRunningReduces = taskTrackerStatus.countReduceTasks();

    // Assigned tasks
    List<Task> assignedTasks = new ArrayList<Task>();

    //
    // Compute (running + pending) map and reduce task numbers across pool
    //
    int remainingReduceLoad = 0;
    int remainingMapLoad = 0;
    synchronized (jobQueue) {
      for (JobInProgress job : jobQueue) {
        if (job.getStatus().getRunState() == JobStatus.RUNNING) {
          remainingMapLoad += (job.desiredMaps() - job.finishedMaps());
          if (job.scheduleReduces()) {
            remainingReduceLoad += 
              (job.desiredReduces() - job.finishedReduces());
          }
        }
      }
    }

    // Compute the 'load factor' for maps and reduces
    // 计算每个TaskTracker的负载
    double mapLoadFactor = 0.0;
    if (clusterMapCapacity > 0) {
      mapLoadFactor = (double)remainingMapLoad / clusterMapCapacity;
    }
    double reduceLoadFactor = 0.0;
    if (clusterReduceCapacity > 0) {
      reduceLoadFactor = (double)remainingReduceLoad / clusterReduceCapacity;
    }
     
    // 先分配map task, 然后分配reduce task。
    //
    // In the below steps, we allocate first map tasks (if appropriate),
    // and then reduce tasks if appropriate.  We go through all jobs
    // in order of job arrival; jobs only get serviced if their 
    // predecessors are serviced, too.
    //

    //
    // We assign tasks to the current taskTracker if the given machine 
    // has a workload that's less than the maximum load of that kind of
    // task.
    // However, if the cluster is close to getting loaded i.e. we don't
    // have enough _padding_ for speculative executions etc., we only 
    // schedule the "highest priority" task i.e. the task from the job 
    // with the highest priority.
    //
    
    final int trackerCurrentMapCapacity = 
      Math.min((int)Math.ceil(mapLoadFactor * trackerMapCapacity), trackerMapCapacity);
    int availableMapSlots = trackerCurrentMapCapacity - trackerRunningMaps;
    boolean exceededMapPadding = false;
    if (availableMapSlots > 0) {
      exceededMapPadding = 
        exceededPadding(true, clusterStatus, trackerMapCapacity);
    }
    
    int numLocalMaps = 0;
    int numNonLocalMaps = 0;
    scheduleMaps:
    for (int i=0; i < availableMapSlots; ++i) {
      synchronized (jobQueue) {
        for (JobInProgress job : jobQueue) {
          if (job.getStatus().getRunState() != JobStatus.RUNNING) {
            continue;
          }

          Task t = null;
          
          // Try to schedule a node-local or rack-local Map task
          t = job.obtainNewNodeOrRackLocalMapTask(taskTrackerStatus, 
                numTaskTrackers, taskTrackerManager.getNumberOfUniqueHosts());
          if (t != null) {
            assignedTasks.add(t);
            ++numLocalMaps;
            
            // Don't assign map tasks to the hilt!
            // Leave some free slots in the cluster for future task-failures,
            // speculative tasks etc. beyond the highest priority job
            if (exceededMapPadding) {
              break scheduleMaps;
            }
           
            // Try all jobs again for the next Map task 
            break;
          }
          
          // Try to schedule a non-local Map task
          t = job.obtainNewNonLocalMapTask(taskTrackerStatus, numTaskTrackers,
                                   taskTrackerManager.getNumberOfUniqueHosts());
          
          if (t != null) {
            assignedTasks.add(t);
            ++numNonLocalMaps;
            
            // We assign at most 1 off-switch or speculative task
            // This is to prevent TaskTrackers from stealing local-tasks
            // from other TaskTrackers.
            break scheduleMaps;
          }
        }
      }
    }
    int assignedMaps = assignedTasks.size();

    //
    // Same thing, but for reduce tasks
    // However we _never_ assign more than 1 reduce task per heartbeat
    //
    final int trackerCurrentReduceCapacity = 
      Math.min((int)Math.ceil(reduceLoadFactor * trackerReduceCapacity), 
               trackerReduceCapacity);
    final int availableReduceSlots = 
      Math.min((trackerCurrentReduceCapacity - trackerRunningReduces), 1);
    boolean exceededReducePadding = false;
    if (availableReduceSlots > 0) {
      exceededReducePadding = exceededPadding(false, clusterStatus, 
                                              trackerReduceCapacity);
      synchronized (jobQueue) {
        for (JobInProgress job : jobQueue) {
          if (job.getStatus().getRunState() != JobStatus.RUNNING ||
              job.numReduceTasks == 0) {
            continue;
          }

          Task t = 
            job.obtainNewReduceTask(taskTrackerStatus, numTaskTrackers, 
                                    taskTrackerManager.getNumberOfUniqueHosts()
                                    );
          if (t != null) {
            assignedTasks.add(t);
            break;
          }
          
          // Don't assign reduce tasks to the hilt!
          // Leave some free slots in the cluster for future task-failures,
          // speculative tasks etc. beyond the highest priority job
          if (exceededReducePadding) {
            break;
          }
        }
      }
    }
    
    .....

    return assignedTasks;
  }
从上面的代码中我们可以知道:
1)JobInProgress的obtainNewNodeOrRackLocalMapTask()和obtainNewNonLocalMapTask()是用来分配map task的;
      通过调用findNewMapTask(),findNewMapTask()函数会从近到远一层一层地寻找。首先是同一节点(node-local),然后在寻找同一机柜上的节点(rack-local),
      接着寻找相同数据中心下的节点(off-switch),直到找了maxLevel层结束。根据TaskTracker所在的Node从nonRunningMapCache中查找TaskInProgress。
2)JobInProgress的obtainNewReduceTask()是用来分配reduce task的,通过调用findNewReduceTask(),从nonRunningReduces查找TaskInProgress。

FindNewMapTask() & FindNewReduceTask()方法

  /**
   * Find new map task
   * @param tts The task tracker that is asking for a task
   * @param clusterSize The number of task trackers in the cluster
   * @param numUniqueHosts The number of hosts that run task trackers
   * @param avgProgress The average progress of this kind of task in this job
   * @param maxCacheLevel The maximum topology level until which to schedule
   *                      maps. 
   *                      A value of {@link #anyCacheLevel} implies any 
   *                      available task (node-local, rack-local, off-switch and 
   *                      speculative tasks).
   *                      A value of {@link #NON_LOCAL_CACHE_LEVEL} implies only
   *                      off-switch/speculative tasks should be scheduled.
   * @return the index in tasks of the selected task (or -1 for no task)
   */
  private synchronized int findNewMapTask(final TaskTrackerStatus tts, 
                                          final int clusterSize,
                                          final int numUniqueHosts,
                                          final int maxCacheLevel,
                                          final double avgProgress) {
    .....

    String taskTracker = tts.getTrackerName();
    TaskInProgress tip = null;
    
    .....
    
    // When scheduling a map task:
    //  0) Schedule a failed task without considering locality
    //  1) Schedule non-running tasks
    //  2) Schedule speculative tasks
    //  3) Schedule tasks with no location information

    // First a look up is done on the non-running cache and on a miss, a look 
    // up is done on the running cache. The order for lookup within the cache:
    //   1. from local node to root [bottom up]
    //   2. breadth wise for all the parent nodes at max level
    // We fall to linear scan of the list ((3) above) if we have misses in the above caches


    // 0) Schedule the task with the most failures, unless failure was on this machine
    tip = findTaskFromList(failedMaps, tts, numUniqueHosts, false);
    if (tip != null) {
      // Add to the running list
      scheduleMap(tip);
      LOG.info("Choosing a failed task " + tip.getTIPId());
      return tip.getIdWithinJob();
    }

    Node node = jobtracker.getNode(tts.getHost());
    
    //
    // 1) Non-running TIP :
    // 

    // 1. check from local node to the root [bottom up cache lookup]
    //    i.e if the cache is available and the host has been resolved
    //    (node!=null)
    if (node != null) {
      Node key = node;
      int level = 0;
      // maxCacheLevel might be greater than this.maxLevel if findNewMapTask is
      // called to schedule any task (local, rack-local, off-switch or
      // speculative) tasks or it might be NON_LOCAL_CACHE_LEVEL (i.e. -1) if
      // findNewMapTask is (i.e. -1) if findNewMapTask is to only schedule
      // off-switch/speculative tasks
      int maxLevelToSchedule = Math.min(maxCacheLevel, maxLevel);
      for (level = 0;level < maxLevelToSchedule; ++level) {
        List <TaskInProgress> cacheForLevel = nonRunningMapCache.get(key);
        if (cacheForLevel != null) {
          tip = findTaskFromList(cacheForLevel, tts, 
              numUniqueHosts,level == 0);
          if (tip != null) {
            // Add to running cache
            scheduleMap(tip);

            // remove the cache if its empty
            if (cacheForLevel.size() == 0) {
              nonRunningMapCache.remove(key);
            }

            return tip.getIdWithinJob();
          }
        }
        key = key.getParent();
      }
      
      // Check if we need to only schedule a local task (node-local/rack-local)
      if (level == maxCacheLevel) {
        return -1;
      }
    }

    //2. Search breadth-wise across parents at max level for non-running 
    //   TIP if
    //     - cache exists and there is a cache miss 
    //     - node information for the tracker is missing (tracker's topology
    //       info not obtained yet)

    // collection of node at max level in the cache structure
    Collection<Node> nodesAtMaxLevel = jobtracker.getNodesAtMaxLevel();

    // get the node parent at max level
    Node nodeParentAtMaxLevel = 
      (node == null) ? null : JobTracker.getParentNode(node, maxLevel - 1);
    
    for (Node parent : nodesAtMaxLevel) {

      // skip the parent that has already been scanned
      if (parent == nodeParentAtMaxLevel) {
        continue;
      }

      List<TaskInProgress> cache = nonRunningMapCache.get(parent);
      if (cache != null) {
        tip = findTaskFromList(cache, tts, numUniqueHosts, false);
        if (tip != null) {
          // Add to the running cache
          scheduleMap(tip);

          // remove the cache if empty
          if (cache.size() == 0) {
            nonRunningMapCache.remove(parent);
          }
          LOG.info("Choosing a non-local task " + tip.getTIPId());
          return tip.getIdWithinJob();
        }
      }
    }

    // 3. Search non-local tips for a new task
    tip = findTaskFromList(nonLocalMaps, tts, numUniqueHosts, false);
    if (tip != null) {
      // Add to the running list
      scheduleMap(tip);

      LOG.info("Choosing a non-local task " + tip.getTIPId());
      return tip.getIdWithinJob();
    }

    //
    // 2) Running TIP :
    // 
 
    if (hasSpeculativeMaps) {
      long currentTime = jobtracker.getClock().getTime();

      // 1. Check bottom up for speculative tasks from the running cache
      if (node != null) {
        Node key = node;
        for (int level = 0; level < maxLevel; ++level) {
          Set<TaskInProgress> cacheForLevel = runningMapCache.get(key);
          if (cacheForLevel != null) {
            tip = findSpeculativeTask(cacheForLevel, tts, 
                                      avgProgress, currentTime, level == 0);
            if (tip != null) {
              if (cacheForLevel.size() == 0) {
                runningMapCache.remove(key);
              }
              return tip.getIdWithinJob();
            }
          }
          key = key.getParent();
        }
      }

      // 2. Check breadth-wise for speculative tasks
      
      for (Node parent : nodesAtMaxLevel) {
        // ignore the parent which is already scanned
        if (parent == nodeParentAtMaxLevel) {
          continue;
        }

        Set<TaskInProgress> cache = runningMapCache.get(parent);
        if (cache != null) {
          tip = findSpeculativeTask(cache, tts, avgProgress, 
                                    currentTime, false);
          if (tip != null) {
            // remove empty cache entries
            if (cache.size() == 0) {
              runningMapCache.remove(parent);
            }
            LOG.info("Choosing a non-local task " + tip.getTIPId() 
                     + " for speculation");
            return tip.getIdWithinJob();
          }
        }
      }

      // 3. Check non-local tips for speculation
      tip = findSpeculativeTask(nonLocalRunningMaps, tts, avgProgress, 
                                currentTime, false);
      if (tip != null) {
        LOG.info("Choosing a non-local task " + tip.getTIPId() 
                 + " for speculation");
        return tip.getIdWithinJob();
      }
    }
    
    return -1;
  }
(此处有点没看明白,回头再琢磨琢磨!尴尬)

TaskTracker接收JobTracker的HeartbeatResponse

TaskTracker向JobTracker发送heartbeat后,如果Jobtracker返回的heartbeatreponse中含有分配好的任务LaunchTaskAction,
TaskTracker则调用addToTaskQueue方法,将其加入TaskTracker类中MapLauncher或者ReduceLauncher对象的taskToLaunch队列。
在此,MapLauncher和ReduceLauncher对象均为TaskLauncher类的实例。
该类是TaskTracker类的一个内部类,具有一个数据成员,是TaskTracker.TaskInProgress类型的队列。
需要特别注意的是,在TaskTracker类内部所提到的TaskInProgress类均为TaskTracker的内部类要与mapred包中的TaskInProgress类区分开来!
在此我们用TaskTracker.TaskInProgress表示。如果heartbeatresponse中包含的任务是map task则放入mapLancher的taskToLaunch队列,如果是reduce task则放入reduceLancher的taskToLaunch队列:
下面我们看一下TaskTracker的addToTaskQueue方法:
  private void addToTaskQueue(LaunchTaskAction action) {
    if (action.getTask().isMapTask()) {
      // 调用TaskLauncher的addToTaskQueue方法
      mapLauncher.addToTaskQueue(action);
    } else {
      reduceLauncher.addToTaskQueue(action);
    }
  }
TaskLauncher的addToTaskQueue方法:
    public void addToTaskQueue(LaunchTaskAction action) {
      synchronized (tasksToLaunch) {
        TaskInProgress tip = registerTask(action, this);
        tasksToLaunch.add(tip);
        tasksToLaunch.notifyAll();
      }
    }
TaskTracker的registerTask方法:
  private TaskInProgress registerTask(LaunchTaskAction action, 
      TaskLauncher launcher) {
    Task t = action.getTask();
    LOG.info("LaunchTaskAction (registerTask): " + t.getTaskID() +
             " task's state:" + t.getState());
    TaskInProgress tip = new TaskInProgress(t, this.fConf, launcher);
    synchronized (this) {
      // 通知其他程序
      tasks.put(t.getTaskID(), tip);
      runningTasks.put(t.getTaskID(), tip);
      boolean isMap = t.isMapTask();
      if (isMap) {
        mapTotal++;
      } else {
        reduceTotal++;
      }
    }
    return tip;
  }
TaskLauncher类是一个Thread类,它们各自都以一个线程(即有两个线程mapTaskLauncher和reduceTaskLauncher)独立运行。
它们的启动在TaskTracker初始化过程中已经完成。该类的run函数会不断监测taskToLaunch队列中是否有新的TaskTracker.TaskInProgress对象加入。
如果有则从中取出一个对象,然后调用TaskTracker类的startNewTask(TaskInProgress tip)来启动一个task,其又主要调用了localizeJob(TaskInProgresstip),
该函数的工作就是实现本地化。
TaskLauncher的run()函数:
    public void run() {
      while (!Thread.interrupted()) {
        try {
          TaskInProgress tip;
          Task task;
          synchronized (tasksToLaunch) {
            // 不断监测tasksToLaunch队列
            while (tasksToLaunch.isEmpty()) {
              tasksToLaunch.wait();
            }
            //get the TIP
            tip = tasksToLaunch.remove(0);
            task = tip.getTask();
            LOG.info("Trying to launch : " + tip.getTask().getTaskID() + 
                     " which needs " + task.getNumSlotsRequired() + " slots");
          }
          //wait for free slots to run
          .....

          //got a free slot. launch the task
          startNewTask(tip);
       .....
      }
    }
在startNewTask中,会为每个task创建一个线程launchThread。TaskTracker的startNewTask()方法:
  void startNewTask(final TaskInProgress tip) throws InterruptedException {
    Thread launchThread = new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          RunningJob rjob = localizeJob(tip);
          tip.getTask().setJobFile(rjob.getLocalizedJobConf().toString());
          // Localization is done. Neither rjob.jobConf nor rjob.ugi can be null
          launchTaskForJob(tip, new JobConf(rjob.getJobConf()), rjob); 
        } catch (Throwable e) {
        .....
      }
    });
   
    launchThread.start();
  }
task的本地化,TaskTracker的localizeJob()方法通过调用TaskTracker.initializeJob()来实现的。将与该Task相关的文件从HDFS拷贝的TaskTracker的本地文件系统中:job.split,job.xml以及job.jar。完成task的本地化后,通过调用launchTaskForJob()来启动task, 它又调用了TaskTracker.TaskInProgress的launchTask()方法。
    public synchronized void launchTask(RunningJob rjob) throws IOException {
      if (this.taskStatus.getRunState() == TaskStatus.State.UNASSIGNED ||
          this.taskStatus.getRunState() == TaskStatus.State.FAILED_UNCLEAN ||
          this.taskStatus.getRunState() == TaskStatus.State.KILLED_UNCLEAN) {
        // 
        localizeTask(task);
        if (this.taskStatus.getRunState() == TaskStatus.State.UNASSIGNED) {
          this.taskStatus.setRunState(TaskStatus.State.RUNNING);
        }
        // 创建一个TaskRunner对象,TaskRunner是一个抽象类,并且继承了Thread类
        // 实际运行时是MapTaskRunner 或者 ReduceTaskRunner
        setTaskRunner(task.createRunner(TaskTracker.this, this, rjob));
        // 调用MapTaskRunner或者ReduceTaskRunner的run()方法
        this.runner.start();
       .....
    }
TaskRunner的run()方法:
  @Override
  public final void run() {
    
     // 准备运行时所需要的参数(环境变量)包括classpath, workdir, child JVM args等
     .....
      // 启动一个子进程,创建一个新的JVM来执行新的程序(map或者reduce)
      launchJvmAndWait(setupCmds, vargs, stdout, stderr, logSize, workDir);
     .....
  }


  void launchJvmAndWait(List <String> setup, Vector<String> vargs, File stdout,
      File stderr, long logSize, File workDir)
      throws InterruptedException, IOException {
    jvmManager.launchJvm(this, jvmManager.constructJvmEnv(setup, vargs, stdout,
        stderr, logSize, workDir, conf));
    synchronized (lock) {
      while (!done) {
        lock.wait();
      }
    }
接下来就是创建JVM的过程:
JvmManager.launchJvm() -----> JvmManagerForType.reapJvm() ----->JvmManagerForType.reapJvm() ----->JvmManagerForType.spawnNewJvm()
----->JvmManagerForType.JvmRunner.run()----->JvmManagerForType.JvmRunner.runChild()----->TaskTracker.getTaskController().launchTask()
-----> ..... ----->Child.main()
  public static void main(String[] args) throws Throwable {
   
    .....

    try {
      while (true) {
        taskid = null;
        currentJobSegmented = true;
        // 通过网络通信从TaskTracker获取JvmTask对象
        JvmTask myTask = umbilical.getTask(context);
        
        .....
        
        // Create a final reference to the task for the doAs block
        final Task taskFinal = task;
        childUGI.doAs(new PrivilegedExceptionAction<Object>() {
          @Override
          public Object run() throws Exception {
            try {
              // use job-specified working directory
              FileSystem.get(job).setWorkingDirectory(job.getWorkingDirectory());
              // 调用MapTask或者ReduceTask对象的run方法
              taskFinal.run(job, umbilical);             // run the task
            } finally {
             .....
            }

            return null;
          }
        });

     .....
  }
MapTask.run()方法
  @Override
  public void run(final JobConf job, final TaskUmbilicalProtocol umbilical)
    throws IOException, ClassNotFoundException, InterruptedException {
    // 用来与TaskTracker通信
    this.umbilical = umbilical;

    // 创建一个TaskReporter线程用来想TaskTracker汇报Task的progress
    TaskReporter reporter = new TaskReporter(getProgress(), umbilical);
    reporter.startCommunicationThread();
    boolean useNewApi = job.getUseNewMapper();
    // 用来初始化任务,主要是进行一些和任务输出相关的设置,
    // 比如创建commiter,设置工作目录等
    // 调用了org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize
    initialize(job, getJobID(), reporter, useNewApi);

    // 判断Task的类型,若不属于下列四种,则是运行真正的map task
    if (jobCleanup) {
      runJobCleanupTask(umbilical, reporter);
      return;
    }
    if (jobSetup) {
      runJobSetupTask(umbilical, reporter);
      return;
    }
    if (taskCleanup) {
      runTaskCleanupTask(umbilical, reporter);
      return;
    }

    if (useNewApi) {
      runNewMapper(job, splitMetaInfo, umbilical, reporter);
    } else {
      runOldMapper(job, splitMetaInfo, umbilical, reporter);
    }
    done(umbilical, reporter);
  }
下面我们以使用新的MapReduce API为例说明。
  private <INKEY,INVALUE,OUTKEY,OUTVALUE>
  void runNewMapper(final JobConf job,
                    final TaskSplitIndex splitIndex,
                    final TaskUmbilicalProtocol umbilical,
                    TaskReporter reporter
                    ) throws IOException, ClassNotFoundException,
                             InterruptedException {
    // make a task context so we can get the classes
    // TaskAttemptContext类继承于JobContext类,在JobContext类的基础上增加了一些有关task的信息。
    // 通过taskContext对象可以获得很多与任务执行相关的类,比如用户定义的Mapper类,InputFormat类等等 
    org.apache.hadoop.mapreduce.TaskAttemptContext taskContext =
      new org.apache.hadoop.mapreduce.TaskAttemptContext(job, getTaskID());
    // 创建用户自定义的Mapper对象
    org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE> mapper =
      (org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE>)
        ReflectionUtils.newInstance(taskContext.getMapperClass(), job);
    // make the input format
    org.apache.hadoop.mapreduce.InputFormat<INKEY,INVALUE> inputFormat =
      (org.apache.hadoop.mapreduce.InputFormat<INKEY,INVALUE>)
        ReflectionUtils.newInstance(taskContext.getInputFormatClass(), job);
    // rebuild the input split
    org.apache.hadoop.mapreduce.InputSplit split = null;
    split = getSplitDetails(new Path(splitIndex.getSplitLocation()),
        splitIndex.getStartOffset());
    // 根据InputFormat对象创建RecordReader对象,默认是org.apache.hadoop.mapreduce.lib.input.LineRecordReader
    org.apache.hadoop.mapreduce.RecordReader<INKEY,INVALUE> input =
      new NewTrackingRecordReader<INKEY,INVALUE>
          (split, inputFormat, reporter, job, taskContext);

    job.setBoolean("mapred.skip.on", isSkipping());
    org.apache.hadoop.mapreduce.RecordWriter output = null;
    org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE>.Context 
         mapperContext = null;
    try {
      Constructor<org.apache.hadoop.mapreduce.Mapper.Context> contextConstructor =
        org.apache.hadoop.mapreduce.Mapper.Context.class.getConstructor
        (new Class[]{org.apache.hadoop.mapreduce.Mapper.class,
                     Configuration.class,
                     org.apache.hadoop.mapreduce.TaskAttemptID.class,
                     org.apache.hadoop.mapreduce.RecordReader.class,
                     org.apache.hadoop.mapreduce.RecordWriter.class,
                     org.apache.hadoop.mapreduce.OutputCommitter.class,
                     org.apache.hadoop.mapreduce.StatusReporter.class,
                     org.apache.hadoop.mapreduce.InputSplit.class});

      // get an output object
      if (job.getNumReduceTasks() == 0) {
         output =
           new NewDirectOutputCollector(taskContext, job, umbilical, reporter);
      } else {
        output = new NewOutputCollector(taskContext, job, umbilical, reporter);
      }

      mapperContext = contextConstructor.newInstance(mapper, job, getTaskID(),
                                                     input, output, committer,
                                                     reporter, split);
      // 初始化,在默认情况下调用的是LineRecordReader的initialize方法,主要是打开输入文件并且将文件指针指向文件头
      input.initialize(split, mapperContext);
      // 运行用户自定义的Mapper中的map()方法
      mapper.run(mapperContext);
      input.close();
      output.close(mapperContext);
    } catch (NoSuchMethodException e) {
      throw new IOException("Can't find Context constructor", e);
    } catch (InstantiationException e) {
      throw new IOException("Can't create Context", e);
    } catch (InvocationTargetException e) {
      throw new IOException("Can't invoke Context constructor", e);
    } catch (IllegalAccessException e) {
      throw new IOException("Can't invoke Context constructor", e);
    }
  }
Mapper.run方法:
  /**
   * Called once at the beginning of the task.
   */
  protected void setup(Context context
                       ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * Called once for each key/value pair in the input split. Most applications
   * should override this, but the default is the identity function.
   */
  @SuppressWarnings("unchecked")
  protected void map(KEYIN key, VALUEIN value, 
                     Context context) throws IOException, InterruptedException {
    context.write((KEYOUT) key, (VALUEOUT) value);
  }

  /**
   * Called once at the end of the task.
   */
  protected void cleanup(Context context
                         ) throws IOException, InterruptedException {
    // NOTHING
  }
  
  /**
   * Expert users can override this method for more complete control over the
   * execution of the Mapper.
   * @param context
   * @throws IOException
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKeyValue()) {
      // 用户自定义Mapper中的map函数
      map(context.getCurrentKey(), context.getCurrentValue(), context);
    }
    cleanup(context);
  }
Mapper.run()方法首先调用了setup方法,这个方法在Mapper类中实际上是什么也没有做。
用户通过可重写此方法让程序在执行map函数之前进行一些其他操作。
然后,程序将不断获取键值对交给map函数处理,也就是用户所希望进行的操作。
最后,程序调用cleanup函数。这个方法和setup一样,也是Mapper类的一个方法,但是实际上什么也没有做。用户可以重写此方法进行一些收尾工作。
至此,map的工作已经完成了。

ReduceTask的运行过程
  @Override
  @SuppressWarnings("unchecked")
  public void run(JobConf job, final TaskUmbilicalProtocol umbilical)
    throws IOException, InterruptedException, ClassNotFoundException {
    this.umbilical = umbilical;
    job.setBoolean("mapred.skip.on", isSkipping());

    // reduce task需要经历:copy,sort,reduce三个过程
    if (isMapOrReduce()) {
      copyPhase = getProgress().addPhase("copy");
      sortPhase  = getProgress().addPhase("sort");
      reducePhase = getProgress().addPhase("reduce");
    }
    // start thread that will handle communication with parent
    TaskReporter reporter = new TaskReporter(getProgress(), umbilical);
    reporter.startCommunicationThread();
    boolean useNewApi = job.getUseNewReducer();
    initialize(job, getJobID(), reporter, useNewApi);

    // check if it is a cleanupJobTask
    if (jobCleanup) {
      runJobCleanupTask(umbilical, reporter);
      return;
    }
    if (jobSetup) {
      runJobSetupTask(umbilical, reporter);
      return;
    }
    if (taskCleanup) {
      runTaskCleanupTask(umbilical, reporter);
      return;
    }
    
    // Initialize the codec
    codec = initCodec();

    boolean isLocal = "local".equals(job.get("mapred.job.tracker", "local"));
    if (!isLocal) {
      // ReduceCopier对象负责将Mapper的输出拷贝至Reducer所在机器
      reduceCopier = new ReduceCopier(umbilical, job, reporter);
      if (!reduceCopier.fetchOutputs()) {
        if(reduceCopier.mergeThrowable instanceof FSError) {
          throw (FSError)reduceCopier.mergeThrowable;
        }
        throw new IOException("Task: " + getTaskID() + 
            " - The reduce copier failed", reduceCopier.mergeThrowable);
      }
    }
    copyPhase.complete();                         // copy is already complete
    setPhase(TaskStatus.Phase.SORT);
    statusUpdate(umbilical);

    final FileSystem rfs = FileSystem.getLocal(job).getRaw();
    //根据JobTracker是否在本地来决定调用哪种排序方式
    RawKeyValueIterator rIter = isLocal
      ? Merger.merge(job, rfs, job.getMapOutputKeyClass(),
          job.getMapOutputValueClass(), codec, getMapFiles(rfs, true),
          !conf.getKeepFailedTaskFiles(), job.getInt("io.sort.factor", 100),
          new Path(getTaskID().toString()), job.getOutputKeyComparator(),
          reporter, spilledRecordsCounter, null)
      : reduceCopier.createKVIterator(job, rfs, reporter);
        
    // free up the data structures
    mapOutputFilesOnDisk.clear();
    
    sortPhase.complete();                         // sort is complete
    setPhase(TaskStatus.Phase.REDUCE); 
    statusUpdate(umbilical);
    Class keyClass = job.getMapOutputKeyClass();
    Class valueClass = job.getMapOutputValueClass();
    RawComparator comparator = job.getOutputValueGroupingComparator();

    if (useNewApi) {
      runNewReducer(job, umbilical, reporter, rIter, comparator, 
                    keyClass, valueClass);
    } else {
      runOldReducer(job, umbilical, reporter, rIter, comparator, 
                    keyClass, valueClass);
    }
    done(umbilical, reporter);
  }

  @SuppressWarnings("unchecked")
  private <INKEY,INVALUE,OUTKEY,OUTVALUE>
  void runNewReducer(JobConf job,
                     final TaskUmbilicalProtocol umbilical,
                     final TaskReporter reporter,
                     RawKeyValueIterator rIter,
                     RawComparator<INKEY> comparator,
                     Class<INKEY> keyClass,
                     Class<INVALUE> valueClass
                     ) throws IOException,InterruptedException, 
                              ClassNotFoundException {
    // wrap value iterator to report progress.
    final RawKeyValueIterator rawIter = rIter;
    rIter = new RawKeyValueIterator() {
      public void close() throws IOException {
        rawIter.close();
      }
      public DataInputBuffer getKey() throws IOException {
        return rawIter.getKey();
      }
      public Progress getProgress() {
        return rawIter.getProgress();
      }
      public DataInputBuffer getValue() throws IOException {
        return rawIter.getValue();
      }
      public boolean next() throws IOException {
        boolean ret = rawIter.next();
        reducePhase.set(rawIter.getProgress().get());
        reporter.progress();
        return ret;
      }
    };
    // make a task context so we can get the classes
    org.apache.hadoop.mapreduce.TaskAttemptContext taskContext =
      new org.apache.hadoop.mapreduce.TaskAttemptContext(job, getTaskID());
    // make a reducer
    org.apache.hadoop.mapreduce.Reducer<INKEY,INVALUE,OUTKEY,OUTVALUE> reducer =
      (org.apache.hadoop.mapreduce.Reducer<INKEY,INVALUE,OUTKEY,OUTVALUE>)
        ReflectionUtils.newInstance(taskContext.getReducerClass(), job);
     org.apache.hadoop.mapreduce.RecordWriter<OUTKEY,OUTVALUE> trackedRW = 
       new NewTrackingRecordWriter<OUTKEY, OUTVALUE>(reduceOutputCounter,
         job, reporter, taskContext);
    job.setBoolean("mapred.skip.on", isSkipping());
    org.apache.hadoop.mapreduce.Reducer.Context 
         reducerContext = createReduceContext(reducer, job, getTaskID(),
                                               rIter, reduceInputKeyCounter,
                                               reduceInputValueCounter, 
                                               trackedRW, committer,
                                               reporter, comparator, keyClass,
                                               valueClass);
    reducer.run(reducerContext);
    trackedRW.close(reducerContext);
  }

  /**
   * Called once at the start of the task.
   */
  protected void setup(Context context
                       ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * This method is called once for each key. Most applications will define
   * their reduce class by overriding this method. The default implementation
   * is an identity function.
   */
  @SuppressWarnings("unchecked")
  protected void reduce(KEYIN key, Iterable<VALUEIN> values, Context context
                        ) throws IOException, InterruptedException {
    for(VALUEIN value: values) {
      context.write((KEYOUT) key, (VALUEOUT) value);
    }
  }

  /**
   * Called once at the end of the task.
   */
  protected void cleanup(Context context
                         ) throws IOException, InterruptedException {
    // NOTHING
  }

  /**
   * Advanced application writers can use the 
   * {@link #run(org.apache.hadoop.mapreduce.Reducer.Context)} method to
   * control how the reduce task works.
   */
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    while (context.nextKey()) {
      reduce(context.getCurrentKey(), context.getValues(), context);
    }
    cleanup(context);
  }
}
与Mapper.run()方法类似。该函数也是先调用setup函数,该函数默认是什么都不做,但是用户可以通过重写此函数来在运行reduce函数之前做一些初始化工作。
然后程序会不断读取输入数据,交给reduce函数处理。这里的reduce函数就是用户自定义的的reduce函数。
最后调用cleanup函数。默认的cleanup函数是没有做任何事情,但是用户可以通过重写此函数来进行一些收尾工作。















评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值