1、Task运行过程概述
在MapReduce计算框架中,一个应用程序被划分成Map和Reduce两个计算阶段,它们分别由一个或者多个Map Task和Reduce Task组成。其中,每个Map Task处理输入数据集合中的一片数据(InputSplit),并将产生的若干个数据片段写到本地磁盘上,而Reduce Task则从每个Map Task上远程拷贝相应的数据片段,经分组聚集和归约后,将结果写到HDFS上作为最终结果,如下图所示:
总体上看,Map Task与Reduce Task之间的数据传输采用了pull模型。为了能够容错,Map Task将中间计算结果存放到本地磁盘上,而Reduce Task则通过Http请求从各个Map Task端拖取(pull)相应的输入数据。为了更好地支持大量Reduce Task并发从Map Task端拷贝数据,Hadoop采用了Jetty Server作为HTTP Server处理并发数据读请求。
对于Map Task而言,它的执行过程可概述为:首先,通过用户提供的InputFormat将对应的InputSplit解析成一系列key/value,并依次交给用户编写的map()函数处理;接着按照指定的Partitioner对数据分片,以确定每个key/value将交给哪个Reduce Task处理;之后将数据交给用户定义的Combiner进行一次本地规约(用户没有定义则直接跳过);最后将处理结果保存到本地磁盘上。
对于Reduce Task而言,由于它的输入数据来自各个Map Task,因此首先需要通过HTTP请求从各个已经运行完成的Map Task上拷贝对应的数据分片,待所有数据拷贝完成后,再以key为关键字对所有数据进行排序,通过排序,key 相同的记录聚集到一起形成若干分组,然后将每组数据交给用户编写的reduce()函数处理,并将数据结果直接写到HDFS上作为最终输出结果。
这里插一句,当我在想这个流程的时候,有几个问题一直困扰着我。。。
Map Task如何启动的?
Map Task的输入从哪里来?
Reduce Task如何知道自己处理哪些数据?
自己查阅了相关资料,也有一些自己的理解,如果不正确,希望指出来。。。
1、首先Map Task如何启动
实际上,TaskTracker初始化时,会初始化并启动两个TaskLauncher类型的线程,mapLauncher,reduceLauncher。在TaskTracker从JobTracher获取到任务后,对应的会把任务添加到两个TaskLauncher的Queue中。
在org.apache.hadoop.mapred.TaskTracker.java中的initialize()函数中初始化并启动这两个线程
mapLauncher = new TaskLauncher(TaskType.MAP, maxMapSlots);
reduceLauncher = new TaskLauncher(TaskType.REDUCE, maxReduceSlots);
mapLauncher.start();
reduceLauncher.start();
TaskLauncher是一个线程,一直检查tasksToLaunch列表中是否有任务,有的情况下取出执行。
TaskLauncher是org.apache.hadoop.mapred.TaskTracker.java的一个内部类。。。
class TaskLauncher extends Thread {
private IntWritable numFreeSlots;
private final int maxSlots;
private List<TaskInProgress> tasksToLaunch;
public TaskLauncher(TaskType taskType, int numSlots) {
this.maxSlots = numSlots;
this.numFreeSlots = new IntWritable(numSlots);
this.tasksToLaunch = new LinkedList<TaskInProgress>();
setDaemon(true);
setName("TaskLauncher for " + taskType + " tasks");
}
public void addToTaskQueue(LaunchTaskAction action) {
synchronized (tasksToLaunch) {
TaskInProgress tip = registerTask(action, this);
tasksToLaunch.add(tip);
tasksToLaunch.notifyAll();
}
}
public void cleanTaskQueue() {
tasksToLaunch.clear();
}
public void addFreeSlots(int numSlots) {
synchronized (numFreeSlots) {
numFreeSlots.set(numFreeSlots.get() + numSlots);
assert (numFreeSlots.get() <= maxSlots);
LOG.info("addFreeSlot : current free slots : " + numFreeSlots.get());
numFreeSlots.notifyAll();
}
}
void notifySlots() {
synchronized (numFreeSlots) {
numFreeSlots.notifyAll();
}
}
int getNumWaitingTasksToLaunch() {
synchronized (tasksToLaunch) {
return tasksToLaunch.size();
}
}
public void run() {
while (!Thread.interrupted()) {
try {
TaskInProgress tip;
Task task;
synchronized (tasksToLaunch) {
while (tasksToLaunch.isEmpty()) {
tasksToLaunch.wait();
}
//get the TIP
tip = tasksToLaunch.remove(0);
task = tip.getTask();
LOG.info("Trying to launch : " + tip.getTask().getTaskID() +
" which needs " + task.getNumSlotsRequired() + " slots");
}
//wait for free slots to run
synchronized (numFreeSlots) {
boolean canLaunch = true;
while (numFreeSlots.get() < task.getNumSlotsRequired()) {
//Make sure that there is no kill task action for this task!
//We are not locking tip here, because it would reverse the
//locking order!
//Also, Lock for the tip is not required here! because :
// 1. runState of TaskStatus is volatile
// 2. Any notification is not missed because notification is
// synchronized on numFreeSlots. So, while we are doing the check,
// if the tip is half way through the kill(), we don't miss
// notification for the following wait().
if (!tip.canBeLaunched()) {
//got killed externally while still in the launcher queue
LOG.info("Not blocking slots for " + task.getTaskID()
+ " as it got killed externally. Task's state is "
+ tip.getRunState());
canLaunch = false;
break;
}
LOG.info("TaskLauncher : Waiting for " + task.getNumSlotsRequired() +
" to launch " + task.getTaskID() + ", currently we have " +
numFreeSlots.get() + " free slots");
numFreeSlots.wait();
}
if (!canLaunch) {
continue;
}
LOG.info("In TaskLauncher, current free slots : " + numFreeSlots.get()+
" and trying to launch "+tip.getTask().getTaskID() +
" which needs " + task.getNumSlotsRequired() + " slots");
numFreeSlots.set(numFreeSlots.get() - task.getNumSlotsRequired());
assert (numFreeSlots.get() >= 0);
}
synchronized (tip) {
//to make sure that there is no kill task action for this
if (!tip.canBeLaunched()) {
//got killed externally while still in the launcher queue
LOG.info("Not launching task " + task.getTaskID() + " as it got"
+ " killed externally. Task's state is " + tip.getRunState());
addFreeSlots(task.getNumSlotsRequired());
continue;
}
tip.slotTaken = true;
}