_00008 A Brief Look at the Hadoop TaskTracker Source Code

Latest recommended article published on 2019-09-09 15:43:17

那伊抹微笑

Article link: https://blog.csdn.net/u012185296/article/details/20662645

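Author: 妳那伊抹微笑
Signature: The farthest distance in the world is not the ends of the earth, but that I stand in front of you and you cannot sense that I exist
Technical focus: Flume+Kafka+Storm+Redis/Hbase+Hadoop+Hive+Mahout+Spark ... cloud computing technologies
Repost notice: Reposting is allowed, but you must cite the original source, the author, and this copyright notice with a hyperlink. Thank you for your cooperation!
QQ group: 214293307 云计算之嫣然伊笑

(Looking forward to learning and improving together with you)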

Here is a flowchart of the client (a rough sketch I drew by hand)

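# The TaskTracker's main responsibilities

1. Periodically send heartbeat messages to the JobTracker. Each heartbeat states whether the TaskTracker is asking for a new task, and the response carries the tasks the JobTracker assigns.

2. If the JobTracker has assigned a task, execute it.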
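The two responsibilities above can be pictured as one loop. The following is only a toy model under my own assumptions, not Hadoop code: `HeartbeatSketch`, `FakeJobTracker`, and the string task names are all hypothetical, and the "JobTracker" here lives in the same process instead of behind an RPC call.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Toy model of the heartbeat cycle. None of these types exist in Hadoop.
class HeartbeatSketch {

    // Hypothetical stand-in for the JobTracker: hands out one pending task per request.
    static class FakeJobTracker {
        private final Queue<String> pending = new ArrayDeque<>();

        FakeJobTracker(List<String> tasks) {
            pending.addAll(tasks);
        }

        // One heartbeat: returns the tasks assigned in the response (possibly none).
        List<String> heartbeat(boolean askForNewTask) {
            List<String> assigned = new ArrayList<>();
            if (askForNewTask && !pending.isEmpty()) {
                assigned.add(pending.poll());
            }
            return assigned;
        }
    }

    // Sends `beats` heartbeats and "executes" whatever each response assigns.
    static List<String> runHeartbeats(FakeJobTracker jobTracker, int beats) {
        List<String> executed = new ArrayList<>();
        for (int i = 0; i < beats; i++) {
            List<String> assigned = jobTracker.heartbeat(true); // responsibility 1
            executed.addAll(assigned);                          // responsibility 2
        }
        return executed;
    }

    public static void main(String[] args) {
        FakeJobTracker jt = new FakeJobTracker(Arrays.asList("map_0", "map_1", "reduce_0"));
        System.out.println(runHeartbeats(jt, 5)); // prints [map_0, map_1, reduce_0]
    }
}
```

The real TaskTracker sleeps between heartbeats and runs tasks on separate threads; the sketch leaves both out to keep the request/response shape visible.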

# First, the class comment and the class declaration

     * TaskTracker is a process that starts and tracks MR Tasks
     * in a networked environment. It contacts the JobTracker
     * for Task assignments and reporting results.

    public class TaskTracker implements MRConstants, TaskUmbilicalProtocol,
        Runnable, TaskTrackerMXBean {

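In other words, the TaskTracker is a process that starts and tracks MapReduce tasks in a networked environment, and it contacts the JobTracker to get task assignments and to report results.

This class implements the Runnable interface, so its main loop lives in run(). You know the drill, heh! ←_←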

# Start in the TaskTracker's main method and see what it does

    TaskTracker tt = new TaskTracker(conf);

# Step into the constructor

    int httpPort = infoSocAddr.getPort();
    this.server = new HttpServer("task", httpBindAddress, httpPort,
        httpPort == 0, conf, aclsManager.getAdminsAcl());
    workerThreads = conf.getInt("tasktracker.http.threads", 40);
    server.setThreads(1, workerThreads);
    // let the jsp pages get to the task tracker, config, and other relevant
    // objects
    FileSystem local = FileSystem.getLocal(conf);
    this.localDirAllocator = new LocalDirAllocator("mapred.local.dir");
    Class<? extends TaskController> taskControllerClass =
        conf.getClass("mapred.task.tracker.task-controller",
            DefaultTaskController.class, TaskController.class);
    fConf = new JobConf(conf);
    localStorage = new LocalStorage(fConf.getLocalDirs());
    localStorage.checkDirs();
    taskController =
        (TaskController) ReflectionUtils.newInstance(taskControllerClass, fConf);
    taskController.setup(localDirAllocator, localStorage);
    lastNumFailures = localStorage.numFailures();
    // create user log manager
    setUserLogManager(new UserLogManager(conf, taskController));
    SecurityUtil.login(originalConf, TT_KEYTAB_FILE, TT_USER_NAME);
    initialize();

This sets up an embedded Jetty server so the TaskTracker can be reached over the web.
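The constructor also shows two configuration patterns worth isolating: an integer setting with a default (`conf.getInt("tasktracker.http.threads", 40)`) and a pluggable class looked up by name and instantiated reflectively (the task controller). Below is a minimal sketch of both ideas using plain `java.util.Properties` and core reflection. `PluggableControllerSketch`, `Controller`, and the two controller classes are hypothetical stand-ins; Hadoop's own `Configuration` and `ReflectionUtils` are not used here.

```java
import java.util.Properties;

// Hypothetical stand-ins for Hadoop's Configuration / ReflectionUtils pattern.
class PluggableControllerSketch {

    interface Controller {
        String name();
    }

    static class DefaultController implements Controller {
        public String name() { return "default"; }
    }

    static class CustomController implements Controller {
        public String name() { return "custom"; }
    }

    // Integer setting with a fallback, similar in spirit to conf.getInt(key, 40).
    static int getInt(Properties conf, String key, int defaultValue) {
        String value = conf.getProperty(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Reads a class name from the config, falls back to DefaultController,
    // and creates an instance through its no-argument constructor.
    static Controller load(Properties conf, String key) {
        String className = conf.getProperty(key, DefaultController.class.getName());
        try {
            Class<? extends Controller> cls =
                Class.forName(className).asSubclass(Controller.class);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(getInt(conf, "http.threads", 40));      // no value set, so 40
        System.out.println(load(conf, "task-controller").name());  // falls back to "default"
        conf.setProperty("task-controller", CustomController.class.getName());
        System.out.println(load(conf, "task-controller").name());  // now "custom"
    }
}
```

The point of the pattern is that swapping the implementation becomes a configuration change rather than a code change.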

# Step into the initialize method

    // set the num handlers to max*2 since canCommit may wait for the duration
    // of a heartbeat RPC
    this.taskReportServer = RPC.getServer(this, bindAddress,
        tmpPort, 2 * max, false, this.fConf, this.jobTokenSecretManager);
    this.taskReportServer.start();

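This starts an RPC server, taskReportServer. Who calls it? The child task processes do, through the TaskUmbilicalProtocol interface that TaskTracker implements.

# Fetch map task completion events; mapLauncher and reduceLauncher start running to handle MapReduce tasks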

    // start the thread that will fetch map task completion events
    this.mapEventsFetcher = new MapEventsFetcherThread();
    mapEventsFetcher.setDaemon(true);
    mapEventsFetcher.setName(
        "Map-events fetcher for all reduce tasks " + "on " +
        taskTrackerName);
    mapEventsFetcher.start();
    mapLauncher = new TaskLauncher(TaskType.MAP, maxMapSlots);
    reduceLauncher = new TaskLauncher(TaskType.REDUCE, maxReduceSlots);
    mapLauncher.start();
    reduceLauncher.start();

# Back in TaskTracker's main method, step into tt.run() and see what it does (the run method)

    public void run() {
      try {
        getUserLogManager().start();
        startCleanupThreads();
        boolean denied = false;
        while (running && !shuttingDown) {
          boolean staleState = false;
          try {
            // This while-loop attempts reconnects if we get network errors
            while (running && !staleState && !shuttingDown && !denied) {
              try {
                State osState = offerService();
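The nested loops in this excerpt are easier to follow in isolation. The sketch below is my own reduction, not the Hadoop code: the inner loop keeps calling a service and retries after network errors, a stale state breaks out to the outer loop (which would re-initialize), and a denied state stops everything. The enum values `NORMAL`, `STALE`, and `DENIED`, the `Service` interface, and the `maxRounds` bound are assumptions added so the example terminates.

```java
import java.io.IOException;

// Reduction of the run() retry structure; all names are hypothetical.
class RunLoopSketch {

    enum State { NORMAL, STALE, DENIED }

    // Plays the role of offerService(): one call returns one State.
    interface Service {
        State offer() throws IOException;
    }

    // Returns how many times the outer loop had to re-initialize after a stale state.
    static int run(Service service, int maxRounds) {
        int reinits = 0;
        int rounds = 0;
        boolean denied = false;
        while (!denied && rounds < maxRounds) {
            boolean staleState = false;
            // Inner loop: keep serving, retrying after network errors.
            while (!staleState && !denied && rounds < maxRounds) {
                rounds++;
                try {
                    State state = service.offer();
                    if (state == State.STALE) {
                        staleState = true;
                    } else if (state == State.DENIED) {
                        denied = true;
                    }
                } catch (IOException e) {
                    // Network error: fall through and try again.
                }
            }
            if (staleState) {
                reinits++; // the real code would re-initialize local state here
            }
        }
        return reinits;
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        Service flaky = () -> {
            int n = calls[0]++;
            if (n == 0) throw new IOException("simulated network error");
            return n == 1 ? State.STALE : State.DENIED;
        };
        System.out.println(run(flaky, 10)); // prints 1
    }
}
```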

# Inside run, offerService() does the real work, so let's look at offerService() next (main code only)

    // If the TaskTracker is just starting up:
    // 1. Verify the versions matches with the JobTracker
    // 2. Get the system directory & filesystem
    if (justInited) {
      String jtBuildVersion = jobClient.getBuildVersion();
      String jtVersion = jobClient.getVIVersion();

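Here jobClient is of type InterTrackerProtocol, an interface that JobTracker implements. In other words, the TaskTracker acts as an RPC client here and remotely calls methods on the RPC server, the JobTracker, to obtain these versions. As the comment's first item says, they are then verified against the JobTracker's version.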
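A start-up version check of this kind can be sketched as follows. This is a simplified illustration that compares a single build string for exact equality; the actual comparison rule in Hadoop is not shown in the excerpt, so treat the equality test as my assumption. `VersionCheckSketch` and `TrackerProtocol` are hypothetical names, with `TrackerProtocol` standing in for the slice of InterTrackerProtocol used above.

```java
// Hypothetical fail-fast version check between a client and a remote tracker.
class VersionCheckSketch {

    // Stand-in for the part of the tracker protocol that reports a build version.
    interface TrackerProtocol {
        String getBuildVersion();
    }

    // Throws if the two sides report different build versions (assumed rule: exact match).
    static void verify(String localBuildVersion, TrackerProtocol jobTracker) {
        String remote = jobTracker.getBuildVersion();
        if (!localBuildVersion.equals(remote)) {
            throw new IllegalStateException(
                "Version mismatch: local " + localBuildVersion + ", JobTracker " + remote);
        }
    }

    public static void main(String[] args) {
        verify("1.2.1", () -> "1.2.1"); // same version: passes silently
        try {
            verify("1.2.1", () -> "1.0.4");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Checking once at start-up, before any heartbeat is sent, keeps incompatible nodes from ever joining the cluster's task flow.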

# Here is the key part: sending the heartbeat

    // Send the heartbeat and process the jobtracker's directives
    HeartbeatResponse heartbeatResponse = transmitHeartBeat(now);

# Step into transmitHeartBeat(now) and see what it does

    // Check if we should ask for a new Task
    boolean askForNewTask;
    long localMinSpaceStart;
    synchronized (this) {
      askForNewTask =
        ((status.countOccupiedMapSlots() < maxMapSlots ||
          status.countOccupiedReduceSlots() < maxReduceSlots) &&
         acceptNewTasks);
      localMinSpaceStart = minSpaceStart;
    }
    if (askForNewTask) {

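The TaskTracker first wraps its own status in an object: taskTrackerName, localHostname, maxMapSlots, maxReduceSlots, and so on.

It then decides whether it should ask for a new task (boolean askForNewTask).

If the occupied map slots or the occupied reduce slots are below their maximum and the TaskTracker is accepting new tasks, the value of minSpaceStart then determines whether a new task can actually be taken. minSpaceStart is set by the mapred.local.dir.minspacestart property and defaults to 0. If minSpaceStart is less than the free disk space, a task can be requested; otherwise it cannot.

The time at which the message was generated is also recorded.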
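The decision described above reduces to a small predicate. The sketch below restates it as a pure function so the cases are easy to check; it is not the Hadoop implementation. `AskForNewTaskSketch` and its parameter names are hypothetical, and treating a threshold of 0 as "skip the disk check" is my assumption about how the default behaves.

```java
// Pure-function restatement of the "should we ask for a new task?" decision.
class AskForNewTaskSketch {

    static boolean askForNewTask(int occupiedMapSlots, int maxMapSlots,
                                 int occupiedReduceSlots, int maxReduceSlots,
                                 boolean acceptNewTasks,
                                 long freeSpaceBytes, long minSpaceStartBytes) {
        // A free slot of either kind is enough.
        boolean hasFreeSlot = occupiedMapSlots < maxMapSlots
                || occupiedReduceSlots < maxReduceSlots;
        if (!(hasFreeSlot && acceptNewTasks)) {
            return false;
        }
        // Assumption: a threshold of 0 disables the disk-space check.
        if (minSpaceStartBytes == 0) {
            return true;
        }
        return minSpaceStartBytes < freeSpaceBytes;
    }

    public static void main(String[] args) {
        System.out.println(askForNewTask(1, 2, 2, 2, true, 100L, 0L));   // true: a map slot is free
        System.out.println(askForNewTask(2, 2, 2, 2, true, 100L, 0L));   // false: all slots occupied
        System.out.println(askForNewTask(0, 2, 0, 2, true, 50L, 100L));  // false: not enough free disk
    }
}
```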

# Once the heartbeatResponse object comes back, tasks can start; it carries the job ID

    HeartbeatResponse heartbeatResponse = jobClient.heartbeat(status,
                                                              justStarted,
                                                              justInited,
                                                              askForNewTask,
                                                              heartbeatResponseId);

# Get the actions here, that is, the things to process

    TaskTrackerAction[] actions = heartbeatResponse.getActions();

# Loop over the actions

    if (actions != null) {
      for (TaskTrackerAction action : actions) {
        if (action instanceof LaunchTaskAction) {
          addToTaskQueue((LaunchTaskAction) action);
        } else if (action instanceof CommitTaskAction) {
          CommitTaskAction commitAction = (CommitTaskAction) action;
          if (!commitResponses.contains(commitAction.getTaskID())) {
            LOG.info("Received commit task action for " +
                     commitAction.getTaskID());
            commitResponses.add(commitAction.getTaskID());
          }
        } else {
          addActionToCleanup(action);
        }
      }
    }
    markUnresponsiveTasks();
    killOverflowingTasks();
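The loop above is a type-based dispatch: launch actions go to a launch queue, commit actions are recorded once per task ID, and everything else is queued for cleanup. Here is a self-contained sketch of that routing. The `Action` hierarchy and the three collections are hypothetical simplifications of TaskTrackerAction and the TaskTracker's fields, not the real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified routing of heartbeat actions by type; all types here are made up.
class ActionDispatchSketch {

    static class Action { }

    static class LaunchAction extends Action {
        final String taskId;
        LaunchAction(String taskId) { this.taskId = taskId; }
    }

    static class CommitAction extends Action {
        final String taskId;
        CommitAction(String taskId) { this.taskId = taskId; }
    }

    static class KillAction extends Action { }

    final List<LaunchAction> launchQueue = new ArrayList<>();
    final List<String> commitResponses = new ArrayList<>();
    final List<Action> cleanupQueue = new ArrayList<>();

    void dispatch(Action[] actions) {
        if (actions == null) {
            return; // a heartbeat response may carry no actions
        }
        for (Action action : actions) {
            if (action instanceof LaunchAction) {
                launchQueue.add((LaunchAction) action);
            } else if (action instanceof CommitAction) {
                String id = ((CommitAction) action).taskId;
                if (!commitResponses.contains(id)) {
                    commitResponses.add(id); // record each commit only once
                }
            } else {
                cleanupQueue.add(action); // anything else is treated as cleanup work
            }
        }
    }

    public static void main(String[] args) {
        ActionDispatchSketch tracker = new ActionDispatchSketch();
        tracker.dispatch(new Action[] {
            new LaunchAction("task_1"), new CommitAction("task_2"),
            new CommitAction("task_2"), new KillAction() });
        System.out.println(tracker.launchQueue.size() + " "
            + tracker.commitResponses.size() + " " + tracker.cleanupQueue.size()); // prints 1 1 1
    }
}
```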

# Step into addToTaskQueue((LaunchTaskAction)action); this is where map and reduce tasks are added to mapLauncher or reduceLauncher (both threads were started during initialization)

    private void addToTaskQueue(LaunchTaskAction action) {
      if (action.getTask().isMapTask()) {
        mapLauncher.addToTaskQueue(action);
      } else {
        reduceLauncher.addToTaskQueue(action);
      }
    }

# See what class mapLauncher is

    class TaskLauncher extends Thread {

So mapLauncher is an instance of TaskLauncher, which extends Thread and is therefore a thread. Let's look at its run method next.

# The TaskLauncher's run method (key code)

    synchronized (tip) {
      // to make sure that there is no kill task action for this
      if (!tip.canBeLaunched()) {
        // got killed externally while still in the launcher queue
        LOG.info("Not launching task " + task.getTaskID() + " as it got"
          + " killed externally. Task's state is " + tip.getRunState());
        addFreeSlots(task.getNumSlotsRequired());
        continue;
      }
      tip.slotTaken = true;
    }
    // got a free slot. launch the task
    startNewTask(tip);

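Once a free slot is available, the task is run. The key point is that this is where the task (map or reduce) actually gets started.

At this point the TaskTracker's main work is basically done; what follows is just some housekeeping.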
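The launcher pattern described above (a dedicated thread that takes queued tasks and starts each one only when a slot is free) can be sketched with `java.util.concurrent`. Note the substitution: this sketch uses a `BlockingQueue` and a `Semaphore`, which is my choice for brevity and not necessarily how TaskLauncher is written. `LauncherSketch` and its string tasks are hypothetical, each task is assumed to need exactly one slot, and "launching" just records the task name.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;

// Slot-limited launcher thread; a simplified stand-in for the TaskLauncher idea.
class LauncherSketch extends Thread {
    private final BlockingQueue<String> tasksToLaunch = new LinkedBlockingQueue<>();
    private final Semaphore freeSlots;
    private final List<String> launched = new CopyOnWriteArrayList<>();

    LauncherSketch(int slots) {
        this.freeSlots = new Semaphore(slots);
        setDaemon(true); // do not keep the JVM alive for this sketch
    }

    void addToTaskQueue(String task) { tasksToLaunch.add(task); }

    // Called when a running task finishes and gives its slot back.
    void addFreeSlots(int slots) { freeSlots.release(slots); }

    List<String> launched() { return launched; }

    @Override
    public void run() {
        try {
            while (!isInterrupted()) {
                String task = tasksToLaunch.take(); // wait for a queued task
                freeSlots.acquire();                // wait for a free slot
                launched.add(task);                 // "start" the task
            }
        } catch (InterruptedException e) {
            // Interrupted while waiting: let the thread end.
        }
    }

    // Helper for demos: polls until `count` tasks were launched or the timeout passes.
    boolean awaitLaunched(int count, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (launched.size() < count && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return launched.size() >= count;
    }

    public static void main(String[] args) {
        LauncherSketch launcher = new LauncherSketch(1);
        launcher.start();
        launcher.addToTaskQueue("map_0");
        launcher.addToTaskQueue("map_1");
        launcher.awaitLaunched(1, 1000);
        System.out.println(launcher.launched()); // only one slot, so one task so far
        launcher.addFreeSlots(1);                // first task "finished"
        launcher.awaitLaunched(2, 1000);
        System.out.println(launcher.launched());
    }
}
```

With a single slot, the second task stays blocked until `addFreeSlots` is called, which mirrors how a finished task frees capacity for the next queued one.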

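# Tasks that report no progress within the reporting period are treated as failed and killed (markUnresponsiveTasks).

# Check whether this node's free disk space has reached a dangerous level (killOverflowingTasks).

The critical threshold for free disk space is set by mapred.local.dir.minspacekill. The default is 0, in which case nothing is done. If it is non-zero and the free space falls below it, the node stops accepting new tasks: acceptNewTasks is set to false until all tasks have finished and been cleaned up. One task is also killed. If several tasks are running, which one gets killed? The TaskTracker chooses by priority, and the task with the lowest priority is killed.

# Check whether the current node is idle. If it is idle and acceptNewTasks is false, acceptNewTasks is set back to true. Idleness is determined by tasks.isEmpty() && tasksToCleanup.isEmpty();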
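The disk-space check and the idle check described above can be sketched together. This follows the article's description rather than the Hadoop source: the lowest-priority victim rule is taken from the text, and the `Task` class, its integer `priority` field (smaller means lower priority), and the `DiskGuardSketch` name are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "stop accepting work and kill one task when disk is low",
// plus "accept work again once idle". All names are made up.
class DiskGuardSketch {

    static class Task {
        final String id;
        final int priority; // assumption: smaller number = lower priority
        Task(String id, int priority) { this.id = id; this.priority = priority; }
    }

    boolean acceptNewTasks = true;
    final List<Task> running = new ArrayList<>();

    // Returns the task that was killed, or null if nothing was killed.
    Task killOverflowingTasks(long freeSpaceBytes, long minSpaceKillBytes) {
        if (minSpaceKillBytes == 0 || freeSpaceBytes >= minSpaceKillBytes) {
            return null; // threshold disabled, or enough space left
        }
        acceptNewTasks = false;
        Task victim = null;
        for (Task t : running) {
            if (victim == null || t.priority < victim.priority) {
                victim = t; // keep the lowest-priority task seen so far
            }
        }
        if (victim != null) {
            running.remove(victim);
        }
        return victim;
    }

    // Re-enables task acceptance once nothing is running and nothing awaits cleanup.
    void checkIdle(boolean tasksToCleanupEmpty) {
        if (!acceptNewTasks && running.isEmpty() && tasksToCleanupEmpty) {
            acceptNewTasks = true;
        }
    }

    public static void main(String[] args) {
        DiskGuardSketch node = new DiskGuardSketch();
        node.running.add(new Task("task_a", 5));
        node.running.add(new Task("task_b", 1));
        Task killed = node.killOverflowingTasks(10L, 50L);
        System.out.println(killed.id + " " + node.acceptNewTasks); // prints task_b false
        node.running.clear();
        node.checkIdle(true);
        System.out.println(node.acceptNewTasks); // prints true
    }
}
```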

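# Summary: TaskTracker

The TaskTracker starts a Jetty server and an RPC server, then its run method is called. The core of run is offerService, which sends the heartbeat, asks for a new task, and receives a heartbeatResponse object containing task-related information (such as the task ID). It then obtains the work to be done via TaskTrackerAction[] actions = heartbeatResponse.getActions(); loops over the actions, and adds tasks to the task queue with addToTaskQueue((LaunchTaskAction)action).

That call places the action in the queue of a TaskLauncher, which is a thread class; its run method takes the action (the task) and gets it executed. Finally, some other housekeeping is done, such as checking whether the current node is idle.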

妳那伊抹微笑

The you smile until forever 、、、、、、、、、、、、、、、、、、、、、