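I've recently been reading the TaskTracker (TT) code, where TaskLauncher is the core piece. Along the way I ran into a rather interesting snippet: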
private synchronized void releaseSlot() {
  if (slotTaken) {
    if (launcher != null) {
      launcher.addFreeSlots(task.getNumSlotsRequired());
    }
    slotTaken = false;
  } else {
    // wake up the launcher. it may be waiting to block slots for this task.
    if (launcher != null) {
      launcher.notifySlots();
    }
  }
}
The launcher's notifySlots function:
void notifySlots() {
  synchronized (numFreeSlots) {
    numFreeSlots.notifyAll();
  }
}
The run() method of TaskLauncher looks like this:
  public void run() {
    while (!Thread.interrupted()) {
      try {
        TaskInProgress tip;
        Task task;
        synchronized (tasksToLaunch) {
          while (tasksToLaunch.isEmpty()) {
            tasksToLaunch.wait();
          }
          //get the TIP
          tip = tasksToLaunch.remove(0);
          task = tip.getTask();
          LOG.info("Trying to launch : " + tip.getTask().getTaskID() +
                   " which needs " + task.getNumSlotsRequired() + " slots");
        }
        //wait for free slots to run
        synchronized (numFreeSlots) {
          boolean canLaunch = true;
          while (numFreeSlots.get() < task.getNumSlotsRequired()) {
            //Make sure that there is no kill task action for this task!
            //We are not locking tip here, because it would reverse the
            //locking order!
            //Also, Lock for the tip is not required here! because :
            // 1. runState of TaskStatus is volatile
            // 2. Any notification is not missed because notification is
            // synchronized on numFreeSlots. So, while we are doing the check,
            // if the tip is half way through the kill(), we don't miss
            // notification for the following wait().
            if (!tip.canBeLaunched()) {
              //got killed externally while still in the launcher queue
              LOG.info("Not blocking slots for " + task.getTaskID()
                       + " as it got killed externally. Task's state is "
                       + tip.getRunState());
              canLaunch = false;
              break;
            }
            LOG.info("TaskLauncher : Waiting for " + task.getNumSlotsRequired() +
                     " to launch " + task.getTaskID() + ", currently we have " +
                     numFreeSlots.get() + " free slots");
            numFreeSlots.wait();
          }
          if (!canLaunch) {
            continue;
          }
          LOG.info("In TaskLauncher, current free slots : " + numFreeSlots.get()+
                   " and trying to launch "+tip.getTask().getTaskID() +
                   " which needs " + task.getNumSlotsRequired() + " slots");
          numFreeSlots.set(numFreeSlots.get() - task.getNumSlotsRequired());
          assert (numFreeSlots.get() >= 0);
        }
        synchronized (tip) {
          //to make sure that there is no kill task action for this
          if (!tip.canBeLaunched()) {
            //got killed externally while still in the launcher queue
            LOG.info("Not launching task " + task.getTaskID() + " as it got"
                     + " killed externally. Task's state is " + tip.getRunState());
            addFreeSlots(task.getNumSlotsRequired());
            continue;
          }
          tip.slotTaken = true;
        }
        //got a free slot. launch the task
        startNewTask(tip);
      } catch (InterruptedException e) {
        return; // ALL DONE
      } catch (Throwable th) {
        LOG.error("TaskLauncher error " +
                  StringUtils.stringifyException(th));
      }
    }
  }
}
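The comment inside the wait loop claims that a kill can never be missed, because the wake-up is issued while holding the numFreeSlots monitor. To convince myself, here is a minimal standalone sketch of that handshake. It is not Hadoop code: SlotPool, SlotPoolDemo, acquire, free and the killed flag are names I made up, and a plain Object stands in for numFreeSlots.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the launcher's slot bookkeeping; not Hadoop code. */
class SlotPool {
    private final Object lock = new Object(); // plays the role of numFreeSlots as a monitor
    private int freeSlots;

    SlotPool(int freeSlots) { this.freeSlots = freeSlots; }

    /** Blocks until `needed` slots are free; returns false if the task was killed while waiting. */
    boolean acquire(int needed, AtomicBoolean killed) throws InterruptedException {
        synchronized (lock) {
            while (freeSlots < needed) {
                if (killed.get()) {
                    return false;     // mirrors the canLaunch = false; break; path
                }
                lock.wait();          // releases the monitor while parked
            }
            freeSlots -= needed;
            return true;
        }
    }

    /** Returns slots to the pool and wakes any waiter so it re-checks the count. */
    void addFreeSlots(int n) {
        synchronized (lock) {
            freeSlots += n;
            lock.notifyAll();
        }
    }

    /** Wake-up without freeing anything, so a waiter can notice a kill. */
    void notifySlots() {
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    int free() {
        synchronized (lock) {
            return freeSlots;
        }
    }
}

class SlotPoolDemo {
    /** Starts a waiter on an empty pool, "kills" its task, and reports what acquire() returned. */
    static boolean acquireAfterKill() {
        SlotPool pool = new SlotPool(0);
        AtomicBoolean killed = new AtomicBoolean(false);
        AtomicBoolean got = new AtomicBoolean(true);
        Thread launcher = new Thread(() -> {
            try {
                got.set(pool.acquire(1, killed));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        launcher.start();
        try {
            killed.set(true);    // like kill() changing the run state first
            pool.notifySlots();  // like releaseSlot() taking its else branch
            launcher.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return got.get();
    }

    public static void main(String[] args) {
        System.out.println("acquired after kill: " + acquireAfterKill());
    }
}
```

Whichever way the two threads interleave, the waiter either sees the flag before it parks or is already parked when notifySlots() gets the monitor, so join() always returns and acquire() reports false. That is the same argument the original comment makes about kill() and the following wait().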
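What puzzles me about notifySlots is this: judging from the TaskTracker constructor, there is exactly one mapLauncher and exactly one reduceLauncher. In other words, only one thread ever waits (sleeps) on each launcher's numFreeSlots. So why call notifyAll() instead of notify()? Truly baffling!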
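One possible reading, which I have not confirmed against the Hadoop history: with a single waiter, notify() and notifyAll() wake exactly the same thread, so notifyAll() costs nothing here and simply stays correct if a second waiter is ever added. The sketch below is my own test harness, not TaskTracker code (WakeDemo and wokenBy are made-up names). It parks a given number of threads on one monitor, issues a single notify() or notifyAll(), and counts how many threads got through.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical harness comparing notify() and notifyAll(); not Hadoop code. */
class WakeDemo {
    /** Parks `waiters` threads on one monitor and counts how many a single wake-up call releases. */
    static int wokenBy(int waiters, boolean useNotifyAll) {
        final Object lock = new Object();
        final CountDownLatch parked = new CountDownLatch(waiters);
        final AtomicInteger woken = new AtomicInteger();
        final boolean[] go = {false};          // guarded by lock
        Thread[] threads = new Thread[waiters];
        for (int i = 0; i < waiters; i++) {
            threads[i] = new Thread(() -> {
                synchronized (lock) {
                    parked.countDown();        // still holding the lock here
                    while (!go[0]) {
                        try {
                            lock.wait();       // gives up the lock atomically
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
                woken.incrementAndGet();
            });
            threads[i].start();
        }
        try {
            parked.await();
            // Once we own the lock, every waiter must already be inside wait().
            synchronized (lock) {
                go[0] = true;
                if (useNotifyAll) {
                    lock.notifyAll();
                } else {
                    lock.notify();
                }
            }
            Thread.sleep(300);                 // give woken threads time to finish
            int result = woken.get();
            synchronized (lock) {
                lock.notifyAll();              // cleanup: release any thread still parked
            }
            for (Thread t : threads) {
                t.join();
            }
            return result;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("1 waiter,  notify:    " + wokenBy(1, false));
        System.out.println("1 waiter,  notifyAll: " + wokenBy(1, true));
        System.out.println("3 waiters, notify:    " + wokenBy(3, false));
        System.out.println("3 waiters, notifyAll: " + wokenBy(3, true));
    }
}
```

With one waiter the two calls should be indistinguishable. The difference only shows up with several waiters, where a lone notify() releases a single thread and leaves the rest parked. The count relies on a short sleep and ignores the (rare) spurious wake-ups the JVM is allowed to produce, so treat it as an illustration rather than a proof.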
A side gripe: CSDN's blog editor is really painful to use...