In the previous posts we walked through the source of the map side's input and output, so the flow of a map task should by now be familiar. Here we analyze the input side of reduce; the output goes straight to HDFS, so we will not dwell on it.
A reduce-side task comes in four varieties: Job-setup Task, Job-cleanup Task, Task-cleanup Task, and the ordinary Reduce Task. This post analyzes the last one, the ordinary Reduce Task.
Reduce Task Source Code Analysis
Environment: Hadoop 2.7.2, source browsed in IntelliJ IDEA.
Let's go straight to the run method of ReduceTask:
public void run(JobConf job, final TaskUmbilicalProtocol umbilical)
throws IOException, InterruptedException, ClassNotFoundException {
job.setBoolean(JobContext.SKIP_RECORDS, isSkipping());
//skip these preliminaries for now
if (isMapOrReduce()) {
copyPhase = getProgress().addPhase("copy");
sortPhase = getProgress().addPhase("sort");
reducePhase = getProgress().addPhase("reduce");
}
// start thread that will handle communication with parent
TaskReporter reporter = startReporter(umbilical);
boolean useNewApi = job.getUseNewReducer();
initialize(job, getJobID(), reporter, useNewApi);
// check if it is a cleanupJobTask
if (jobCleanup) {
runJobCleanupTask(umbilical, reporter);
return;
}
if (jobSetup) {
runJobSetupTask(umbilical, reporter);
return;
}
if (taskCleanup) {
runTaskCleanupTask(umbilical, reporter);
return;
}
// Initialize the codec
codec = initCodec();
RawKeyValueIterator rIter = null;
//the plugin that performs the shuffle fetch
ShuffleConsumerPlugin shuffleConsumerPlugin = null;
Class combinerClass = conf.getCombinerClass();
CombineOutputCollector combineCollector =
(null != combinerClass) ?
new CombineOutputCollector(reduceCombineOutputCounter, reporter, conf) : null;
Class<? extends ShuffleConsumerPlugin> clazz =
job.getClass(MRConfig.SHUFFLE_CONSUMER_PLUGIN, Shuffle.class, ShuffleConsumerPlugin.class);
shuffleConsumerPlugin = ReflectionUtils.newInstance(clazz, job);
LOG.info("Using ShuffleConsumerPlugin: " + shuffleConsumerPlugin);
ShuffleConsumerPlugin.Context shuffleContext =
new ShuffleConsumerPlugin.Context(getTaskID(), job, FileSystem.getLocal(job), umbilical,
super.lDirAlloc, reporter, codec,
combinerClass, combineCollector,
spilledRecordsCounter, reduceCombineInputCounter,
shuffledMapsCounter,
reduceShuffleBytes, failedShuffleCounter,
mergedMapOutputsCounter,
taskStatus, copyPhase, sortPhase, this,
mapOutputFile, localMapFiles);
//Note: init() only initializes the plugin here; the actual pull of the map outputs
//happens in run() below. (The serving side of the shuffle, the ShuffleHandler,
//runs as an auxiliary service inside the NodeManager.)
shuffleConsumerPlugin.init(shuffleContext);
//run() returns an iterator holding the full data set this reduce pulled from the maps
//(this is the "real" iterator)
rIter = shuffleConsumerPlugin.run();
// free up the data structures
mapOutputFilesOnDisk.clear();
sortPhase.complete(); // sort is complete
setPhase(TaskStatus.Phase.REDUCE);
statusUpdate(umbilical);
Class keyClass = job.getMapOutputKeyClass();
Class valueClass = job.getMapOutputValueClass();
//Get the grouping comparator. (The map-phase comparator serves the quick sort,
//i.e. it is the sort comparator.) Let's step inside this method.
RawComparator comparator = job.getOutputValueGroupingComparator();
if (useNewApi) {
runNewReducer(job, umbilical, reporter, rIter, comparator,
keyClass, valueClass);
} else {
runOldReducer(job, umbilical, reporter, rIter, comparator,
keyClass, valueClass);
}
shuffleConsumerPlugin.close();
done(umbilical, reporter);
}
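As an aside, the getClass call above shows that the shuffle fetch is pluggable: Shuffle.class is merely the default bound to MRConfig.SHUFFLE_CONSUMER_PLUGIN. A sketch of wiring in a replacement (MyShufflePlugin is a hypothetical class implementing ShuffleConsumerPlugin):

JobConf job = new JobConf();
// Bind a custom ShuffleConsumerPlugin; when nothing is configured,
// Shuffle.class is used, exactly as the getClass default above shows.
job.setClass(MRConfig.SHUFFLE_CONSUMER_PLUGIN,
    MyShufflePlugin.class, ShuffleConsumerPlugin.class);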
Let's step into the grouping comparator lookup, getOutputValueGroupingComparator:
public RawComparator getOutputValueGroupingComparator() {
Class<? extends RawComparator> theClass = getClass(
//if the user configured a grouping comparator, use that one
JobContext.GROUP_COMPARATOR_CLASS, null, RawComparator.class);
if (theClass == null) {
// let's look inside and see what the default grouping comparator is
return getOutputKeyComparator();
}
//if one was configured, instantiate it directly via reflection
return ReflectionUtils.newInstance(theClass, this);
}
Step into getOutputKeyComparator to see exactly which comparator grouping falls back to:
public RawComparator getOutputKeyComparator() {
Class<? extends RawComparator> theClass = getClass(
JobContext.KEY_COMPARATOR, null, RawComparator.class);
if (theClass != null)
//if the user configured one, instantiate it via reflection
return ReflectionUtils.newInstance(theClass, this);
return
// Otherwise fall back to the WritableComparator for the map output key class,
// i.e. the sort comparator of the map phase. So if the user sets no grouping
// comparator, reduce-side grouping uses the map-side sort comparator.
WritableComparator.get(getMapOutputKeyClass().asSubclass(WritableComparable.class), this);
}
Back in run, let's continue into runNewReducer:
private <INKEY,INVALUE,OUTKEY,OUTVALUE>
void runNewReducer(JobConf job,
final TaskUmbilicalProtocol umbilical,
final TaskReporter reporter,
//the real iterator
RawKeyValueIterator rIter,
//the comparator
RawComparator<INKEY> comparator,
Class<INKEY> keyClass,
Class<INVALUE> valueClass
) throws IOException,InterruptedException,
ClassNotFoundException {
// wrap value iterator to report progress.
final RawKeyValueIterator rawIter = rIter;
// wrap the real iterator so that progress is reported as it advances
rIter = new RawKeyValueIterator() {
public void close() throws IOException {
rawIter.close();
}
public DataInputBuffer getKey() throws IOException {
return rawIter.getKey();
}
public Progress getProgress() {
return rawIter.getProgress();
}
public DataInputBuffer getValue() throws IOException {
return rawIter.getValue();
}
public boolean next() throws IOException {
boolean ret = rawIter.next();
reporter.setProgress(rawIter.getProgress().getProgress());
return ret;
}
};
// make a task context so we can get the classes
// prepare the task context
org.apache.hadoop.mapreduce.TaskAttemptContext taskContext =
new org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl(job,
getTaskID(), reporter);
// make a reducer
// instantiate the configured Reducer via reflection
org.apache.hadoop.mapreduce.Reducer<INKEY,INVALUE,OUTKEY,OUTVALUE> reducer =
(org.apache.hadoop.mapreduce.Reducer<INKEY,INVALUE,OUTKEY,OUTVALUE>)
ReflectionUtils.newInstance(taskContext.getReducerClass(), job);
//we'll skip the reduce output path for now
org.apache.hadoop.mapreduce.RecordWriter<OUTKEY,OUTVALUE> trackedRW =
new NewTrackingRecordWriter<OUTKEY, OUTVALUE>(this, taskContext);
job.setBoolean("mapred.skip.on", isSkipping());
job.setBoolean(JobContext.SKIP_RECORDS, isSkipping());
//Create the reduce context, passing in the real iterator and the comparator;
//let's look at its implementation class
org.apache.hadoop.mapreduce.Reducer.Context
reducerContext = createReduceContext(reducer, job, getTaskID(),
rIter, reduceInputKeyCounter,
reduceInputValueCounter,
trackedRW,
committer,
reporter, comparator, keyClass,
valueClass);
try {
reducer.run(reducerContext);
} finally {
trackedRW.close(reducerContext);
}
}
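Before stepping into the context, it helps to recall where reducer.run(reducerContext) leads. The new-API Reducer.run is a simple loop over nextKey(); it is the driver for everything analyzed below (quoted from Reducer.java of this version, modulo formatting):

public void run(Context context) throws IOException, InterruptedException {
  setup(context);
  try {
    while (context.nextKey()) {
      reduce(context.getCurrentKey(), context.getValues(), context);
      // If a backup store is used (mark/reset support), reset it for the next group
      Iterator<VALUEIN> iter = context.getValues().iterator();
      if (iter instanceof ReduceContext.ValueIterator) {
        ((ReduceContext.ValueIterator<VALUEIN>) iter).resetBackupStore();
      }
    }
  } finally {
    cleanup(context);
  }
}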
Next, let's see how the reducerContext is built, in its implementation class ReduceContextImpl:
public ReduceContextImpl(Configuration conf, TaskAttemptID taskid,
//the real iterator arrives here under the parameter name "input"
RawKeyValueIterator input,
Counter inputKeyCounter,
Counter inputValueCounter,
RecordWriter<KEYOUT,VALUEOUT> output,
OutputCommitter committer,
StatusReporter reporter,
//the comparator
RawComparator<KEYIN> comparator,
Class<KEYIN> keyClass,
Class<VALUEIN> valueClass
) throws InterruptedException, IOException{
super(conf, taskid, output, committer, reporter);
// From here on the real iterator is referenced as input and used throughout this class.
// The method reduce code calls most often is nextKey(); we'll look inside it shortly.
this.input = input;
this.inputKeyCounter = inputKeyCounter;
this.inputValueCounter = inputValueCounter;
//assign the comparator
this.comparator = comparator;
//Prepare the deserialization machinery: the map outputs are serialized byte
//streams, so after the reduce pulls them they must be deserialized.
this.serializationFactory = new SerializationFactory(conf);
this.keyDeserializer = serializationFactory.getDeserializer(keyClass);
this.keyDeserializer.open(buffer);
this.valueDeserializer = serializationFactory.getDeserializer(valueClass);
this.valueDeserializer.open(buffer);
//prime hasMore: does the input iterator hold a record to read? (returns boolean)
hasMore = input.next();
//keep references to the remaining configuration
this.keyClass = keyClass;
this.valueClass = valueClass;
this.conf = conf;
this.taskid = taskid;
}
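To make the keyDeserializer/buffer pairing above concrete, here is a minimal standalone round trip through the same Hadoop serialization machinery (a sketch, not ReduceTask code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.serializer.Deserializer;
import org.apache.hadoop.io.serializer.SerializationFactory;
import org.apache.hadoop.io.serializer.Serializer;

public class SerializationRoundTrip {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    SerializationFactory factory = new SerializationFactory(conf);

    // Serialize a Text key into a byte buffer, as the map side would
    Serializer<Text> ser = factory.getSerializer(Text.class);
    DataOutputBuffer out = new DataOutputBuffer();
    ser.open(out);
    ser.serialize(new Text("hello"));

    // Mirror keyDeserializer.open(buffer): feed the raw bytes back in
    Deserializer<Text> deser = factory.getDeserializer(Text.class);
    DataInputBuffer in = new DataInputBuffer();
    deser.open(in);
    in.reset(out.getData(), 0, out.getLength());
    // deserialize(null) allocates a fresh object; passing an existing one
    // (as nextKeyValue does with `key`) lets Writable reuse it
    Text key = deser.deserialize(null);
    System.out.println(key); // prints: hello
  }
}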
Let's step into nextKey. A quick recap first: on the reduce side, once the task has started, the data has been pulled, and the real iterator is ready, the framework invokes Reducer.run, which immediately starts the while loop over nextKey():
public boolean nextKey() throws IOException,InterruptedException {
// nextKeyIsSame ("does the next key belong to my group?") starts out false,
// and hasMore starts out true, so the first call skips this loop
while (hasMore && nextKeyIsSame) {
nextKeyValue();
}
//if there is still a record to read
if (hasMore) {
//counter bookkeeping
if (inputKeyCounter != null) {
inputKeyCounter.increment(1);
}
//Ultimately this returns nextKeyValue(); note that this nextKeyValue differs
//from the map-phase implementation, so let's step into it
return nextKeyValue();
} else {
return false;
}
}
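Note the while (hasMore && nextKeyIsSame) loop at the top: it exists because a reduce() call may return without draining its value iterator, in which case nextKey() fast-forwards past the rest of that group before positioning on the next key. A small illustration, assuming IntWritable values:

@Override
protected void reduce(Text key, Iterable<IntWritable> values, Context context)
    throws IOException, InterruptedException {
  Iterator<IntWritable> it = values.iterator();
  if (it.hasNext()) {
    context.write(key, it.next()); // take only the first value of the group
  }
  // The values we never touched are skipped by nextKey()'s
  // while (hasMore && nextKeyIsSame) nextKeyValue(); loop.
}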
Let's step into nextKeyValue and see how it is implemented:
public boolean nextKeyValue() throws IOException, InterruptedException {
//If the input is exhausted, clear the key/value and return false.
if (!hasMore) {
key = null;
value = null;
return false;
}
//nextKeyIsSame was defined as false, so negating it makes firstValue true on the first record
firstValue = !nextKeyIsSame;
//grab the raw key bytes
DataInputBuffer nextKey = input.getKey();
currentRawKey.set(nextKey.getData(), nextKey.getPosition(),
nextKey.getLength() - nextKey.getPosition());
buffer.reset(currentRawKey.getBytes(), 0, currentRawKey.getLength());
//deserialize the key and assign it
key = keyDeserializer.deserialize(key);
DataInputBuffer nextVal = input.getValue();
buffer.reset(nextVal.getData(), nextVal.getPosition(), nextVal.getLength()
- nextVal.getPosition());
//deserialize the value and assign it; much like the map side, this just fills in k/v
value = valueDeserializer.deserialize(value);
currentKeyLength = nextKey.getLength() - nextKey.getPosition();
currentValueLength = nextVal.getLength() - nextVal.getPosition();
if (isMarked) {
backupStore.write(nextKey, nextVal);
}
//update hasMore: beyond the record we just read, is there another one?
hasMore = input.next();
//if there is a next record
if (hasMore) {
//pull out its key
nextKey = input.getKey();
// The grouping comparator: the first three arguments describe the record just
// read, the last three describe nextKey, the following record.
// compare(...) == 0 means the two keys are equal, so nextKeyIsSame becomes true.
// This is the look-ahead on the next record.
nextKeyIsSame = comparator.compare(currentRawKey.getBytes(), 0,
currentRawKey.getLength(),
nextKey.getData(),
nextKey.getPosition(),
nextKey.getLength() - nextKey.getPosition()
) == 0;
} else {
nextKeyIsSame = false;
}
inputValueCounter.increment(1);
//a record was read successfully, so return true
return true;
}
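To see the flags in action, trace three sorted records (a,1) (a,2) (b,3) arriving at the reduce:

constructor:  hasMore = input.next()                          -> positioned at (a,1)
nextKey():    nextKeyValue(): key=a, value=1, look-ahead sees a -> nextKeyIsSame=true, firstValue=true
reduce(a):    next() returns 1 (firstValue consumed)
              hasNext()=true -> next(): nextKeyValue(): key=a, value=2,
                                look-ahead sees b              -> nextKeyIsSame=false
              hasNext()=false -> reduce(a, ...) returns
nextKey():    nextKeyValue(): key=b, value=3, no more records  -> hasMore=false
reduce(b):    next() returns 3 (firstValue), hasNext()=false   -> returns
nextKey():    hasMore=false -> returns false, Reducer.run ends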
So we now know that the nextKey call in the run loop does two main things: 1. it assigns our key and value; 2. it performs the look-ahead on the next record.
Once the key/value assignment succeeds, our reduce method is invoked. From Reducer.run we also know that when nextKey() returns true, getCurrentKey and getValues are called on the context. Let's look at getCurrentKey first:
//simply returns our key; no logic here
public KEYIN getCurrentKey() {
return key;
}
Now let's look at getValues:
public
Iterable<VALUEIN> getValues() throws IOException, InterruptedException {
return iterable;
}
This returns an Iterable of type ValueIterable; stepping inside, we find that it hands out a ValueIterator, so let's go into ValueIterator:
//this is the iterator that ultimately does the work
protected class ValueIterator implements ReduceContext.ValueIterator<VALUEIN> {
private boolean inReset = false;
private boolean clearMarkFlag = false;
//is there another value in the current group?
@Override
public boolean hasNext() {
try {
if (inReset && backupStore.hasNext()) {
return true;
}
} catch (Exception e) {
e.printStackTrace();
throw new RuntimeException("hasNext failed", e);
}
//Either this is the first value of the group, or the look-ahead said the next
//record has the same key. So hasNext() on the iterator returned by getValues()
//boils down to checking whether the next record still belongs to this group.
return firstValue || nextKeyIsSame;
}
//fetch the next value
@Override
public VALUEIN next() {
if (inReset) {
try {
if (backupStore.hasNext()) {
backupStore.next();
DataInputBuffer next = backupStore.nextValue();
buffer.reset(next.getData(), next.getPosition(), next.getLength()
- next.getPosition());
value = valueDeserializer.deserialize(value);
return value;
} else {
inReset = false;
backupStore.exitResetMode();
if (clearMarkFlag) {
clearMarkFlag = false;
isMarked = false;
}
}
} catch (IOException e) {
e.printStackTrace();
throw new RuntimeException("next value iterator failed", e);
}
}
// if this is the first record, we don't need to advance
if (firstValue) {
firstValue = false;
return value;
}
// if this isn't the first record and the next key is different, they
// can't advance it here.
if (!nextKeyIsSame) {
throw new NoSuchElementException("iterate past last value");
}
// otherwise, go to the next key/value pair
try {
//Advancing calls nextKeyValue, the method analyzed above, which reads from
//input, the renamed rIter, i.e. the real iterator. So the iterator we receive
//inside reduce() is this "fake" iterator: it does no I/O itself and merely
//drives the real one.
nextKeyValue();
return value;
} catch (IOException ie) {
throw new RuntimeException("next value iterator failed", ie);
} catch (InterruptedException ie) {
// this is bad, but we can't modify the exception list of java.util
throw new RuntimeException("next value iterator interrupted", ie);
}
}
......
}
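Two practical consequences of this design are worth calling out before we wrap up. First, next() always deserializes into the same value instance (value = valueDeserializer.deserialize(value)), so the object handed to reduce is reused on every iteration and must be copied if you want to keep it. Second, getValues() hands back the same underlying ValueIterator each time, so a group's values can be traversed only once. A small illustration, assuming IntWritable values:

List<IntWritable> kept = new ArrayList<>();
for (IntWritable v : values) {
  // v is the same object on every iteration; copy before storing
  kept.add(new IntWritable(v.get()));
}
// A second for-each over `values` here would see nothing:
// the single underlying ValueIterator is already exhausted.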
At this point we have covered the core source of how the reduce iterator walks the data; we won't analyze the rest of the task source. Here is a short pseudo-code summary:
protected void reduce(Text key, Iterable<IntWritable> values, Context context)
    throws IOException, InterruptedException {
  Iterator<IntWritable> it = values.iterator(); // the fake iterator (ValueIterator)
  while (it.hasNext()) {  // hasNext() --> firstValue || nextKeyIsSame
    it.next();            // next() --> nextKeyValue() --> input (the real iterator)
  }
}
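Mapped onto a real reducer, the classic word-count sum looks like this; every step of its for-each loop is driven by the machinery above:

public static class SumReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {
  private final IntWritable result = new IntWritable();

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) { // hasNext(): firstValue || nextKeyIsSame
      sum += v.get();              // next(): nextKeyValue() on the real iterator
    }
    result.set(sum);
    context.write(key, result);
  }
}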
The real iterator, input, can keep iterating until the data is completely consumed. Our reduce-side iterator, by contrast, can only iterate over a single group, because of nextKeyIsSame, and nextKeyIsSame is updated by nextKeyValue.

To summarize: the reducer first drives the while loop whose condition is nextKey(). nextKey() calls nextKeyValue(), which assigns the key/value and then performs the look-ahead that sets nextKeyIsSame. reduce() is then invoked with the fake iterator passed in; it can work at all because the previous step has already assigned the values and done the look-ahead. Inside the reduce method's own loop, hasNext() is certainly true for the first record, so we enter the loop; each next() call fetches the value by calling nextKeyValue(), which in turn refreshes nextKeyIsSame, and from then on hasNext() is decided by nextKeyIsSame. While the following records belong to the same group, the fetch repeats and the iteration continues. When we reach the boundary of a group, the next record's key is necessarily different, so the nextKeyValue() call that fetched the last record already set the look-ahead to false; after that value is consumed, hasNext() returns false and the iteration stops. That is why this fake iterator can only traverse one group of data.
That completes our analysis of the core source of MapReduce's Reduce Task.
(The shuffle (copy) phase and the merge phase run in parallel: once the volume of remotely copied data crosses a threshold, merge threads are triggered to combine it. In MRv1 both phases were implemented by the ReduceCopier class, roughly 2,200 lines out of about 2,900 for the whole ReduceTask; in Hadoop 2.x the equivalent work is done by the Shuffle plugin together with its Fetcher threads and MergeManagerImpl. Either way it is a large body of code, so we will not analyze the data-fetching source here.)