1. From the data-flow perspective:
Input --> InputFormat --> Mapper --> Shuffle --> Reducer --> OutputFormat --> Output
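The flow above can be sketched in memory as a small word count. This is only an illustration under stated assumptions, not Hadoop code: the class MiniMapReduce and its methods map, shuffle, reduce, and run are hypothetical names, a list of strings stands in for Input/InputFormat, and the returned map stands in for OutputFormat/Output.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory word count that mirrors the data flow; not Hadoop code.
class MiniMapReduce {

    // Mapper: emit one (word, 1) pair for every token in the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // Shuffle: group the values by key. The TreeMap keeps keys sorted,
    // standing in for the sort and merge work of a real shuffle.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reducer: sum all values that arrived for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Input --> Mapper --> Shuffle --> Reducer --> Output
    static TreeMap<String, Integer> run(List<String> input) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : input) {
            mapped.addAll(map(line));
        }
        TreeMap<String, Integer> output = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            output.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a b a", "b c"))); // prints {a=2, b=2, c=1}
    }
}
```

In a real job the shuffle is not a single method call: its work is split between the map side and the reduce side.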
2. From the perspective of stages:
Map --> Reduce
MapTask --> ReduceTask
MapTask --> Shuffle (the second half of MapTask plus the first half of ReduceTask) --> ReduceTask
3. From the source-code perspective: map --> sort --> copy --> sort --> reduce
MapTask: map --> sort
mapPhase = getProgress().addPhase("map", 0.667f);
sortPhase = getProgress().addPhase("sort", 0.333f);
ReduceTask: copy --> sort --> reduce
copyPhase = getProgress().addPhase("copy");
sortPhase = getProgress().addPhase("sort");
reducePhase = getProgress().addPhase("reduce");
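
The addPhase calls above register the phases of each task, and the float arguments in MapTask are each phase's share of the task's overall progress. A minimal sketch of how such weights could combine into a single progress value follows. PhaseProgress is a hypothetical stand-in, not Hadoop's Progress class, and it assumes that phases added without a weight (as in ReduceTask) share the progress equally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a task progress tracker; not Hadoop's Progress class.
class PhaseProgress {
    private final List<String> names = new ArrayList<>();
    private final List<Float> weights = new ArrayList<>();
    private final List<Float> completed = new ArrayList<>();

    // Register a phase with an explicit share of the total; returns its index.
    int addPhase(String name, float weight) {
        names.add(name);
        weights.add(weight);
        completed.add(0f);
        return names.size() - 1;
    }

    // Register a phase without a weight. Assumption: every phase registered
    // so far then gets an equal share of the total.
    int addPhase(String name) {
        int index = addPhase(name, 0f);
        float equalShare = 1.0f / weights.size();
        for (int i = 0; i < weights.size(); i++) {
            weights.set(i, equalShare);
        }
        return index;
    }

    // Record how far one phase has come, as a fraction from 0 to 1.
    void set(int phase, float fraction) {
        completed.set(phase, fraction);
    }

    // Overall progress: the weighted sum of the per-phase fractions.
    float get() {
        float total = 0f;
        for (int i = 0; i < weights.size(); i++) {
            total += weights.get(i) * completed.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        PhaseProgress mapTask = new PhaseProgress();
        int map = mapTask.addPhase("map", 0.667f);
        mapTask.addPhase("sort", 0.333f);
        mapTask.set(map, 1.0f);            // map finished, sort not started
        System.out.println(mapTask.get());

        PhaseProgress reduceTask = new PhaseProgress();
        int copy = reduceTask.addPhase("copy");
        int sort = reduceTask.addPhase("sort");
        reduceTask.addPhase("reduce");
        reduceTask.set(copy, 1.0f);        // copy finished
        reduceTask.set(sort, 0.5f);        // sort half done
        System.out.println(reduceTask.get());
    }
}
```

Under these assumptions, a MapTask that has finished its map phase but not started sorting reports about 0.667, and a ReduceTask that has finished copying and is halfway through sorting reports about 0.5.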