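Outline
1) The Mapper base class in MapReduce, which is the parent class of every custom Mapper.
2) The Reducer base class in MapReduce, which is the parent class of every custom Reducer.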
1. The Mapper class
API documentation
1) InputSplit (the input split) and InputFormat (the input format)
2) Sorting and grouping the Mapper output
3) Partitioning the Mapper output according to the number of Reducers
4) Applying a Combiner to the Mapper output
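The four points above can be sketched in plain Java without any Hadoop dependency. This is only an illustrative simulation, not Hadoop's actual implementation: the class name MapSideSketch and its methods map, partition, and run are hypothetical, and word counting is assumed as the job. The only detail borrowed from Hadoop is the partition formula, which mirrors what the default HashPartitioner computes. In this sketch each record is assigned to a partition first, and a TreeMap then keeps the keys of each partition sorted and grouped while summing counts the way a Combiner would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapSideSketch {

    // Map step: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    // Partition step: choose a reducer index from the key's hash.
    // Same formula as Hadoop's default HashPartitioner.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Runs map, then partitions each pair; inside a partition the TreeMap
    // keeps keys sorted and grouped, and merge() sums counts per key,
    // which plays the role of a Combiner here.
    static List<TreeMap<String, Integer>> run(List<String> lines, int numReducers) {
        List<TreeMap<String, Integer>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                int p = partition(kv.getKey(), numReducers);
                partitions.get(p).merge(kv.getKey(), kv.getValue(), Integer::sum);
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Each element of the list stands in for one record of an input split.
        List<TreeMap<String, Integer>> result = run(List.of("a b a", "b c"), 2);
        for (int i = 0; i < result.size(); i++) {
            System.out.println("partition " + i + ": " + result.get(i));
        }
    }
}
```

In a real job these steps are carried out by the framework around the Mapper rather than by user code; the sketch only shows what kind of data each step produces.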
- Description of the Mapper class in the official Hadoop documentation:
Maps input key/value pairs to a set of intermediate key/value pairs.
Maps are the individual tasks which transform input records into a intermediate records. The transformed intermediate records need not be of the same type as the input records. A given input pair may map to