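An example MapReduce computation follows.
Data sorting (sortDemo):
Merge and sort the records in sortfile1.txt, sortfile2.txt, and sortfile3.txt, then write them to a single output file with a line number for each record. Write a MapReduce program that does this.
Analysis: the job relies on the key sort that MapReduce performs during the shuffle, so it must go through a shuffle and therefore must implement a reducer.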
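1.Write the mapper
Convert <k1,v1> (line offset, line content) --> <k2,v2> (line value, 1)
2.Write the reducer
Receive <k2,[v2,v2,v2...]> from the mapper side; at this point the data is already sorted by key.
Iterate over values and, once per value, call context.write(linesum,k2); linesum++;
3.Write the driver
Omitted here.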
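The three steps above can be tried out without a cluster. The following is a minimal in-memory sketch, not Hadoop code: the class name SortFlowSketch and the method run are hypothetical, and a TreeMap stands in for the framework's shuffle sort, assuming every input line holds one integer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the map -> shuffle -> reduce flow.
public class SortFlowSketch {
    static List<String> run(List<String> lines) {
        // "Map" and "shuffle": emit (value, 1) and group by key.
        // TreeMap iterates keys in ascending order, imitating the shuffle sort.
        TreeMap<Long, List<Integer>> shuffled = new TreeMap<>();
        for (String line : lines) {
            long k2 = Long.parseLong(line.trim());
            shuffled.computeIfAbsent(k2, k -> new ArrayList<>()).add(1);
        }
        // "Reduce": one output record per value, so duplicates are kept.
        List<String> out = new ArrayList<>();
        long lineNumber = 1;
        for (Map.Entry<Long, List<Integer>> entry : shuffled.entrySet()) {
            for (int ignored : entry.getValue()) {
                out.add(lineNumber + "\t" + entry.getKey());
                lineNumber++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints five records: the values 2, 3, 3, 5, 45 numbered 1 to 5.
        run(List.of("5", "3", "45", "3", "2")).forEach(System.out::println);
    }
}
```

Looping over the grouped values instead of writing each key once is what keeps repeated numbers in the output.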
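1.Mapper.class
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SortMapper extends Mapper<LongWritable,Text,LongWritable,LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each line holds one integer; skip blank lines rather than fail on them.
        String line = value.toString().trim();
        if (line.isEmpty()) return;
        // Long.parseLong matches the LongWritable key; the shuffle then sorts by this key.
        context.write(new LongWritable(Long.parseLong(line)), new LongWritable(1));
    }
}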
Here <k1,v1> is <line offset, line content (as Text)>;
<k2,v2> is <line value as LongWritable, 1>.
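The type chosen for k2 affects the sort order: LongWritable keys compare numerically, whereas Text keys compare byte by byte, which would place 123 before 45. The sketch below illustrates the difference in plain Java under that assumption. The class KeyOrderSketch and its methods are hypothetical, and String ordering is used as a stand-in for Text ordering, which matches for ASCII digits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo: text ordering versus numeric ordering of the same values.
public class KeyOrderSketch {
    // Sorts the values as strings, i.e. character by character.
    static List<String> sortAsText(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    // Parses the values first, then sorts them as numbers.
    static List<Long> sortAsNumbers(List<String> values) {
        List<Long> numbers = new ArrayList<>();
        for (String v : values) {
            numbers.add(Long.parseLong(v.trim()));
        }
        Collections.sort(numbers);
        return numbers;
    }

    public static void main(String[] args) {
        List<String> values = List.of("7", "45", "123", "5");
        System.out.println(sortAsText(values));    // [123, 45, 5, 7]
        System.out.println(sortAsNumbers(values)); // [5, 7, 45, 123]
    }
}
```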
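2.Reducer.class
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class SortReducer extends Reducer<LongWritable,LongWritable,LongWritable,LongWritable> {
    // Running line number; it is a global counter only when the job runs a single reducer.
    private final LongWritable linesum = new LongWritable(1);
    @Override
    protected void reduce(LongWritable key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        // One output record per value, so duplicate numbers each get their own line number.
        for (LongWritable v : values) {
            context.write(linesum, key);
            linesum.set(linesum.get() + 1);
        }
    }
}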
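3.Driver.class (main class)
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SortDriver {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(SortDriver.class);
        job.setJobName("mysort");
        job.setMapperClass(SortMapper.class);   // parses each input line into a numeric key
        job.setReducerClass(SortReducer.class); // writes <line number, value> in sorted order
        job.setNumReduceTasks(1); // a single reducer keeps the order global and the line numbers continuous
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(LongWritable.class);
        // The mysort directory holds the three sortfile*.txt files; results are written to outsort.
        FileInputFormat.addInputPath(job, new Path("file:///D:/mysort"));
        FileOutputFormat.setOutputPath(job, new Path("file:///D:/outsort"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}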
4.Output:
1 2 (left column: line number; right column: the sorted value)
2 3
3 3
4 3
5 3
6 4
7 4
8 4
9 5
10 5
11 5
12 5
13 6
14 7
15 45
16 56
17 56
18 67
19 67
20 67
21 76
22 78
23 78
24 88
25 98
26 123
27 345
28 690
29 988
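A result like the one above can be checked mechanically. The following is a small sketch of such a check, assuming each output line holds a line number and a value separated by whitespace; the class SortOutputCheck and the method isValid are hypothetical helpers, not part of the job itself.

```java
import java.util.List;

// Hypothetical checker for "<line number> <value>" output lines.
public class SortOutputCheck {
    // Returns true if line numbers run 1..n and the values never decrease.
    static boolean isValid(List<String> lines) {
        long expectedLineNumber = 1;
        long previous = Long.MIN_VALUE;
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length != 2) {
                return false;
            }
            long lineNumber = Long.parseLong(parts[0]);
            long value = Long.parseLong(parts[1]);
            if (lineNumber != expectedLineNumber || value < previous) {
                return false;
            }
            expectedLineNumber++;
            previous = value;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(List.of("1 2", "2 3", "3 3", "4 4"))); // true
        System.out.println(isValid(List.of("1 5", "2 3")));               // false
    }
}
```

Splitting on whitespace lets the same check accept both space-separated and tab-separated records.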