A Simple Use of the Combiner in Hadoop (Reposted)

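The combine function merges the <key, value> pairs produced by one map function (several values for the same key) into a new <key2, value2> pair, which is then passed to the reduce function as input. It has the same form as the reduce function.
Example: sum the numbers in three files. Each number sits on its own line, because the code below parses every input line as a single integer.
file1: 1 2 3
file2: 4 5 6
file3: 7 8 9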

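import java.io.IOException;
import java.math.BigInteger;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

public class MyMapre06 {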
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, Text> {

        private Text word = new Text();
        private Text val = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
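            // Each input line is expected to hold exactly one integer.
            String bignum = value.toString().trim();
            if (bignum.isEmpty()) {
                return; // skip blank lines, which BigInteger cannot parse
            }

            // Emit every number under the same key "1" so they all meet in one reduce call.
            word.set("1");
            val.set(bignum);
            output.collect(word, val);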
        }

    }

    public static class Reduce extends MapReduceBase implements
            Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> values,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            BigInteger num = BigInteger.ZERO;
            Text v = new Text();

            // Sum all values that share this key.
            while (values.hasNext()) {
                num = num.add(new BigInteger(values.next().toString()));
            }

            v.set(num.toString());
            output.collect(key, v); // emit the final reduce result
        }
    }

    public static class Combiner extends MapReduceBase implements
            Reducer<Text, Text, Text, Text> {
        public void reduce(Text key, Iterator<Text> values,
                OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            BigInteger num = BigInteger.ZERO;
            Text v = new Text();

            // Sum all values that share this key within one map task's output.
            while (values.hasNext()) {
                num = num.add(new BigInteger(values.next().toString()));
            }

            v.set(num.toString());
            output.collect(key, v); // emit the partial sum, which becomes reduce input
        }
    }

    public static void main(String[] args) throws Exception {

        JobConf conf = new JobConf(MyMapre06.class);
        conf.setJobName("Sum");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Combiner.class); // register the combiner
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

With one map task per file, the Combiner function produces 6 for file1, 15 for file2, and 24 for file3.
After the Reduce function, the output has key 1 and value 45.
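
The arithmetic above can be checked without a cluster. The following is a minimal plain-Java sketch of the data flow only; it does not use the Hadoop API. The class name `CombinerSketch` and its helper methods are hypothetical, and it assumes each file is processed by its own map task, so that the combine step sees exactly one file's numbers.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration of map -> combine -> reduce; not Hadoop code.
class CombinerSketch {

    // Hypothetical stand-in for the three input files, one number per line.
    static List<List<String>> sampleFiles() {
        return Arrays.asList(
                Arrays.asList("1", "2", "3"),
                Arrays.asList("4", "5", "6"),
                Arrays.asList("7", "8", "9"));
    }

    // Same summing logic as the Reduce and Combiner classes above.
    static BigInteger sum(List<String> values) {
        BigInteger total = BigInteger.ZERO;
        for (String v : values) {
            total = total.add(new BigInteger(v.trim()));
        }
        return total;
    }

    // Combine step: assumes one map task per file, so each file yields one partial sum.
    static List<String> combinePerFile(List<List<String>> files) {
        List<String> partials = new ArrayList<>();
        for (List<String> lines : files) {
            partials.add(sum(lines).toString());
        }
        return partials;
    }

    public static void main(String[] args) {
        List<String> partials = combinePerFile(sampleFiles());
        System.out.println("after combine: " + partials);
        // The reduce step sums the partial sums, all of which share key "1".
        System.out.println("after reduce: key=1 value=" + sum(partials));
    }
}
```

The reduce step here receives three values instead of nine, which is the point of a combiner: less data travels from the map side to the reduce side. Summation tolerates this because adding partial sums gives the same total as adding the original numbers.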
