2. Sorting with MapReduce

1. Read the data from the source files

2. The map phase passes each line it reads to reduce, one line at a time

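import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SortMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    @Override
    protected void map(LongWritable key, Text value, Mapper<LongWritable, Text, Text, NullWritable>.Context context) throws IOException, InterruptedException {
        // Emit the whole line as the key; the shuffle sorts by key, so no value is needed.
        context.write(value, NullWritable.get());
    }
}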

3. Define the sort order

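import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

public class SortComparator extends WritableComparator {
    public SortComparator() {
        // true: create Text instances so that compare(WritableComparable, WritableComparable) is used
        super(Text.class, true);
    }

    @Override
    public int compare(WritableComparable a, WritableComparable b) {
        Text left = (Text) a;
        Text right = (Text) b;
        // Comparing right to left reverses the natural order, giving a descending sort.
        return right.compareTo(left);
    }
}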
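
To see what the reversed comparison does without running a job, here is a minimal plain-Java sketch. It assumes ASCII input, for which String.compareTo orders lines the same way as Text.compareTo (Text compares the UTF-8 bytes). The class name DescendingOrderSketch and the sample lines are hypothetical and not part of the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DescendingOrderSketch {
    // Mirrors SortComparator.compare: comparing right to left reverses the natural order.
    static final Comparator<String> DESCENDING = (left, right) -> right.compareTo(left);

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sortDescending(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        copy.sort(DESCENDING);
        return copy;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("banana", "apple", "cherry");
        System.out.println(sortDescending(lines)); // prints [cherry, banana, apple]
    }
}
```

Swapping the operands back to left.compareTo(right) would restore ascending order. Because the keys are compared as text, numeric lines sort lexicographically rather than by value.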

4. The reduce phase writes the data to the output file

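import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SortReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
    @Override
    protected void reduce(Text key, Iterable<NullWritable> values, Reducer<Text, NullWritable, Text, NullWritable>.Context context) throws IOException, InterruptedException {
        // Identical lines arrive as a single key, so writing the key once removes duplicates.
        // To keep duplicates instead, write the key once per value:
        // for (NullWritable value : values) {
        //     context.write(key, NullWritable.get());
        // }
        context.write(key, NullWritable.get());
    }
}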
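
The difference between writing once per key and once per value can be sketched in plain Java, without Hadoop. The sketch assumes that the shuffle hands reduce each distinct line once, together with one value per occurrence. The class ReduceOutputSketch and its methods are hypothetical stand-ins for the framework, not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceOutputSketch {
    // Stand-in for the shuffle: group identical lines, count occurrences, keep keys in descending order.
    static Map<String, Integer> shuffle(List<String> lines) {
        Map<String, Integer> grouped = new TreeMap<>(Comparator.reverseOrder());
        for (String line : lines) {
            grouped.merge(line, 1, Integer::sum);
        }
        return grouped;
    }

    // One write per key: duplicate lines collapse into a single output line.
    static List<String> writeOncePerKey(List<String> lines) {
        return new ArrayList<>(shuffle(lines).keySet());
    }

    // One write per value: every occurrence of a line is preserved.
    static List<String> writeOncePerValue(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : shuffle(lines).entrySet()) {
            for (int i = 0; i < entry.getValue(); i++) {
                out.add(entry.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("b", "a", "b");
        System.out.println(writeOncePerKey(lines));   // prints [b, a]
        System.out.println(writeOncePerValue(lines)); // prints [b, b, a]
    }
}
```

Which variant is right depends on whether the sorted output should contain each distinct line once or preserve every input line.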

5. The SortLaunch driver

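import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.log4j.BasicConfigurator;

public class SortLaunch {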
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        BasicConfigurator.configure();
        Job job = Job.getInstance();
        job.setJobName("sort");
        job.setJarByClass(SortLaunch.class);

        job.setMapperClass(SortMapper.class);
        job.setReducerClass(SortReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(NullWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);

        job.setSortComparatorClass(SortComparator.class);

        FileInputFormat.addInputPath(job, new Path("D:\\stu\\code\\hadoop_project\\sort1\\input"));

        Path out = new Path("D:\\stu\\code\\hadoop_project\\sort1\\output");

        // Get the file system so that an existing output directory can be removed first
        FileSystem fs = FileSystem.get(job.getConfiguration());

        if (fs.exists(out)){
            fs.delete(out, true);
        }

        FileOutputFormat.setOutputPath(job, out);

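        // Set the number of reduce tasks.
        // For a global sort over the whole data set, use a single reduce task.
        job.setNumReduceTasks(1);
        // Submit the job, wait for it to finish, and report success or failure through the exit code
        System.exit(job.waitForCompletion(true) ? 0 : 1);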
    }
}
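
Why a single reduce task gives a global order can be illustrated with a small plain-Java simulation. The sketch uses the same formula as Hadoop's default HashPartitioner, but applies it to String.hashCode() (an assumption, since Text computes its hash differently), so the exact split on a real cluster will differ. The class ReducerCountSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ReducerCountSketch {
    // Same formula as the default HashPartitioner, here applied to String.hashCode() (assumption).
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Returns one descending-sorted output list per simulated reduce task.
    static List<List<String>> run(List<String> lines, int numReduceTasks) {
        List<List<String>> outputs = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            outputs.add(new ArrayList<>());
        }
        for (String line : lines) {
            outputs.get(partition(line, numReduceTasks)).add(line);
        }
        for (List<String> output : outputs) {
            output.sort(Comparator.reverseOrder());
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d");
        System.out.println(run(lines, 1)); // prints [[d, c, b, a]]
        System.out.println(run(lines, 2)); // prints [[d, b], [c, a]]
    }
}
```

With two simulated reduce tasks, each output is sorted on its own, but reading them one after another gives d, b, c, a, which is not globally sorted. With one task, every key goes through the same sort.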