Custom output
Default output:
- FileOutputFormat
- TextOutputFormat
- RecordWriter
- LineRecordWriter
- RecordWriter
- TextOutputFormat
Custom output:
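- Create a class that extends FileOutputFormat
  - Override getRecordWriter
- Create the writer that actually writes the file, extending RecordWriter
  - Override write() and close()
- Specify the custom output class in the job: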
job.setOutputFormatClass(MyFileOutputFormat.class);
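The write()/close() contract from the steps above can be sketched without Hadoop on the classpath. This is only a plain-Java model, not the Hadoop API: RoutingWriter, its constructor parameters, and the pass mark are hypothetical names introduced for illustration, and java.io.PrintWriter stands in for the HDFS output streams.

```java
import java.io.PrintWriter;

// Hypothetical stand-in for a custom RecordWriter: it has the same
// write()/close() shape, but writes to java.io.PrintWriter instead of HDFS.
class RoutingWriter implements AutoCloseable {
    private final PrintWriter passOut;
    private final PrintWriter failOut;
    private final double passMark; // assumed threshold, not stated in the notes

    RoutingWriter(PrintWriter passOut, PrintWriter failOut, double passMark) {
        this.passOut = passOut;
        this.failOut = failOut;
        this.passMark = passMark;
    }

    // Called once per (key, value) pair, as RecordWriter.write() would be.
    // The value decides which of the two outputs receives the record.
    void write(String key, double value) {
        PrintWriter target = value >= passMark ? passOut : failOut;
        target.print(key + "\t" + value + "\n");
    }

    // Called once when the task finishes, as RecordWriter.close() would be.
    // Both outputs are closed even if closing the first one fails.
    @Override
    public void close() {
        try {
            passOut.close();
        } finally {
            failOut.close();
        }
    }
}
```

In the real job the framework creates the writer through getRecordWriter and calls write() for every reducer output pair, so the routing decision lives in one place and the reducer itself stays unchanged.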
Example: write students to different files depending on whether their average score is a pass or a fail
Sample input:
computer,huangxiaoming,85
computer,xuzheng,54
computer,huangbo,86
computer,liutao,85
computer,huanglei,99
computer,huangxiaoming,85
computer,xuzheng,54
computer,huangbo,86
computer,liujialing,45
computer,liuyifei,75
computer,huangdatou,48
computer,huangjiaju,88
computer,huangzitao,85
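Before anything is written, each student's average has to be computed from records like the ones above. The sketch below shows that step in plain Java, outside MapReduce. It assumes the fields are course, student, score, that records are grouped by student name only, and that the pass mark is 60; none of these is stated in the notes, and AverageScores is a hypothetical helper name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AverageScores {
    // Assumed pass mark; the notes do not give the threshold.
    static final double PASS_MARK = 60.0;

    // Parses "course,student,score" lines and returns each student's mean
    // score, keyed by student name in order of first appearance.
    static Map<String, Double> averageByStudent(String[] lines) {
        Map<String, double[]> acc = new LinkedHashMap<>(); // name -> {sum, count}
        for (String line : lines) {
            String[] fields = line.split(",");
            double[] sumAndCount = acc.computeIfAbsent(fields[1], k -> new double[2]);
            sumAndCount[0] += Double.parseDouble(fields[2]);
            sumAndCount[1] += 1;
        }
        Map<String, Double> averages = new LinkedHashMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return averages;
    }

    // True when the average counts as a pass under the assumed threshold.
    static boolean passes(double average) {
        return average >= PASS_MARK;
    }
}
```

In the MapReduce version the grouping is done by the shuffle (student name as the map output key) and the division happens in the reducer, which leaves only the pass/fail routing to the custom writer.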
MyFileOutputFormat.java
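import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

/**
 * Type parameters: the key and value types emitted by the reducer.
 */
public class MyFileOutputFormat extends FileOutputFormat<Text, DoubleWritable> {
    /**
     * @param job the task attempt context
     */
    @Override
    public RecordWriter<Text, DoubleWritable> getRecordWriter(TaskAttemptContext job) throws IOException, InterruptedException {
        // Get the file system that the writer will write to
        FileSystem fs = FileSystem.get(job.getConfiguration());
        return new MyRecordWriter(fs);
    }
}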
MyRecordWriter.java
public class MyRecordWriter extends RecordWriter<Text, DoubleWritable> {
FileSystem fs;
FSDataOutputStream fsDataOutputStream1;
FSDataOutputStream fsDataOutputStream2;
public MyRecordWriter(FileSystem fs) throws IOException {
this.fs = fs;
fsDataOutputStream1 = fs.create(