Java: multiple MapReduce jobs, multiple output paths (Java - Hadoop - MapReduce)

I run two MapReduce jobs, and I want the second job to write its results into two different files, in two different directories.

I would like something similar to FileInputFormat.addInputPath(.., multiple input path) in a sense, but for the output.

I'm completely new to MapReduce, and I am required to write my code for Hadoop 0.21.0.

I use context.write(..) in my Reduce step, but I don't see how to control multiple output paths...

Thanks for your time!

Here is the reducer from my first job, to show the only way I know how to produce output (it goes into a /../part* file). What I would like now is to be able to specify two precise files for different outputs, depending on the key:

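// Imports needed by the enclosing file.
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;

// Type parameters are inferred from the method body; the page had stripped them.
public static class NormalizeReducer extends Reducer<LongWritable, NetflixRating, LongWritable, NetflixUser> {
    @Override
    public void reduce(LongWritable key, Iterable<NetflixRating> values, Context context) throws IOException, InterruptedException {
        // Collect every rating that belongs to this user ID.
        NetflixUser user = new NetflixUser(key.get());
        for (NetflixRating r : values) {
            // Copy each rating, because Hadoop reuses the value object between iterations.
            user.addRating(new NetflixRating(r));
        }
        user.normalizeRatings();
        user.reduceRatings();
        // Goes to the job's default output (the part* files).
        context.write(key, user);
    }
}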

Solution

I used the method from the last comment, as you suggested, Amar. I don't know yet whether it works, because I have other problems with my HDFS, but before I forget, here are my discoveries for the sake of civilization:

MultipleOutputs DOES NOT replace FileOutputFormat. You still define one output path with FileOutputFormat, and then you can add more outputs with MultipleOutputs.

addNamedOutput method: the String namedOutput argument is just a descriptive name for the output.

You actually define the path in the write method, through the String baseOutputPath argument.
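
To make the three points above concrete, here is a minimal sketch of the routing idea that runs without Hadoop. It is not the real API: NamedWriter is a hypothetical stand-in for the four-argument MultipleOutputs.write(namedOutput, key, value, baseOutputPath) call, InMemoryWriter is a hypothetical in-memory substitute for the file system, and the even/odd rule in baseOutputPathFor is invented purely for illustration. It assumes that a relative baseOutputPath is resolved under the directory set with FileOutputFormat. In a real job you would register the name once in the driver with MultipleOutputs.addNamedOutput(...), create the MultipleOutputs instance in the reducer's setup(), and close it in cleanup().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiOutputSketch {

    /** Hypothetical stand-in for MultipleOutputs.write(namedOutput, key, value, baseOutputPath). */
    interface NamedWriter {
        void write(String namedOutput, long key, String value, String baseOutputPath);
    }

    /** In-memory writer that groups records by the path they would land under. */
    static class InMemoryWriter implements NamedWriter {
        // Plays the role of the single output path defined with FileOutputFormat.
        final String jobOutputDir;
        final Map<String, List<String>> files = new LinkedHashMap<>();

        InMemoryWriter(String jobOutputDir) {
            this.jobOutputDir = jobOutputDir;
        }

        @Override
        public void write(String namedOutput, long key, String value, String baseOutputPath) {
            // Assumption: a relative baseOutputPath is resolved under the job output directory.
            // namedOutput is only a label here, as described in the answer.
            String path = jobOutputDir + "/" + baseOutputPath;
            files.computeIfAbsent(path, p -> new ArrayList<>()).add(key + "\t" + value);
        }
    }

    /** Hypothetical routing rule: choose one of two directories from the key. */
    static String baseOutputPathFor(long userId) {
        return (userId % 2 == 0) ? "even/part" : "odd/part";
    }

    /** Mirrors the body of reduce(): one write per key, with the path chosen from the key. */
    static void reduce(long userId, String normalizedUser, NamedWriter out) {
        out.write("users", userId, normalizedUser, baseOutputPathFor(userId));
    }

    public static void main(String[] args) {
        InMemoryWriter out = new InMemoryWriter("/out");
        reduce(1L, "user-1", out);
        reduce(2L, "user-2", out);
        reduce(3L, "user-3", out);
        // Print each destination path with the records routed to it.
        out.files.forEach((path, records) -> System.out.println(path + " -> " + records));
    }
}
```

The point of the design is that the destination is decided per record at write time, not in the driver, so the reducer only needs a function from key to baseOutputPath. The default part* output still exists alongside these paths, because the FileOutputFormat path remains the root of the job's output.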
