To implement the Map/Reduce model in Hadoop, we define two classes that extend MapReduceBase and implement the Mapper and Reducer interfaces:
public class MaxTemperatureMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable>

public class MaxTemperatureReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable>
Then we configure the job and run it through JobClient:
JobConf conf = new JobConf(MaxTemperature.class);
conf.setJobName("MaxTemperature");
FileInputFormat.addInputPath(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
conf.setMapperClass(MaxTemperatureMapper.class);
conf.setReducerClass(MaxTemperatureReducer.class);
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
JobClient.runJob(conf);
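The reducer's contract in this example is simply "the maximum over all values that share a key". As a hedged, Hadoop-free sketch in plain Java (the class name, helper names, and sample data are illustrative assumptions, not part of the Hadoop API), the shuffle-and-reduce step amounts to:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MaxTemperatureSketch {
    // Simulates the shuffle: group the (key, value) pairs emitted by the
    // mapper so that each key maps to the list of all its values.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
        }
        return grouped;
    }

    // Simulates what a max-temperature reducer does: keep only the
    // maximum value seen for each key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int max = Integer.MIN_VALUE;
            for (int v : e.getValue()) {
                max = Math.max(max, v);
            }
            result.put(e.getKey(), max);
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative (year, temperature) pairs; in a real job these
        // would be emitted by MaxTemperatureMapper.
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("1950", 0), Map.entry("1950", 22),
                Map.entry("1949", 111), Map.entry("1949", 78));
        System.out.println(reduce(shuffle(pairs))); // prints {1949=111, 1950=22}
    }
}
```

In the real framework, Hadoop performs the shuffle between the map and reduce phases; the reducer only sees one key at a time with an iterator over its values.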
So how do the two classes you implemented get distributed to the cluster nodes for execution?
When the job runs, Hadoop packages your two classes into a jar and automatically ships it to each node.