MapReduce
WordCountDriver case:
Original driver code:
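import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // The 7 fixed steps of a driver
        // 1 Get the configuration and the job object
        Configuration configuration = new Configuration();
        Job job = Job.getInstance(configuration);
        // 2 Set the jar path
        job.setJarByClass(WordCountDriver.class);
        // 3 Associate the mapper and reducer
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        // 4 Set the map output kv types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputKeyClass(IntWritable.class); // this is where the bug is
        // 5 Set the final output kv types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // 6 Set the input path and output path
        // Input path
        FileInputFormat.setInputPaths(job, new Path("D:\\HadoopTestFile\\wordcounttest"));
        // Output path
        FileOutputFormat.setOutputPath(job, new Path("D:\\HadoopTestFile\\wordcounttest\\wordcount"));
        // 7 Submit the job
        boolean result = job.waitForCompletion(true);
        System.exit(result ? 0 : 1);
    }
}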
Exception thrown:
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.IntWritable, received org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.IntWritable, received org.apache.hadoop.io.Text
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1088)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:727)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at com.kuber.mapreduce.wordcount.WordCountMapper.map(WordCountMapper.java:34)
at com.kuber.mapreduce.wordcount.WordCountMapper.map(WordCountMapper.java:17)
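The cause is clearly a typo in step 4, a careless mistake: setMapOutputKeyClass is called twice, so the second call overwrites the map output key class with IntWritable, while the mapper actually writes Text keys.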
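To make the failure easier to see, here is a minimal sketch of the check behind the exception, written without any Hadoop dependency. It is a simplified model, not Hadoop's actual code: the class MapOutputTypeCheckSketch, its string config keys, and the nested Text and IntWritable classes are all hypothetical stand-ins. The one behavior it assumes from Hadoop is that an unset map output type falls back to the final output type, which is why only the key check fails in the case above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class MapOutputTypeCheckSketch {
    // Hypothetical stand-ins for org.apache.hadoop.io.Text and IntWritable.
    static class Text {}
    static class IntWritable {}

    // Simplified model of the job configuration: one slot per setting,
    // so calling the same setter twice overwrites the earlier value.
    private final Map<String, Class<?>> conf = new HashMap<>();

    void setMapOutputKeyClass(Class<?> c)   { conf.put("map.key", c); }
    void setMapOutputValueClass(Class<?> c) { conf.put("map.value", c); }
    void setOutputKeyClass(Class<?> c)      { conf.put("out.key", c); }
    void setOutputValueClass(Class<?> c)    { conf.put("out.value", c); }

    // Assumption: unset map output types fall back to the final output types.
    Class<?> mapOutputKeyClass()   { return conf.getOrDefault("map.key", conf.get("out.key")); }
    Class<?> mapOutputValueClass() { return conf.getOrDefault("map.value", conf.get("out.value")); }

    // Models the check made when the mapper writes a key/value pair.
    void collect(Object key, Object value) throws IOException {
        Class<?> keyClass = mapOutputKeyClass();
        if (key.getClass() != keyClass) {
            throw new IOException("Type mismatch in key from map: expected "
                    + keyClass.getName() + ", received " + key.getClass().getName());
        }
        Class<?> valClass = mapOutputValueClass();
        if (value.getClass() != valClass) {
            throw new IOException("Type mismatch in value from map: expected "
                    + valClass.getName() + ", received " + value.getClass().getName());
        }
    }

    // Configures the job with or without the typo, then emits one (Text, IntWritable) pair.
    public static String run(boolean withTypo) {
        MapOutputTypeCheckSketch job = new MapOutputTypeCheckSketch();
        job.setMapOutputKeyClass(Text.class);
        if (withTypo) {
            job.setMapOutputKeyClass(IntWritable.class);   // the typo: key class overwritten
        } else {
            job.setMapOutputValueClass(IntWritable.class); // the intended call
        }
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        try {
            job.collect(new Text(), new IntWritable());
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

With the typo, run reports a key type mismatch of the same shape as the stack trace above; with the intended call it returns "ok".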
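Fix: change the second call in step 4 to job.setMapOutputValueClass(IntWritable.class).
Output after running again:
The output path now has content, including the success marker file (_SUCCESS). Check the part-r-00000 file.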
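As an alternative to opening part-r-00000 by hand, here is a small sketch that reads it and prints each word with its count. It assumes the job used the default text output, where each line is the key and the value separated by a tab. The class name PartFileReaderSketch and the parse helper are hypothetical, and the default path is simply the output path from the driver above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartFileReaderSketch {
    // Parses "word<TAB>count" lines into a map, keeping file order.
    // Blank lines and lines without a tab are skipped.
    public static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                continue;
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the output directory configured in the driver.
        Path part = Paths.get(args.length > 0
                ? args[0]
                : "D:\\HadoopTestFile\\wordcounttest\\wordcount\\part-r-00000");
        List<String> lines = Files.readAllLines(part, StandardCharsets.UTF_8);
        parse(lines).forEach((word, count) -> System.out.println(word + " -> " + count));
    }
}
```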
Problem solved!