An exception occurred while running an MR program: java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
The log is as follows:
- 2016-05-09 21:33:28,871 INFO [org.apache.hadoop.conf.Configuration.deprecation] - session.id is deprecated. Instead, use dfs.metrics.session-id
- 2016-05-09 21:33:28,873 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with processName=JobTracker, sessionId=
- 2016-05-09 21:33:29,309 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
- 2016-05-09 21:33:29,494 INFO [org.apache.hadoop.mapreduce.lib.input.FileInputFormat] - Total input paths to process : 1
- 2016-05-09 21:33:29,584 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - number of splits:1
- 2016-05-09 21:33:29,679 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - Submitting tokens for job: job_local1411634813_0001
- 2016-05-09 21:33:29,890 INFO [org.apache.hadoop.mapreduce.Job] - The url to track the job: http://localhost:8080/
- 2016-05-09 21:33:29,891 INFO [org.apache.hadoop.mapreduce.Job] - Running job: job_local1411634813_0001
- 2016-05-09 21:33:29,892 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter set in config null
- 2016-05-09 21:33:29,901 INFO [org.apache.hadoop.mapred.LocalJobRunner] - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
- 2016-05-09 21:33:30,000 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Waiting for map tasks
- 2016-05-09 21:33:30,000 INFO [org.apache.hadoop.mapred.LocalJobRunner] - Starting task: attempt_local1411634813_0001_m_000000_0
- 2016-05-09 21:33:30,035 INFO [org.apache.hadoop.yarn.util.ProcfsBasedProcessTree] - ProcfsBasedProcessTree currently is supported only on Linux.
- 2016-05-09 21:33:30,081 INFO [org.apache.hadoop.mapred.Task] - Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@546dbb34
- 2016-05-09 21:33:30,088 INFO [org.apache.hadoop.mapred.MapTask] - Processing split: hdfs://192.168.5.97:8020/tmp/htb/mr/join_in/child-parent.txt:0+161
- 2016-05-09 21:33:30,144 INFO [org.apache.hadoop.mapred.MapTask] - (EQUATOR) 0 kvi 26214396(104857584)
- 2016-05-09 21:33:30,144 INFO [org.apache.hadoop.mapred.MapTask] - mapreduce.task.io.sort.mb: 100
- 2016-05-09 21:33:30,144 INFO [org.apache.hadoop.mapred.MapTask] - soft limit at 83886080
- 2016-05-09 21:33:30,144 INFO [org.apache.hadoop.mapred.MapTask] - bufstart = 0; bufvoid = 104857600
- 2016-05-09 21:33:30,145 INFO [org.apache.hadoop.mapred.MapTask] - kvstart = 26214396; length = 6553600
- 2016-05-09 21:33:30,148 INFO [org.apache.hadoop.mapred.MapTask] - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
- 2016-05-09 21:33:30,462 INFO [org.apache.hadoop.mapred.MapTask] - Starting flush of map output
- 2016-05-09 21:33:30,479 INFO [org.apache.hadoop.mapred.LocalJobRunner] - map task executor complete.
- 2016-05-09 21:33:30,503 WARN [org.apache.hadoop.mapred.LocalJobRunner] - job_local1411634813_0001
- java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
- at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
- at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
- Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.LongWritable
- at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1069)
- at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:712)
- at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
- at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
- at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
- at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
- at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
- at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
- at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
- at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
- at java.util.concurrent.FutureTask.run(Unknown Source)
- at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
- at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
- at java.lang.Thread.run(Unknown Source)
- 2016-05-09 21:33:30,895 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local1411634813_0001 running in uber mode : false
- 2016-05-09 21:33:30,896 INFO [org.apache.hadoop.mapreduce.Job] - map 0% reduce 0%
- 2016-05-09 21:33:30,898 INFO [org.apache.hadoop.mapreduce.Job] - Job job_local1411634813_0001 failed with state FAILED due to: NA
- 2016-05-09 21:33:30,903 INFO [org.apache.hadoop.mapreduce.Job] - Counters: 0
The Map function code is as follows:
public static class Map extends Mapper {
    // implement the map function
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String childname;
        String parentname;
        String relationtype;
        // tokenize one line of the input text
        StringTokenizer itr = new StringTokenizer(value.toString());
        String[] values = new String[2];
        int i = 0;
        while (itr.hasMoreTokens() && i < values.length) {
            values[i] = itr.nextToken();
            i++;
        }
        // skip the header line "child parent"
        if (values[0].compareTo("child") != 0) {
            childname = values[0];
            parentname = values[1];
            // emit the left table
            relationtype = "1";
            context.write(new Text(values[1]), new Text(relationtype +
                    "+" + childname + "+" + parentname));
            // emit the right table
            relationtype = "2";
            context.write(new Text(values[0]), new Text(relationtype +
                    "+" + childname + "+" + parentname));
        }
    }
}
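To see what the mapper is supposed to emit, independently of Hadoop, the record-building logic above can be sketched in plain Java (class and method names here are illustrative, not part of the original job): for each "child parent" line, the record is emitted under the parent key tagged "1" (left table) and under the child key tagged "2" (right table).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hadoop-free sketch of the map-side logic of the single-table join.
public class JoinRecordDemo {
    // Returns the (key, value) pairs the mapper would write for one line.
    static List<String[]> emit(String line) {
        List<String[]> out = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        String[] values = new String[2];
        int i = 0;
        while (itr.hasMoreTokens() && i < values.length) {
            values[i++] = itr.nextToken();
        }
        // skip the header line "child parent"
        if (values[0] != null && values[1] != null && !values[0].equals("child")) {
            String child = values[0];
            String parent = values[1];
            out.add(new String[] { parent, "1+" + child + "+" + parent }); // left table
            out.add(new String[] { child, "2+" + child + "+" + parent });  // right table
        }
        return out;
    }

    public static void main(String[] args) {
        for (String[] kv : emit("Tom Lucy")) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

Running it on the line "Tom Lucy" prints the record once under the key Lucy (tag 1) and once under the key Tom (tag 2), which is what lets the reducer join children and parents on the shared key.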
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
        System.err.println("Usage: Single Table Join <in> [<in>...] <out>");
        System.exit(2);
    }
    Job job = Job.getInstance(conf, "Single Table Join");
    job.setJarByClass(STjoin.class);
    // set the Mapper and Reducer classes
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    // set the output types
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    // set the input and output paths
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
Solution:
The error message already says what is wrong: the expected map output key type is Text, but a LongWritable was received.
Check the configured map output types, then verify that what the map function actually emits matches them.
Here both the output key and value are configured as Text:
- job.setOutputKeyClass(Text.class);
- job.setOutputValueClass(Text.class);
However, the Mapper class was extended without specifying its type parameters. Because the class is a raw type, the handwritten map(Object, Text, Context) method does not override the generic map method; the default identity map runs instead and passes the input key straight through, and with the default input format TextInputFormat that key is a LongWritable. By now the fix should be clear; if not, change the declaration as follows:
- public static class Map extends Mapper<Object, Text, Text, Text>
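As a related note: setOutputKeyClass/setOutputValueClass describe the final (reduce) output, and the map output types default to the same classes. If a job's map output types ever differ from its reduce output types, they must be declared separately in the driver. A sketch against the mapreduce API used above (optional here, since in this job both stages emit Text):

```java
// Declare the map output types explicitly when they differ from the
// final output types set via setOutputKeyClass/setOutputValueClass.
job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(Text.class);
```

Fixing the Mapper's type parameters as shown above is still required either way, since the raw-typed class never runs the handwritten map method at all.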