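ArrayIndexOutOfBoundsException in a MapReduce job:
I hit an array index out of bounds error while writing a MapReduce job. With a large input you cannot be 100% sure that the data contains no empty or incomplete lines, so the split result has to be checked before it is indexed. In the trace below, the failing index is 7 and the exception is thrown at Market_Mapper.java:24: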
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
at Market.Market_Mapper.map(Market_Mapper.java:24)
at Market.Market_Mapper.map(Market_Mapper.java:10)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2019-09-06 22:09:32,244 INFO mapreduce.Job: Job job_local1935223179_0001 running in uber mode : false
2019-09-06 22:09:32,245 INFO mapreduce.Job: map 0% reduce 0%
2019-09-06 22:09:32,246 INFO mapreduce.Job: Job job_local1935223179_0001 failed with state FAILED due to: NA
2019-09-06 22:09:32,258 INFO mapreduce.Job: Counters: 0
false
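Fix: check for empty values before using them.
My original code, which raised the error (it passed the two values straight to the key and value without any check):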
String[] str = line.split("\t");
Text k = new Text();
IntWritable v = new IntWritable();
k.set(str[7]); // throws ArrayIndexOutOfBoundsException when the line has fewer than 8 fields
v.set(1);
context.write(k, v);
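The failure in this snippet can be reproduced without Hadoop, because it comes from how String.split builds the array. The sketch below is only an illustration: the class name SplitDemo, its two helper methods, and the sample lines are made up for this example and are not part of the original job. It shows that an empty line yields a one-element array, and that split("\t") silently drops trailing empty fields, so str[7] may not exist even when the line looks complete.

```java
// Standalone demo (no Hadoop needed). SplitDemo and the sample lines are
// hypothetical; only String.split itself is taken from the original code.
class SplitDemo {

    // Field count using the same call as the mapper: line.split("\t").
    static int fieldCount(String line) {
        return line.split("\t").length;
    }

    // Same split, but a negative limit keeps trailing empty fields.
    static int fieldCountKeepingTrailing(String line) {
        return line.split("\t", -1).length;
    }

    public static void main(String[] args) {
        String full = "a\tb\tc\td\te\tf\tg\th";        // 8 fields, index 7 exists
        String blank = "";                               // an empty input line
        String trailingEmpty = "a\tb\tc\td\te\tf\t\t"; // last two fields are empty

        System.out.println(fieldCount(full));                         // 8
        System.out.println(fieldCount(blank));                        // 1
        System.out.println(fieldCount(trailingEmpty));                // 6
        System.out.println(fieldCountKeepingTrailing(trailingEmpty)); // 8
    }
}
```

In the second and third cases index 7 is out of range, which matches the "ArrayIndexOutOfBoundsException: 7" in the trace.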
The corrected code, with the non-empty check in place:
if (str.length > 7 && !str[7].isEmpty() && !str[6].isEmpty()) {
    k.set(str[7]);
    v.set(1);
    context.write(k, v);
}
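It is best to use the !isEmpty() form for this check rather than alternatives. For example, comparing a string with == "" compares object references in Java, not content.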
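If the same check is needed for several columns, it can be pulled into a small helper. The following is a minimal sketch under assumptions, not the code from the original job: the class FieldGuard and the method field are hypothetical names, and it uses split("\t", -1) instead of the original split("\t") so that trailing empty fields stay in the array and are rejected by isEmpty() rather than by the length check. It does not reproduce the original's additional check on str[6].

```java
import java.util.Optional;

// Hypothetical helper; FieldGuard and field are illustrative names only.
class FieldGuard {

    // Returns the tab-separated field at `index`, or Optional.empty() when the
    // line is null, has too few fields, or the field itself is empty.
    static Optional<String> field(String line, int index) {
        if (line == null || index < 0) {
            return Optional.empty();
        }
        // Assumption: limit -1 keeps trailing empty fields (the original code
        // calls split("\t"), which drops them).
        String[] parts = line.split("\t", -1);
        if (index >= parts.length || parts[index].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(parts[index]);
    }

    public static void main(String[] args) {
        // A complete line: field 7 is present.
        System.out.println(field("a\tb\tc\td\te\tf\tg\th", 7).isPresent()); // true
        // An empty line: skipped instead of throwing.
        System.out.println(field("", 7).isPresent());                        // false
    }
}
```

Inside a mapper, a call such as FieldGuard.field(line, 7) could then decide whether to emit a record, for instance by calling k.set(...) and context.write(k, v) only when the returned Optional is present.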