1. Truly, every step brought a new disaster and every turn a new problem...
2. The code is as follows:
package com.xx.hadoop.test.wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
 * Hadoop - counts how often each word occurs in the input files
 * @author xxxx
 *
 */
public class WordCount {

    public static class WordCountMap extends
            Mapper<LongWritable, Text, Text, IntWritable> {
        private final IntWritable one = new IntWritable(1);
        private final Text word = new Text();

        // Emits (token, 1) for every whitespace-separated token in the line.
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer token = new StringTokenizer(line);
            while (token.hasMoreTokens()) {
                word.set(token.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class WordCountReduce extends
            Reducer<Text, IntWritable, Text, IntWritable> {
        // Sums all the 1s emitted for a word and writes (word, total).
        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("Usage: WordCount <input path> <output path>");
            System.exit(2);
        }
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated new Job(conf) constructor.
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount.class);
        job.setJobName("wordcount");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Report success or failure of the job through the exit code.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
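To check what the job is supposed to compute without any Hadoop jars on the classpath, the map and reduce steps can be imitated in plain Java. This is only a sketch: LocalWordCount is a hypothetical helper, not part of Hadoop, and it runs in memory on a handful of lines instead of on input splits.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical helper (not part of Hadoop): imitates the map and reduce steps in memory.
class LocalWordCount {
    // Tokenizes each line on whitespace (as WordCountMap does) and
    // adds 1 per token to that word's total (as WordCountReduce does).
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (String line : lines) {
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                String word = tokens.nextToken();
                Integer current = totals.get(word);
                totals.put(word, current == null ? 1 : current + 1);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // prints {hadoop=1, hello=2, world=1}
        System.out.println(count(Arrays.asList("hello hadoop", "hello world")));
    }
}
```

If the real job later produces different counts for the same input, the problem is in the job setup rather than in the counting logic.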
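2.1 Errors before running
Every import was flagged as an error. After some searching online, it turned out the hadoop-common and hadoop-mapreduce-client-common jars were missing (the org.apache.hadoop.mapreduce classes themselves are in hadoop-mapreduce-client-core),
so I imported them from hadoop/share/hadoop/common/ and hadoop/share/hadoop/mapreduce/.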
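Instead of adding jars one at a time, all jars under the distribution's share directory can be collected in one pass. The sketch below assumes the usual Hadoop 2.x binary layout, where third-party dependencies sit in lib/ subdirectories such as share/hadoop/common/lib; JarCollector and the directory passed to it are hypothetical names for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: gathers every *.jar below a directory into a classpath string.
class JarCollector {
    // Walks the directory tree recursively and returns the absolute paths of all jar files, sorted.
    static List<String> findJars(Path root) throws IOException {
        final List<String> jars = new ArrayList<String>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (file.getFileName().toString().endsWith(".jar")) {
                    jars.add(file.toAbsolutePath().toString());
                }
                return FileVisitResult.CONTINUE;
            }
        });
        Collections.sort(jars);
        return jars;
    }

    // Joins the jar paths with the platform's classpath separator (':' or ';').
    static String toClasspath(List<String> jars) {
        StringBuilder sb = new StringBuilder();
        for (String jar : jars) {
            if (sb.length() > 0) {
                sb.append(File.pathSeparator);
            }
            sb.append(jar);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be something like <hadoop home>/share/hadoop
        System.out.println(toClasspath(findJars(Paths.get(args[0]))));
    }
}
```

The printed string can be pasted into a -cp argument, or the list can be used to decide which jars to add to the Eclipse build path.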
2.2 No more visible errors for now, so I ran it, and got an error...
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
A quick search showed that commons-logging-xx.jar was missing. I found one, added it, and ran again.
2.3 Error
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Preconditions
More searching: google-collections-1.0.jar was missing, so I imported it.
2.4 Error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/collections/map/UnmodifiableMap
commons-collections-xx.jar was missing, so I imported it.
2.5 Error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Level
log4j-xx.jar was missing, so I imported it.
2.6 Error
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/collect/Interners
The guava jar was missing, so I imported guava-18.0.jar.
Another error followed:
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.collect.MapMaker.keyEquivalence(Lcom/google/common/base/Equivalence;)Lcom/google/common/collect/MapMaker;
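A method could not be found, and my guess was that this jar was too new. I downloaded an older version, guava-r09.jar, instead, and the error went away.
One more remark here: jar versions are sometimes inconsistent with one another, so falling back to an older version is often the safer choice.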
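A NoSuchMethodError usually means that the class which was actually loaded is not the version the caller was compiled against. One possible explanation here, which is an assumption and not something the steps above confirm, is that MapMaker was picked up from the google-collections-1.0.jar added in step 2.3 while Interners came from guava-18.0.jar. A small check of where a class was loaded from can settle that kind of question. ClassOrigin is a hypothetical helper; it uses only standard JDK calls.

```java
import java.security.CodeSource;

// Hypothetical helper: reports which jar or directory a class was loaded from.
class ClassOrigin {
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = c.getProtectionDomain().getCodeSource();
            // Classes loaded by the bootstrap loader (the JDK itself) have no code source.
            return source == null ? "(JDK bootstrap class)" : String.valueOf(source.getLocation());
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        } catch (LinkageError e) {
            // The class file exists, but something it depends on could not be loaded.
            return "(found but failed to link: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // The class names below are taken from the error messages in this post.
        System.out.println(locate("com.google.common.collect.MapMaker"));
        System.out.println(locate("com.google.common.collect.Interners"));
    }
}
```

If the two lines print different jar files, the two jars overlap and one of them should be removed from the build path.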
2.7 Error
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
slf4j-api-1.7.21.jar and slf4j-log4j12-1.6.2.jar were missing, so I imported both.
2.8 Error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/configuration/Configuration
A quick rant: by this point I was at my wit's end...
commons-configuration-xx.jar was missing, so I imported it.
2.9 Error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang/StringUtils
commons-lang-xx.jar was missing; this one can be found in a Struts distribution.
2.10 Error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/util/PlatformName
hadoop-auth-2.xx.jar was missing, so I imported it.
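Rather than discovering one missing class per run, all the classes named in the errors so far can be probed in a single pass. This is a minimal sketch: ClasspathPreflight is a hypothetical helper, and the list in main simply repeats the class names from the NoClassDefFoundError messages above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists which of the given classes cannot be loaded from the current classpath.
class ClasspathPreflight {
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<String>();
        ClassLoader loader = ClasspathPreflight.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only check that the class can be found and loaded.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(missing(
                "org.apache.commons.logging.LogFactory",
                "com.google.common.base.Preconditions",
                "org.apache.commons.collections.map.UnmodifiableMap",
                "org.apache.log4j.Level",
                "com.google.common.collect.Interners",
                "org.slf4j.LoggerFactory",
                "org.apache.commons.configuration.Configuration",
                "org.apache.commons.lang.StringUtils",
                "org.apache.hadoop.util.PlatformName"));
    }
}
```

An empty list means every probed class is present. It does not prove that the versions are compatible with each other.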
2.11 Error
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/cache/CacheLoader
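This turned out to be an aftereffect of choosing between the two guava versions in step 2.6. With version 18, the earlier missing-method error comes back;
with r09, this error appears instead.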
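When neither the new nor the old jar works, it helps to see which jars on the build path contain the same class, because two overlapping jars can produce exactly this kind of back-and-forth. The sketch below scans a list of jar files for one class entry. DuplicateClassFinder is a hypothetical helper, and the lib directory given to main is an assumption.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

// Hypothetical helper: finds every jar in a list that contains a given class file.
class DuplicateClassFinder {
    static List<String> jarsContaining(String className, List<File> jars) throws IOException {
        // A class com.foo.Bar is stored in a jar as the entry com/foo/Bar.class.
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<String>();
        for (File jar : jars) {
            try (ZipFile zip = new ZipFile(jar)) {
                if (zip.getEntry(entry) != null) {
                    hits.add(jar.getName());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the directory holding the jars added in the steps above.
        List<File> jars = new ArrayList<File>();
        File[] files = new File(args[0]).listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.getName().endsWith(".jar")) {
                    jars.add(f);
                }
            }
        }
        System.out.println(jarsContaining("com.google.common.collect.MapMaker", jars));
    }
}
```

More than one jar in the output means the classpath order decides which copy wins, so the duplicates should be reduced to a single jar.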
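At this point I wondered: had I taken the wrong route altogether?
Stepping back to calm down, I looked at my environment: plain Java 1.7 plus hadoop-eclipse-plugin-2.2.jar, paired with a Hadoop 2.8 installation.
I suspected the problem was here, so I unpacked hadoop-eclipse-plugin-2.2.jar and looked inside. Many of the lib jars in it were exactly the ones I had been hunting for all day, which made me even more certain that my approach was wrong: the version of my plugin jar did not match the Hadoop environment.
So I found a hadoop-eclipse-plugin-2.8.jar online to use as the plugin, and will see how that works out.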
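The mismatch described above can be spotted from the file names alone. The following is only a rough sketch: VersionLine is a hypothetical helper, and it assumes the version appears in the name as "-major.minor", as it does in hadoop-eclipse-plugin-2.2.jar and hadoop-2.8.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares the major.minor version embedded in two artifact names.
class VersionLine {
    // Matches the first "-<digits>.<digits>" in a name.
    private static final Pattern VERSION = Pattern.compile("-(\\d+)\\.(\\d+)");

    // Returns "major.minor" for a name such as "hadoop-eclipse-plugin-2.2.jar", or null if none is found.
    static String majorMinor(String name) {
        Matcher m = VERSION.matcher(name);
        return m.find() ? m.group(1) + "." + m.group(2) : null;
    }

    // True only if both names carry the same major.minor version.
    static boolean sameLine(String a, String b) {
        String versionA = majorMinor(a);
        return versionA != null && versionA.equals(majorMinor(b));
    }

    public static void main(String[] args) {
        System.out.println(sameLine("hadoop-eclipse-plugin-2.2.jar", "hadoop-2.8")); // prints false
        System.out.println(sameLine("hadoop-eclipse-plugin-2.8.jar", "hadoop-2.8")); // prints true
    }
}
```

A matching name is only a first hint. Whether the plugin really works with a given Hadoop release still has to be tested.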