Compiling and running the code from page 32 of 《Mahout算法解析与案例实战》 fails with a class-not-found error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/mahout/common/AbstractJob
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.common.AbstractJob
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 15 more
The fix suggested at the bottom of that page is wrong: in recent Hadoop versions, external jars need to be copied into $HADOOP_HOME/share/hadoop/common/lib.
After searching Baidu for ages, I finally found the answer in reply #4 by tntzbzc at http://bbs.csdn.net/topics/390716734. You can also work out the reason yourself by reading the configuration logic in the sbin/XX.sh scripts.
The jars added were: commons-cli-2.0-mahout.jar
mahout-hdfs-0.12.2.jar
mahout-math-0.12.2.jar
mahout-mr-0.12.2.jar
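Before copying jars around, it can save time to confirm which class is actually invisible at runtime. A minimal probe sketch (ClasspathProbe is a hypothetical name, not from the book): run it the same way the failing job is run, and it reports whether AbstractJob is reachable on that classpath.

```java
// Hypothetical diagnostic, not part of the book's code: checks whether
// the class from the NoClassDefFoundError can be loaded at runtime.
public class ClasspathProbe {
    public static void main(String[] args) {
        String cls = "org.apache.mahout.common.AbstractJob";
        try {
            // Class.forName uses the same lookup that failed in the stack trace.
            Class.forName(cls);
            System.out.println(cls + " is on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println(cls + " is MISSING from the classpath");
        }
    }
}
```

If it prints MISSING when launched via `hadoop jar`, the jars were copied to the wrong lib directory.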
After adding them, the error message changed: 16/11/18 17:13:40 ERROR common.AbstractJob: Unexpected mahout.fansy.utils.transfrom.Text2VectorWritable while processing Job-Specific Options:
Unexpected mahout.fansy.utils.transfrom.Text2VectorWritable while processing
Job-Specific Options:
Usage:
[--input <input> --output <output> --help --tempDir <tempDir> --startPhase
<startPhase> --endPhase <endPhase>]
Job-Specific Options:
--input (-i) input Path to job input directory.
--output (-o) output The directory pathname for output.
--help (-h) Print out help
--tempDir tempDir Intermediate output directory
--startPhase startPhase First phase to run
--endPhase endPhase Last phase to run
Fix: the original command was /usr/local/hadoop/bin/hadoop jar /home/hadoop/mahout_jar/ClusteringUtils.jar mahout.fansy.utils.transfrom.Text2VectorWritable -i input/synthetic_control.data -o input/transform
If you keep mahout.fansy.utils.transfrom.Text2VectorWritable in the command, you get the error above (presumably the jar's manifest already declares a Main-Class, so hadoop jar passes the explicit class name straight through as a job argument, which AbstractJob's option parser rejects).
If you remove it, the job runs but then hangs. (This later turned out to be a code problem: the job could not get an instance because the server address was wrong, but the command line gives no hint of that, so running from Eclipse is still the better approach.)
So the workflow and code as printed in the book are broken; you have to change the input and output directories yourself.
Hard-code absolute input/output paths like this:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;

// Point the job at HDFS explicitly, using a full URI including the scheme.
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://127.0.0.1:9000");
// Absolute HDFS paths for input and output.
String[] ioArgs = new String[]{"/user/hadoop/input", "/user/hadoop/output"};
String[] otherArgs = new GenericOptionsParser(conf, ioArgs).getRemainingArgs();
if (otherArgs.length != 2) {
    System.err.println("Usage: Allocation <in> <out>");
    System.exit(2);
}
Error 3: if you set fs.default.name to "localhost:9000" (without the hdfs:// scheme), you get Relative path in absolute URI: localhost:9000, so it must be a full URI such as hdfs://127.0.0.1:9000.
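That error actually comes from java.net.URI, which Hadoop's Path uses internally: without the hdfs:// prefix, "localhost" gets treated as the URI scheme and "9000" as a relative path. A minimal pure-Java sketch (no Hadoop needed) reproduces the same exception message:

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriSchemeDemo {
    public static void main(String[] args) throws Exception {
        // A full URI with a scheme parses cleanly.
        URI ok = new URI("hdfs://127.0.0.1:9000");
        System.out.println("scheme = " + ok.getScheme()); // scheme = hdfs

        // With "localhost" as the scheme and "9000" as the path, the
        // multi-argument URI constructor (the one Hadoop's Path builds
        // its URIs with) rejects the relative path.
        try {
            new URI("localhost", null, "9000", null, null);
        } catch (URISyntaxException e) {
            System.out.println(e.getMessage());
            // prints: Relative path in absolute URI: localhost:9000
        }
    }
}
```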
PS: this book is seriously pitfall-ridden; it skips over all kinds of details and is full of mistakes.