Symptom: on a single-node installation of Spark 1.0.1, the master starts normally, but the worker fails to launch with the errors shown in the logs below.
Console output:
master: starting org.apache.spark.deploy.worker.Worker, logging to /home/hadoop/spark-1.0.1-bin-hadoop2/sbin/../logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
master: failed to launch org.apache.spark.deploy.worker.Worker:
master: at java.lang.ClassLoader.loadClass(libgcj.so.10)
master: at gnu.java.lang.MainThread.run(libgcj.so.10)
master: full log in /home/hadoop/spark-1.0.1-bin-hadoop2/sbin/../logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
Errors in the log file:
[hadoop@master spark-1.0.1-bin-hadoop2]$ cat logs/spark-hadoop-org.apache.spark.deploy.worker.Worker-1-master.out
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Spark Command: java -cp ::/home/hadoop/spark-1.0.1-bin-hadoop2/conf:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/spark-assembly-1.0.1-hadoop2.2.0.jar:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/datanucleus-rdbms-3.2.1.jar:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/datanucleus-core-3.2.2.jar:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/datanucleus-api-jdo-3.2.1.jar:/home/hadoop/hadoop/etc/hadoop -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms512m -Xmx512m org.apache.spark.deploy.worker.Worker spark://master:7077 --webui-port 8081
========================================
Exception in thread "main" java.lang.NoClassDefFoundError: org.apache.spark.deploy.worker.Worker
at gnu.java.lang.MainThread.run(libgcj.so.10)
Caused by: java.lang.ClassNotFoundException: akka.actor.Actor not found in gnu.gcj.runtime.SystemClassLoader{urls=[file:./,file:/home/hadoop/spark-1.0.1-bin-hadoop2/conf/,file:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/spark-assembly-1.0.1-hadoop2.2.0.jar,file:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/datanucleus-rdbms-3.2.1.jar,file:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/datanucleus-core-3.2.2.jar,file:/home/hadoop/spark-1.0.1-bin-hadoop2/lib/datanucleus-api-jdo-3.2.1.jar,file:/home/hadoop/hadoop/etc/hadoop/], parent=gnu.gcj.runtime.ExtensionClassLoader{urls=[], parent=null}}
at java.net.URLClassLoader.findClass(libgcj.so.10)
at java.lang.ClassLoader.loadClass(libgcj.so.10)
at java.lang.ClassLoader.loadClass(libgcj.so.10)
at java.lang.VMClassLoader.defineClass(libgcj.so.10)
at java.lang.ClassLoader.defineClass(libgcj.so.10)
at java.security.SecureClassLoader.defineClass(libgcj.so.10)
at java.net.URLClassLoader.findClass(libgcj.so.10)
at java.lang.ClassLoader.loadClass(libgcj.so.10)
at java.lang.ClassLoader.loadClass(libgcj.so.10)
at gnu.java.lang.MainThread.run(libgcj.so.10)
At first this looked like a bug in libgcj itself, but updating libgcj did not help. The real cause was that the system's bundled GCJ-based Java had not been removed cleanly, so the worker was being launched with GCJ's `java` (note the `gnu.gcj.runtime.SystemClassLoader` in the stack trace above) rather than the installed JDK. After uninstalling the GCJ packages, the worker started normally.
[root@master yum.repos.d]# echo $JAVA_HOME
/usr/java/jdk
[root@master yum.repos.d]# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/java/jdk/bin:/usr/java/jdk/jre/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
[root@master yum.repos.d]# rpm -qa | grep java
gcc-java-4.4.7-4.el6.x86_64
java_cup-0.10k-5.el6.x86_64
java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
tzdata-java-2013g-1.el6.noarch
[root@master yum.repos.d]# rpm -qa | grep jdk
jdk-1.7.0_65-fcs.x86_64
[root@master yum.repos.d]# rpm -e --nodeps java_cup-0.10k-5.el6.x86_64
[root@master yum.repos.d]# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
[root@master yum.repos.d]#
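After the cleanup, it is worth confirming that the `java` found on PATH is no longer the GCJ runtime. A minimal sketch of such a check (the helper name `check_not_gcj` is hypothetical, not part of Spark; it only inspects the `java -version` output, which GCJ reports as `gij`/`libgcj`):

```shell
# check_not_gcj: succeed unless the given `java -version` output looks like GCJ.
# (Hypothetical helper for illustration; `java -version` prints to stderr.)
check_not_gcj() {
  case "$1" in
    *gcj*|*libgcj*|*gij*) return 1 ;;  # GCJ runtimes identify themselves this way
    *) return 0 ;;
  esac
}

# Usage: warn early if the java on PATH is still GCJ.
check_not_gcj "$(java -version 2>&1)" || echo "GCJ java still on PATH - remove it first" >&2
```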
Alternatively, set JAVA_HOME explicitly in conf/spark-env.sh so the launch scripts use the intended JDK regardless of what is on PATH.
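As a sketch, the relevant lines in conf/spark-env.sh would be (the path matches the $JAVA_HOME shown earlier; adjust to your installation):

```shell
# conf/spark-env.sh
# Point Spark's launch scripts at the intended JDK explicitly,
# so a stray GCJ `java` earlier on PATH cannot be picked up.
export JAVA_HOME=/usr/java/jdk
export PATH="$JAVA_HOME/bin:$PATH"
```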