Versions used:
hadoop: apache 2.2.0
spark: 0.9.1
shark: 0.9.1
hive: 0.11.0
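Shark website: http://shark.cs.berkeley.edu/
Shark on a cluster documentation: https://github.com/amplab/shark/wiki/Running-Shark-on-a-Cluster
I configured everything according to the documentation, but starting Shark failed with the following error: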
Exception in thread "main" org.apache.spark.SparkException: YARN mode not available ?
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1275)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:201)
at shark.SharkContext.<init>(SharkContext.scala:42)
at shark.SharkContext.<init>(SharkContext.scala:61)
at shark.SharkEnv$.initWithSharkContext(SharkEnv.scala:78)
at shark.SharkEnv$.init(SharkEnv.scala:38)
at shark.SharkCliDriver.<init>(SharkCliDriver.scala:278)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:162)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.scheduler.cluster.YarnClientClusterScheduler
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:190)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1269)
... 8 more
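My guess was that SPARK_ASSEMBLY_JAR was not being loaded onto the classpath, and tracing through the code confirmed that this was the problem.
The fix is to add the following code to the $SHARK_HOME/run script: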
# Put the Spark assembly jar on the classpath if the file exists
if [ -f "$SPARK_JAR" ] ; then
  SPARK_CLASSPATH+=":$SPARK_JAR"
  echo "SPARK CLASSPATH : $SPARK_CLASSPATH"
fi
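The snippet above always prepends a colon, so if SPARK_CLASSPATH starts out empty the result begins with ":", an empty entry that Java normally treats as the current directory. A slightly more defensive variant can be sketched as below. The function name append_to_classpath is hypothetical and is not part of the Shark run script; this is only an illustration of the same idea, not the script's actual code.

```shell
# Hypothetical helper (not from the Shark run script):
# prints the classpath $1 with jar $2 appended, but only if $2 is an existing file.
append_to_classpath() {
  if [ -f "$2" ]; then
    if [ -n "$1" ]; then
      # Existing classpath: join with a colon
      printf '%s:%s\n' "$1" "$2"
    else
      # Empty classpath: avoid a leading colon
      printf '%s\n' "$2"
    fi
  else
    # Jar missing: leave the classpath unchanged
    printf '%s\n' "$1"
  fi
}
```

Under these assumptions, it could be used in the run script as SPARK_CLASSPATH=$(append_to_classpath "$SPARK_CLASSPATH" "$SPARK_JAR").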
After that change, however, a different error appeared:
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:79)
at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getProxy(HadoopYarnProtoRPC.java:48)
at org.apache.hadoop.yarn.client.RMProxy$1.run(RMProxy.java:134)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
at org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:130)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:93)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:70)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:114)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.runApp(Client.scala:76)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:78)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:126)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:202)
at shark.SharkContext.<init>(SharkContext.scala:42)
at shark.SharkContext.<init>(SharkContext.scala:61)
at shark.SharkEnv$.initWithSharkContext(SharkEnv.scala:78)
at shark.SharkEnv$.init(SharkEnv.scala:38)
at shark.SharkCliDriver.<init>(SharkCliDriver.scala:278)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:162)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)