First, download and install:
spark
IDEA
scala
Then configure the environment variables in /etc/profile (vim /etc/profile):
export JAVA_HOME=/usr/java/jdk1.7.0_60
export HADOOP_HOME=/itcast/hadoop-2.2.0
export apache=/java/apache-tomcat-7.0.27
export SCALA_HOME=/itcast/scala-2.10.5
export SPARK_HOME=/itcast/spark-1.3.0-bin-hadoop2.4
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$apache/bin:$SCALA_HOME/bin:$SPARK_HOME/bin
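To make the new variables take effect in the current shell, load the profile with source. The following self-contained sketch repeats two of the exports above (normally they live in /etc/profile) and checks that the Spark bin directory ended up on PATH:

```shell
# Normally these exports live in /etc/profile and are loaded with:
#   source /etc/profile
export SPARK_HOME=/itcast/spark-1.3.0-bin-hadoop2.4
export SCALA_HOME=/itcast/scala-2.10.5
export PATH="$PATH:$SCALA_HOME/bin:$SPARK_HOME/bin"

# Sanity check: the Spark bin directory should now be on PATH
echo "$PATH" | grep -q "$SPARK_HOME/bin" && echo "PATH OK"
```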
In Spark's conf directory, copy the spark-env.sh.template file to spark-env.sh and add:
export SCALA_HOME=/itcast/scala-2.10.5
export JAVA_HOME=/usr/java/jdk1.7.0_60
export SPARK_MASTER_IP=192.168.1.118
export SPARK_WORKER_MEMORY=3000m
export MASTER=spark://192.168.1.118:7077
In the slaves file, configure the worker nodes. Since this is a single-machine setup, the master and worker run on the same host:
192.168.1.118
Start Spark, then check the master web UI in a browser:
http://192.168.1.118:8080
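The standalone master and workers are started with the scripts shipped in Spark's sbin directory (a sketch; the path assumes the SPARK_HOME from the /etc/profile example above, and the commands require an actual Spark installation):

```shell
# Start the master plus every worker listed in conf/slaves
$SPARK_HOME/sbin/start-all.sh

# Verify the daemons came up; jps should list Master and Worker
jps
```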
(2) Packaging a jar file in IDEA.
IDEA -> File -> Project Structure -> Artifacts -> create a new artifact to build the jar.
(3) NullPointerException when submitting a job to the cluster. The fix is to set the application name and master URL explicitly:
conf.setAppName("WorldCount")
conf.setMaster("spark://192.168.1.118:7077")
(4) Fixing the error: Initial job has not accepted any resources; check your cluster UI
15/03/26 22:29:36 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/03/26 22:29:51 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/03/26 22:30:06 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
15/03/26 22:30:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
The warning says that the job acquired no resources during initialization, and suggests checking the cluster UI to make sure the workers are registered and have sufficient memory.
This problem can have several causes, for example:
1. The submitting node cannot talk to the Spark worker nodes. After a job is submitted, a process starts on the submitting node (usually on port 4040) that shows the job's progress, and the workers report progress back to it; if the hostname or IP is configured incorrectly in /etc/hosts, this communication fails. So check that the hostname and IP are configured correctly.
2. There may be insufficient memory.
Check the memory setting:
conf.set("spark.executor.memory", "3000m")
Make sure to set SPARK_LOCAL_IP and SPARK_MASTER_IP.
Check the web UI on port 8080 to make sure some workers are in the ALIVE state and some cores are available.
In this case, the fix was:
export SPARK_WORKER_MEMORY=3000m  (increase the worker memory)
In the slaves file, change localhost to the IP address.
After that, the job submitted successfully.
(5) Problem during local testing with IDEA: org.apache.spark.SparkException: A master URL must be set in your configuration
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:185)
at SparkDemo.SimpleApp$.main(SimpleApp.scala:13)
at SparkDemo.SimpleApp.main(SimpleApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Running it again, it still failed:
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
at akka.actor.ActorCell$.<init>(ActorCell.scala:336)
at akka.actor.ActorCell$.<clinit>(ActorCell.scala)
at akka.actor.RootActorPath.$div(ActorPath.scala:159)
at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:464)
at akka.actor.LocalActorRefProvider.<init>(ActorRefProvider.scala:452)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$2.apply(DynamicAccess.scala:78)
at scala.util.Try$.apply(Try.scala:191)
at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:73)
at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
at akka.actor.ReflectiveDynamicAccess$$anonfun$createInstanceFor$3.apply(DynamicAccess.scala:84)
at scala.util.Success.flatMap(Try.scala:230)
at akka.actor.ReflectiveDynamicAccess.createInstanceFor(DynamicAccess.scala:84)
at akka.actor.ActorSystemImpl.liftedTree1$1(ActorSystem.scala:584)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:577)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:141)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:108)
at akka.Akka$.delayedEndpoint$akka$Akka$1(Akka.scala:11)
at akka.Akka$delayedInit$body.apply(Akka.scala:9)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:383)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at akka.Akka$.main(Akka.scala:9)
at akka.Akka.main(Akka.scala)
As mentioned in an earlier post, I had installed Scala 2.11.5 and Spark 1.2.0; evidently there are compatibility problems between these Spark and Scala versions. Changing Scala to 2.10.4 solved the problem and the program ran correctly.
(6) Executors repeatedly exit (AppClient$ClientActor: Executor updated: ... is now EXITED) and SparkDeploySchedulerBackend reports "Asked to remove non-existent executor":
15/12/21 22:06:01 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 18
15/12/21 22:06:01 INFO AppClient$ClientActor: Executor added: app-20151221220543-0003/19 on worker-20151221042816-itcastdd-56547 (itcastdd:56547) with 2 cores
15/12/21 22:06:01 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151221220543-0003/19 on hostPort itcastdd:56547 with 2 cores, 2.9 GB RAM
15/12/21 22:06:01 INFO AppClient$ClientActor: Executor updated: app-20151221220543-0003/19 is now LOADING
15/12/21 22:06:01 INFO AppClient$ClientActor: Executor updated: app-20151221220543-0003/19 is now RUNNING
15/12/21 22:06:02 INFO AppClient$ClientActor: Executor updated: app-20151221220543-0003/19 is now EXITED (Command exited with code 1)
15/12/21 22:06:02 INFO SparkDeploySchedulerBackend: Executor app-20151221220543-0003/19 removed: Command exited with code 1
15/12/21 22:06:02 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 19
15/12/21 22:06:02 INFO AppClient$ClientActor: Executor added: app-20151221220543-0003/20 on worker-20151221042816-itcastdd-56547 (itcastdd:56547) with 2 cores
15/12/21 22:06:02 INFO SparkDeploySchedulerBackend: Granted executor ID app-20151221220543-0003/20 on hostPort itcastdd:56547 with 2 cores, 2.9 GB RAM
15/12/21 22:06:02 INFO AppClient$ClientActor: Executor updated: app-20151221220543-0003/20 is now RUNNING
15/12/21 22:06:02 INFO AppClient$ClientActor: Executor updated: app-20151221220543-0003/20 is now LOADING
15/12/21 22:06:02 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
15/12/21 22:06:02 INFO AppClient$ClientActor: Executor updated: app-20151221220543-0003/20 is now EXITED (Command exited with code 1)
15/12/21 22:06:02 INFO SparkDeploySchedulerBackend: Executor app-20151221220543-0003/20 removed: Command exited with code 1
15/12/21 22:06:02 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 20
Got status update for unknown executor app-20150509185326-0001/11158
Fixing the recurring "SparkDeploySchedulerBackend: Asked to remove non-existent executor" error on a Spark cluster.
Straight to the point: with Spark 1.3.1, this problem is caused by jobs on the cluster terminating abnormally.
I ran into it because a failure in Hive's metastore caused many jobs to fail; after restarting Spark and deploying the application to the cluster, the application side kept repeating errors like the ones above.
Solution:
1. Stop the Spark cluster.
2. Delete Spark's temporary files on every node of the cluster. By default they are under /tmp/spark-*; the exact path depends on the SPARK_LOCAL_DIRS setting in spark-env.sh.
3. Start the Spark cluster again.
Note: Spark must be stopped before the temporary files are deleted; otherwise the files are still in use, the deletion errors out, and it does not succeed.
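The three steps above can be sketched as shell commands (assuming the default SPARK_LOCAL_DIRS of /tmp/spark-*; the rm must be run on every node of the cluster):

```shell
# 1. Stop the whole standalone cluster first so no files are in use
$SPARK_HOME/sbin/stop-all.sh

# 2. Remove Spark's temporary files on each node
rm -rf /tmp/spark-*

# 3. Bring the cluster back up
$SPARK_HOME/sbin/start-all.sh
```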
Another possible cause of this error is a problem with how the cluster was started, for example two Worker processes running on one server (one of them may not have been killed). In that case, jps shows many entries like:
17048 -- process information unavailable
12914 -- process information unavailable
14540 -- process information unavailable
13579 -- process information unavailable
16809 -- process information unavailable
When this happens, just kill the problematic Worker processes.
Troubleshooting tip
Turn up Spark's log level: the default log4j level is WARN; change it to INFO (no Spark restart needed), then run a test with spark-sql --master spark://your-master-ip:7077 and analyze the log output to help track down the error.
*****************************************************************************************************************************************************************
Solutions to the "A master URL must be set in your configuration" error shown in (5):
Method (1): In the IDE, open Run -> Edit Configurations and add "-Dspark.master=local" to the VM options on the right (this makes the program run locally, single-threaded).
Method (2): In code, call conf.setMaster("spark://192.168.1.118:7077")
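Put together, the two methods look roughly like this against the Spark 1.x API (a sketch, not the exact code from this post; the cluster URL is the example address used throughout this guide):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SimpleApp")
    // Method (2): set the master in code. "local[*]" runs in-process,
    // which matches passing -Dspark.master=local as in method (1).
    conf.setMaster("local[*]")
    // For a cluster run instead: conf.setMaster("spark://192.168.1.118:7077")
    val sc = new SparkContext(conf)
    val nums = sc.parallelize(1 to 4)
    println(nums.reduce(_ + _)) // prints 10
    sc.stop()
  }
}
```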
(7) Running again, still failing: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.HashSet$.empty()Lscala/collection/immutable/HashSet;
	(the stack trace is identical to the one shown in (5) above)
This error is caused by a version incompatibility (encountered during local testing). The following combinations failed:
scala 2.11.7 + spark 1.3
scala 2.11.0 + spark 1.3
scala 2.11.5 + spark 1.3
Solution:
scala 2.10.1 + spark 1.4.1 worked both in the IDEA local test and on the Spark cluster.
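If the project is built with sbt, the version pinning can be expressed in build.sbt roughly as follows (a sketch; the coordinates assume the Spark 1.4.1 release as published on Maven Central):

```scala
// build.sbt — keep the Scala version in the 2.10.x line that Spark 1.4.1 targets
scalaVersion := "2.10.1"

// %% appends the Scala binary version (_2.10) to the artifact name,
// so the Spark jars and the project always agree on the Scala version
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"
```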
*****************************************************************************************************************************************************************
(8) java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
The fix is:
Download the package from https://github.com/srccodes/hadoop-common-2.2.0-bin
(the binaries are at https://github.com/srccodes/hadoop-common-2.2.0-bin/tree/master/bin).
After downloading, copy all of the files into Hadoop's bin directory,
and also place a copy under Windows\System32.
Then add this as the first line of the main method in IDEA:
System.setProperty("hadoop.home.dir", "d:\\hadoop-2.2.0")
After that, it runs successfully.
(9) Fixing Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=yw, access=EXECUTE, inode="/yw/b.txt":root:supergroup:-rw-r--r--
This is a permissions problem.
When IDEA submits a job through the Hadoop plugin, it writes the job into HDFS as the local administrator user by default, i.e. under /user/xxx on HDFS (in my case /user/hadoop). Since the administrator user has no write permission on that Hadoop directory, the exception occurs.
The solutions are:
Method 1: add this property to hdfs-site.xml:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
Method 2: give everyone read access to the files.
Open up the permissions on the Hadoop directory with:
$ hadoop fs -chmod 777 /yw
$ hadoop fs -chmod -R 777 /yw
**************************************************************************************************************************************************************
(10) java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSums(IILjava/nio/ByteBuffer;ILjava/nio/ByteBuffer;IILjava/lang/String;JZ)V
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, 192.168.1.118): java.lang.ClassNotFoundException: com.hq.WorldCount$$anonfun$main$2
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
Running the local test from IDEA threw the errors above.
Solutions:
(1) Add conf.setMaster("spark://192.168.1.118:7077") in the code.
Note: when debugging in IDEA, it is enough to go to Run -> Edit Configurations and add "-Dspark.master=local" to the VM options.
(2) If that still does not work, try adjusting the configuration.
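For the ClassNotFoundException on the application's own classes (com.hq.WorldCount$$anonfun$main$2), a common cause is that the application jar was never shipped to the executors. One way to address this with the Spark 1.x API is SparkConf.setJars; the sketch below assumes the jar built in step (2), and the jar path is a hypothetical example, not a path from this post:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WorldCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WorldCount")
      .setMaster("spark://192.168.1.118:7077")
      // Ship the packaged application jar to every executor so classes
      // such as WorldCount$$anonfun$main$2 can be loaded remotely.
      // The path is a hypothetical example — point it at the jar built in (2).
      .setJars(Seq("D:\\out\\artifacts\\WorldCount.jar"))
    val sc = new SparkContext(conf)
    // ... the actual word-count job goes here ...
    sc.stop()
  }
}
```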